| Age | Commit message (Collapse) | Author | Files | Lines |
|
The function in fact searches the nearest node for a given one,
based on a N_ONLINE state. This is a common pattern to search
for a nearest node.
This patch converts numa_map_to_online_node() to numa_nearest_node()
so that others won't need to opencode the logic.
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The last RCU stall fix caused a massive throughput regression of the
hwrng on Raspberry Pi 0 - 3. hwrng_msleep doesn't sleep precisely enough
and usleep_range doesn't allow scheduling. So try to restore the
best possible throughput by introducing hwrng_yield which interruptable
sleeps for one jiffy.
Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
cpu_relax ~138025 Bytes / sec
hwrng_msleep(1000) ~13 Bytes / sec
hwrng_yield ~2510 Bytes / sec
Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/
Signed-off-by: Stefan Wahren <[email protected]>
Reviewed-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
|
|
Add inclusion of linux/errno.h as otherwise the reference to EINVAL
may be invalid.
Fixes: f3cf4134c5c6 ("bpf: Add bpf_lookup_*_key() and bpf_key_put() kfuncs")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Herbert Xu <[email protected]>
|
|
np->sndflow reads are racy.
Use one bit ftom atomic inet->inet_flags instead,
IPV6_FLOWINFO_SEND setsockopt() can be lockless.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Most np->pmtudisc reads are racy.
Move this 3bit field on a full byte, add annotations
and make IPV6_MTU_DISCOVER setsockopt() lockless.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Reads from np->rtalert_isolate are racy.
Move this flag to inet->inet_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move np->repflow to inet->inet_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
np->recverr is moved to inet->inet_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move np->dontfrag flag to inet->inet_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move np->autoflowlabel and np->autoflowlabel_set in inet->inet_flags,
to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move np->mc_all to an atomic flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move np->recverr_rfc4884 to an atomic flag to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This fixes data-races around np->mcast_hops,
and make IPV6_MULTICAST_HOPS lockless.
Note that np->mcast_hops is never negative,
thus can fit an u8 field instead of s16.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add inet6_{test|set|clear|assign}_bit() helpers.
Note that I am using bits from inet->inet_flags,
this might change in the future if we need more flags.
While solving data-races accessing np->mc_loop,
this patch also allows to implement lockless accesses
to np->mcast_hops in the following patch.
Also constify sk_mc_loop() argument.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Some np->hop_limit accesses are racy, when socket lock is not held.
Add missing annotations and switch to full lockless implementation.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
list_for_each_entry_rcu() takes an optional fourth argument which
allows RCU to assert that the correct lock is held. Several callers
of for_each_thread() rely on their caller to be holding the appropriate
lock, so this is a useful assertion to include.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The "sb_kern_mount" hook has implementation registered in SELinux.
Looking at the function implementation we observe that the "sb"
parameter is not changing.
Mark the "sb" parameter of LSM hook security_sb_kern_mount() as "const"
since it will not be changing in the LSM hook.
Signed-off-by: Khadija Kamran <[email protected]>
[PM: minor merge fuzzing due to other constification patches]
Signed-off-by: Paul Moore <[email protected]>
|
|
Three LSMs register the implementations for the 'bprm_committed_creds()'
hook: AppArmor, SELinux and tomoyo. Looking at the function
implementations we may observe that the 'bprm' parameter is not changing.
Mark the 'bprm' parameter of LSM hook security_bprm_committed_creds() as
'const' since it will not be changing in the LSM hook.
Signed-off-by: Khadija Kamran <[email protected]>
[PM: minor merge fuzzing due to other constification patches]
Signed-off-by: Paul Moore <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The 'extern' specifier in front of a function declaration has no effect.
Remove all of them from the qcom-scm header.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
udp->pcflag, udp->pcslen and udp->pcrlen reads/writes are racy.
Move udp->pcflag to udp->udp_flags for atomicity,
and add READ_ONCE()/WRITE_ONCE() annotations for pcslen and pcrlen.
Fixes: ba4e58eca8aa ("[NET]: Supporting UDP-Lite (RFC 3828) in Linux")
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
This flag is set but never read, we can remove it.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Move udp->encap_enabled to udp->udp_flags.
Add udp_test_and_set_bit() helper to allow lockless
udp_tunnel_encap_enable() implementation.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
These are read locklessly, move them to udp_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
syzbot reported that udp->gro_enabled can be read locklessly.
Use one atomic bit from udp->udp_flags.
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
syzbot reported that udp->no_check6_rx can be read locklessly.
Use one atomic bit from udp->udp_flags.
Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
syzbot reported that udp->no_check6_tx can be read locklessly.
Use one atomic bit from udp->udp_flags
Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
According to syzbot, it is time to use proper atomic flags
for various UDP flags.
Add udp_flags field, and convert udp->corkflag to first
bit in it.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Pull in staged fixes for 6.6.
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The 'bprm_committing_creds' hook has implementations registered in
SELinux and Apparmor. Looking at the function implementations we observe
that the 'bprm' parameter is not changing.
Mark the 'bprm' parameter of LSM hook security_bprm_committing_creds()
as 'const' since it will not be changing in the LSM hook.
Signed-off-by: Khadija Kamran <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
The 'bprm_creds_from_file' hook has implementation registered in
commoncap. Looking at the function implementation we observe that the
'file' parameter is not changing.
Mark the 'file' parameter of LSM hook security_bprm_creds_from_file() as
'const' since it will not be changing in the LSM hook.
Signed-off-by: Khadija Kamran <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
SELinux registers the implementation for the "quotactl" hook. Looking at
the function implementation we observe that the parameter "sb" is not
changing.
Mark the "sb" parameter of LSM hook security_quotactl() as "const" since
it will not be changing in the LSM hook.
Signed-off-by: Khadija Kamran <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
Function kmem_dump_obj() will splat if passed a pointer to a non-slab
object. So nothing calls it directly, instead calling kmem_valid_obj()
first to determine whether the passed pointer to a valid slab object. This
means that merging kmem_valid_obj() into kmem_dump_obj() will make the
code more concise. Therefore, convert kmem_dump_obj() to work the same
way as vmalloc_dump_obj(), removing the need for the kmem_dump_obj()
caller to check kmem_valid_obj(). After this, there are no remaining
calls to kmem_valid_obj() anymore, and it can be safely removed.
Suggested-by: Matthew Wilcox <[email protected]>
Signed-off-by: Zhen Lei <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
Commit a86baa69c2b7 ("rcu: Remove special bit at the bottom of the ->dynticks counter")
left behind this, remove it.
Signed-off-by: Yue Haibing <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
- fix reference to exported symbols for parisc64 [Masahiro Yamada]
- Block-TLB (BTLB) support on 32-bit CPUs
- sparse and build-warning fixes
* tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
linux/export: fix reference to exported functions for parisc64
parisc: BTLB: Initialize BTLB tables at CPU startup
parisc: firmware: Simplify calling non-PA20 functions
parisc: BTLB: _edata symbol has to be page aligned for BTLB support
parisc: BTLB: Add BTLB insert and purge firmware function wrappers
parisc: BTLB: Clear possibly existing BTLB entries
parisc: Prepare for Block-TLB support on 32-bit kernel
parisc: shmparam.h: Document aliasing requirements of PA-RISC
parisc: irq: Make irq_stack_union static to avoid sparse warning
parisc: drivers: Fix sparse warning
parisc: iosapic.c: Fix sparse warnings
parisc: ccio-dma: Fix sparse warnings
parisc: sba-iommu: Fix sparse warnigs
parisc: sba: Fix compile warning wrt list of SBA devices
parisc: sba_iommu: Fix build warning if procfs if disabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Add missing LOCKDOWN checks for eventfs callers
When LOCKDOWN is active for tracing, it causes inconsistent state
when some functions succeed and others fail.
- Use dput() to free the top level eventfs descriptor
There was a race between accesses and freeing it.
- Fix a long standing bug that eventfs exposed due to changing timings
by dynamically creating files. That is, If a event file is opened for
an instance, there's nothing preventing the instance from being
removed which will make accessing the files cause use-after-free
bugs.
- Fix a ring buffer race that happens when iterating over the ring
buffer while writers are active. Check to make sure not to read the
event meta data if it's beyond the end of the ring buffer sub buffer.
- Fix the print trigger that disappeared because the test to create it
was looking for the event dir field being filled, but now it has the
"ef" field filled for the eventfs structure.
- Remove the unused "dir" field from the event structure.
- Fix the order of the trace_dynamic_info as it had it backwards for
the offset and len fields for which one was for which endianess.
- Fix NULL pointer dereference with eventfs_remove_rec()
If an allocation fails in one of the eventfs_add_*() functions, the
caller of it in event_subsystem_dir() or event_create_dir() assigns
the result to the structure. But it's assigning the ERR_PTR and not
NULL. This was passed to eventfs_remove_rec() which expects either a
good pointer or a NULL, not ERR_PTR. The fix is to not assign the
ERR_PTR to the structure, but to keep it NULL on error.
- Fix list_for_each_rcu() to use list_for_each_srcu() in
dcache_dir_open_wrapper(). One iteration of the code used RCU but
because it had to call sleepable code, it had to be changed to use
SRCU, but one of the iterations was missed.
- Fix synthetic event print function to use "as_u64" instead of passing
in a pointer to the union. To fix big/little endian issues, the u64
that represented several types was turned into a union to define the
types properly.
* tag 'trace-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
eventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec()
tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()
tracing/synthetic: Print out u64 values properly
tracing/synthetic: Fix order of struct trace_dynamic_info
selftests/ftrace: Fix dependencies for some of the synthetic event tests
tracing: Remove unused trace_event_file dir field
tracing: Use the new eventfs descriptor for print trigger
ring-buffer: Do not attempt to read past "commit"
tracefs/eventfs: Free top level files on removal
ring-buffer: Avoid softlockup in ring_buffer_resize()
tracing: Have event inject files inc the trace array ref count
tracing: Have option files inc the trace array ref count
tracing: Have current_trace inc the trace array ref count
tracing: Have tracing_max_latency inc the trace array ref count
tracing: Increase trace array ref count on enable and filter files
tracefs/eventfs: Use dput to free the toplevel events directory
tracefs/eventfs: Add missing lockdown checks
tracefs: Add missing lockdown check to tracefs_create_dir()
|
|
SCM interface
Add support for SCM calls to Secure OS and the Secure Execution
Environment (SEE) residing in the TrustZone (TZ) via the QSEECOM
interface. This allows communication with Secure/TZ applications, for
example 'uefisecapp' managing access to UEFI variables.
For better separation, make qcom_scm spin up a dedicated child
(platform) device in case QSEECOM support has been detected. The
corresponding driver for this device is then responsible for managing
any QSEECOM clients. Specifically, this driver attempts to automatically
detect known and supported applications, creating a client (auxiliary)
device for each one. The respective client/auxiliary driver is then
responsible for managing and communicating with the application.
While this patch introduces only a very basic interface without the more
advanced features (such as re-entrant and blocking SCM calls and
listeners/callbacks), this is enough to talk to the aforementioned
'uefisecapp'.
Signed-off-by: Maximilian Luz <[email protected]>
Reviewed-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Add a ucs2_strscpy() function for UCS-2 strings. The behavior is
equivalent to the standard strscpy() function, just for 16-bit character
UCS-2 strings.
Signed-off-by: Maximilian Luz <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Currently, when GETDEVICEINFO returns multiple locations where each
is a different IP but the server's identity is same as MDS, then
nfs4_set_ds_client() finds the existing nfs_client structure which
has the MDS's max_connect value (and if it's 1), then the 1st IP
on the DS's list will get dropped due to MDS trunking rules. Other
IPs would be added as they fall under the pnfs trunking rules.
For the list of IPs the 1st goes thru calling nfs4_set_ds_client()
which will eventually call nfs4_add_trunk() and call into
rpc_clnt_test_and_add_xprt() which has the check for MDS trunking.
The other IPs (after the 1st one), would call rpc_clnt_add_xprt()
which doesn't go thru that check.
nfs4_add_trunk() is called when MDS trunking is happening and it
needs to enforce the usage of max_connect mount option of the
1st mount. However, this shouldn't be applied to pnfs flow.
Instead, this patch proposed to treat MDS=DS as DS trunking and
make sure that MDS's max_connect limit does not apply to the
1st IP returned in the GETDEVICEINFO list. It does so by
marking the newly created client with a new flag NFS_CS_PNFS
which then used to pass max_connect value to use into the
rpc_clnt_test_and_add_xprt() instead of the existing rpc
client's max_connect value set by the MDS connection.
For example, mount was done without max_connect value set
so MDS's rpc client has cl_max_connect=1. Upon calling into
rpc_clnt_test_and_add_xprt() and using rpc client's value,
the caller passes in max_connect value which is previously
been set in the pnfs path (as a part of handling
GETDEVICEINFO list of IPs) in nfs4_set_ds_client().
However, when NFS_CS_PNFS flag is not set and we know we
are doing MDS trunking, comparing a new IP of the same
server, we then set the max_connect value to the
existing MDS's value and pass that into
rpc_clnt_test_and_add_xprt().
Fixes: dc48e0abee24 ("SUNRPC enforce creation of no more than max_connect xprts")
Signed-off-by: Olga Kornievskaia <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Ensure that nfs_clear_request_commit() updates the correct counters when
it removes them from the commit list.
Fixes: ed5d588fe47f ("NFS: Try to join page groups before an O_DIRECT retransmission")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Some VFs may want to disable CRC stripping on incoming packets so create
an offload for that. The VF already sends information about configuring
its RX queues so use that structure to indicate that the CRC stripping
should be enabled or not.
Signed-off-by: Paul M Stillwell Jr <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Reviewed-by: Paul Menzel <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Use guards to reduce gotos and simplify control flow.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
recent discussion brought about the realization that it makes sense for
no_free_ptr() to have __must_check semantics in order to avoid leaking
the resource.
Additionally, add a few comments to clarify why/how things work.
All credit to Linus on how to combine __must_check and the
stmt-expression.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
call_single_data_t is a size-aligned typedef of struct __call_single_data.
This alignment is desirable in order to have smp_call_function*() avoid
bouncing an extra cacheline in case of an unaligned csd, given this
would hurt performance.
Since the removal of struct request->csd in commit 660e802c76c8
("blk-mq: use percpu csd to remote complete instead of per-rq csd") there
are no current users of smp_call_function*() with unaligned csd.
Change every 'struct __call_single_data' function parameter to
'call_single_data_t', so we have warnings if any new code tries to
introduce an smp_call_function*() call with unaligned csd.
Signed-off-by: Leonardo Bras <[email protected]>
Reviewed-by: Guo Ren <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
AMD PMF driver can cause the following warning:
[ 196.159546] ------------[ cut here ]------------
[ 196.159556] Voluntary context switch within RCU read-side critical section!
[ 196.159571] WARNING: CPU: 0 PID: 9 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x43d/0x560
[ 196.159604] Modules linked in: nvme_fabrics ccm rfcomm snd_hda_scodec_cs35l41_spi cmac algif_hash algif_skcipher af_alg bnep joydev btusb btrtl uvcvideo btintel btbcm videobuf2_vmalloc intel_rapl_msr btmtk videobuf2_memops uvc videobuf2_v4l2 intel_rapl_common binfmt_misc hid_sensor_als snd_sof_amd_vangogh hid_sensor_trigger bluetooth industrialio_triggered_buffer videodev snd_sof_amd_rembrandt hid_sensor_iio_common amdgpu ecdh_generic kfifo_buf videobuf2_common hp_wmi kvm_amd sparse_keymap snd_sof_amd_renoir wmi_bmof industrialio ecc mc nls_iso8859_1 kvm snd_sof_amd_acp irqbypass snd_sof_xtensa_dsp crct10dif_pclmul crc32_pclmul mt7921e snd_sof_pci snd_ctl_led polyval_clmulni mt7921_common polyval_generic snd_sof ghash_clmulni_intel mt792x_lib mt76_connac_lib sha512_ssse3 snd_sof_utils aesni_intel snd_hda_codec_realtek crypto_simd mt76 snd_hda_codec_generic cryptd snd_soc_core snd_hda_codec_hdmi rapl ledtrig_audio input_leds snd_compress i2c_algo_bit drm_ttm_helper mac80211 snd_pci_ps hid_multitouch ttm drm_exec
[ 196.159970] drm_suballoc_helper snd_rpl_pci_acp6x amdxcp drm_buddy snd_hda_intel snd_acp_pci snd_hda_scodec_cs35l41_i2c serio_raw gpu_sched snd_hda_scodec_cs35l41 snd_acp_legacy_common snd_intel_dspcfg snd_hda_cs_dsp_ctls snd_hda_codec libarc4 drm_display_helper snd_pci_acp6x cs_dsp snd_hwdep snd_soc_cs35l41_lib video k10temp snd_pci_acp5x thunderbolt snd_hda_core drm_kms_helper cfg80211 snd_seq snd_rn_pci_acp3x snd_pcm snd_acp_config cec snd_soc_acpi snd_seq_device rc_core ccp snd_pci_acp3x snd_timer snd soundcore wmi amd_pmf platform_profile amd_pmc mac_hid serial_multi_instantiate wireless_hotkey hid_sensor_hub sch_fq_codel msr parport_pc ppdev lp parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log cdc_ether usbnet r8152 mii hid_generic nvme i2c_hid_acpi i2c_hid nvme_core i2c_piix4 xhci_pci amd_sfh drm xhci_pci_renesas nvme_common hid
[ 196.160382] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc1 #4
[ 196.160397] Hardware name: HP HP EliteBook 845 14 inch G10 Notebook PC/8B6E, BIOS V82 Ver. 01.02.00 08/24/2023
[ 196.160405] Workqueue: events power_supply_changed_work
[ 196.160426] RIP: 0010:rcu_note_context_switch+0x43d/0x560
[ 196.160440] Code: 00 48 89 be 40 08 00 00 48 89 86 48 08 00 00 48 89 10 e9 63 fe ff ff 48 c7 c7 10 e7 b0 9e c6 05 e8 d8 20 02 01 e8 13 0f f3 ff <0f> 0b e9 27 fc ff ff a9 ff ff ff 7f 0f 84 cf fc ff ff 65 48 8b 3c
[ 196.160450] RSP: 0018:ffffc900001878f0 EFLAGS: 00010046
[ 196.160462] RAX: 0000000000000000 RBX: ffff88885e834040 RCX: 0000000000000000
[ 196.160470] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 196.160476] RBP: ffffc90000187910 R08: 0000000000000000 R09: 0000000000000000
[ 196.160482] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 196.160488] R13: 0000000000000000 R14: ffff888100990000 R15: ffff888100990000
[ 196.160495] FS: 0000000000000000(0000) GS:ffff88885e800000(0000) knlGS:0000000000000000
[ 196.160504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 196.160512] CR2: 000055cb053c8246 CR3: 000000013443a000 CR4: 0000000000750ef0
[ 196.160520] PKRU: 55555554
[ 196.160526] Call Trace:
[ 196.160532] <TASK>
[ 196.160548] ? show_regs+0x72/0x90
[ 196.160570] ? rcu_note_context_switch+0x43d/0x560
[ 196.160580] ? __warn+0x8d/0x160
[ 196.160600] ? rcu_note_context_switch+0x43d/0x560
[ 196.160613] ? report_bug+0x1bb/0x1d0
[ 196.160637] ? handle_bug+0x46/0x90
[ 196.160658] ? exc_invalid_op+0x19/0x80
[ 196.160675] ? asm_exc_invalid_op+0x1b/0x20
[ 196.160709] ? rcu_note_context_switch+0x43d/0x560
[ 196.160727] __schedule+0xb9/0x15f0
[ 196.160746] ? srso_alias_return_thunk+0x5/0x7f
[ 196.160765] ? srso_alias_return_thunk+0x5/0x7f
[ 196.160778] ? acpi_ns_search_one_scope+0xbe/0x270
[ 196.160806] schedule+0x68/0x110
[ 196.160820] schedule_timeout+0x151/0x160
[ 196.160829] ? srso_alias_return_thunk+0x5/0x7f
[ 196.160842] ? srso_alias_return_thunk+0x5/0x7f
[ 196.160855] ? acpi_ns_lookup+0x3c5/0xa90
[ 196.160878] __down_common+0xff/0x220
[ 196.160905] __down_timeout+0x16/0x30
[ 196.160920] down_timeout+0x64/0x70
[ 196.160938] acpi_os_wait_semaphore+0x85/0x200
[ 196.160959] acpi_ut_acquire_mutex+0x9e/0x280
[ 196.160979] acpi_ex_enter_interpreter+0x2d/0xb0
[ 196.160992] acpi_ns_evaluate+0x2f0/0x5f0
[ 196.161005] acpi_evaluate_object+0x172/0x490
[ 196.161018] ? acpi_os_signal_semaphore+0x8a/0xd0
[ 196.161038] acpi_evaluate_integer+0x52/0xe0
[ 196.161055] ? kfree+0x79/0x120
[ 196.161071] ? srso_alias_return_thunk+0x5/0x7f
[ 196.161089] acpi_ac_get_state.part.0+0x27/0x80
[ 196.161110] get_ac_property+0x5c/0x70
[ 196.161127] ? __pfx___power_supply_is_system_supplied+0x10/0x10
[ 196.161146] __power_supply_is_system_supplied+0x44/0xb0
[ 196.161166] class_for_each_device+0x124/0x160
[ 196.161184] ? acpi_ac_get_state.part.0+0x27/0x80
[ 196.161203] ? srso_alias_return_thunk+0x5/0x7f
[ 196.161223] power_supply_is_system_supplied+0x3c/0x70
[ 196.161243] amd_pmf_get_power_source+0xe/0x20 [amd_pmf]
[ 196.161276] amd_pmf_power_slider_update_event+0x49/0x90 [amd_pmf]
[ 196.161310] amd_pmf_pwr_src_notify_call+0xe7/0x100 [amd_pmf]
[ 196.161340] notifier_call_chain+0x5f/0xe0
[ 196.161362] atomic_notifier_call_chain+0x33/0x60
[ 196.161378] power_supply_changed_work+0x84/0x110
[ 196.161394] process_one_work+0x178/0x360
[ 196.161412] ? __pfx_worker_thread+0x10/0x10
[ 196.161424] worker_thread+0x307/0x430
[ 196.161440] ? __pfx_worker_thread+0x10/0x10
[ 196.161451] kthread+0xf4/0x130
[ 196.161467] ? __pfx_kthread+0x10/0x10
[ 196.161486] ret_from_fork+0x43/0x70
[ 196.161502] ? __pfx_kthread+0x10/0x10
[ 196.161518] ret_from_fork_asm+0x1b/0x30
[ 196.161558] </TASK>
[ 196.161562] ---[ end trace 0000000000000000 ]---
Since there's no guarantee that all the callbacks can work in atomic
context, switch to use blocking_notifier_call_chain to relax the
constraint.
Signed-off-by: Kai-Heng Feng <[email protected]>
Reported-by: Allen Zhong <[email protected]>
Fixes: 4c71ae414474 ("platform/x86/amd/pmf: Add support SPS PMF feature")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571
Reviewed-by: Mario Limonciello <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
Now that all drivers are converted to the (new) .probe() callback, the
temporary .probe_new() can go away. \o/
Link: https://lore.kernel.org/linux-i2c/[email protected]
Reviewed-by: Javier Martinez Canillas <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
numa_fill_memblks() fills in the gaps in numa_meminfo memblks
over an physical address range.
The ACPI driver will use numa_fill_memblks() to implement a new Linux
policy that prescribes extending proximity domains in a portion of a
CFMWS window to the entire window.
Dan Williams offered this explanation of the policy:
A CFWMS is an ACPI data structure that indicates *potential* locations
where CXL memory can be placed. It is the playground where the CXL
driver has free reign to establish regions. That space can be populated
by BIOS created regions, or driver created regions, after hotplug or
other reconfiguration.
When BIOS creates a region in a CXL Window it additionally describes
that subset of the Window range in the other typical ACPI tables SRAT,
SLIT, and HMAT. The rationale for BIOS not pre-describing the entire
CXL Window in SRAT, SLIT, and HMAT is that it can not predict the
future. I.e. there is nothing stopping higher or lower performance
devices being placed in the same Window. Compare that to ACPI memory
hotplug that just onlines additional capacity in the proximity domain
with little freedom for dynamic performance differentiation.
That leaves the OS with a choice, should unpopulated window capacity
match the proximity domain of an existing region, or should it allocate
a new one? This patch takes the simple position of minimizing proximity
domain proliferation by reusing any proximity domain intersection for
the entire Window. If the Window has no intersections then allocate a
new proximity domain. Note that SRAT, SLIT and HMAT information can be
enumerated dynamically in a standard way from device provided data.
Think of CXL as the end of ACPI needing to describe memory attributes,
CXL offers a standard discovery model for performance attributes, but
Linux still needs to interoperate with the old regime.
Reported-by: Derick Marks <[email protected]>
Suggested-by: Dan Williams <[email protected]>
Signed-off-by: Alison Schofield <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Tested-by: Derick Marks <[email protected]>
Link: https://lore.kernel.org/all/ef078a6f056ca974e5af85997013c0fda9e3326d.1689018477.git.alison.schofield%40intel.com
|
|
From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall
handling in JIT"), the tailcall on x64 works better than before.
From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms
for x64 JIT"), tailcall is able to run in BPF subprograms on x64.
From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program
to other BPF programs"), BPF program is able to trace other BPF programs.
How about combining them all together?
1. FENTRY/FEXIT on a BPF subprogram.
2. A tailcall runs in the BPF subprogram.
3. The tailcall calls the subprogram's caller.
As a result, a tailcall infinite loop comes up. And the loop would halt
the machine.
As we know, in tail call context, the tail_call_cnt propagates by stack
and rax register between BPF subprograms. So do in trampolines.
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT")
Reviewed-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: Leon Hwang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This idea came after a particular workload requested
the quickack attribute set on routes, and a performance
drop was noticed for large bulk transfers.
For high throughput flows, it is best to use one cpu
running the user thread issuing socket system calls,
and a separate cpu to process incoming packets from BH context.
(With TSO/GRO, bottleneck is usually the 'user' cpu)
Problem is the user thread can spend a lot of time while holding
the socket lock, forcing BH handler to queue most of incoming
packets in the socket backlog.
Whenever the user thread releases the socket lock, it must first
process all accumulated packets in the backlog, potentially
adding latency spikes. Due to flood mitigation, having too many
packets in the backlog increases chance of unexpected drops.
Backlog processing unfortunately shifts a fair amount of cpu cycles
from the BH cpu to the 'user' cpu, thus reducing max throughput.
This patch takes advantage of the backlog processing,
and the fact that ACK are mostly cumulative.
The idea is to detect we are in the backlog processing
and defer all eligible ACK into a single one,
sent from tcp_release_cb().
This saves cpu cycles on both sides, and network resources.
Performance of a single TCP flow on a 200Gbit NIC:
- Throughput is increased by 20% (100Gbit -> 120Gbit).
- Number of generated ACK per second shrinks from 240,000 to 40,000.
- Number of backlog drops per second shrinks from 230 to 0.
Benchmark context:
- Regular netperf TCP_STREAM (no zerocopy)
- Intel(R) Xeon(R) Platinum 8481C (Saphire Rapids)
- MAX_SKB_FRAGS = 17 (~60KB per GRO packet)
This feature is guarded by a new sysctl, and enabled by default:
/proc/sys/net/ipv4/tcp_backlog_ack_defer
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Acked-by: Dave Taht <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
It was reported that under certain circumstances GCC emits ENDBR
instructions for _THIS_IP_ usage. Specifically, when it appears at the
start of a basic block -- but not elsewhere.
Since _THIS_IP_ is never used for control flow, these ENDBR
instructions are completely superfluous. Override the _THIS_IP_
definition for x86_64 to avoid this.
Less ENDBR instructions is better.
Fixes: 156ff4a544ae ("x86/ibt: Base IBT bits")
Reported-by: David Kaplan <[email protected]>
Reviewed-by: Andrew Cooper <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|