Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity fixes from Mimi Zohar:
"Two additional patches to fix the removal of the deprecated
IMA_TRUSTED_KEYRING Kconfig"
* tag 'integrity-v6.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: rework CONFIG_IMA dependency block
ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Pull LED fix from Lee Jones:
"Just the one bug-fix:
- Fix regression affecting LED_COLOR_ID_MULTI users"
* tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds:
leds: Drop BUG_ON check for LED_COLOR_ID_MULTI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fixes from Lee Jones:
"A couple of small fixes:
- Potential build failure in CS42L43
- Device Tree bindings clean-up for a superseded patch"
* tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
dt-bindings: mfd: Revert "dt-bindings: mfd: maxim,max77693: Add USB connector"
mfd: cs42l43: Fix MFD_CS42L43 dependency on REGMAP_IRQ
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
Pull overlayfs fixes from Amir Goldstein:
- Fix for file reference leak regression
- Fix for NULL pointer deref regression
- Fixes for RCU-walk race regressions:
Two of the fixes were taken from Al's RCU pathwalk race fixes series
with his consent [1].
Note that unlike most of Al's series, these two patches are not about
racing with ->kill_sb() and they are also very recent regressions
from v6.5, so I think it's worth getting them into v6.5.y.
There is also a fix for an RCU pathwalk race with ->kill_sb(), which
may have been solved in vfs generic code as you suggested, but it
also rids overlayfs from a nasty hack, so I think it's worth anyway.
Link: https://lore.kernel.org/linux-fsdevel/20231003204749.GA800259@ZenIV/ [1]
* tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: fix NULL pointer defer when encoding non-decodable lower fid
ovl: make use of ->layers safe in rcu pathwalk
ovl: fetch inode once in ovl_dentry_revalidate_common()
ovl: move freeing ovl_entry past rcu delay
ovl: fix file reference leak when submitting aio
|
|
Like any other set command, require admin permissions to do it.
Cc: [email protected]
Fixes: 2b34c5580226 ("RDMA/core: Add command to set ib_core device net namspace sharing mode")
Link: https://lore.kernel.org/r/75d329fdd7381b52cbdf87910bef16c9965abb1f.1696443438.git.leon@kernel.org
Reviewed-by: Parav Pandit <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Mat Martineau says:
====================
mptcp: Fixes and maintainer email update for v6.6
Patch 1 addresses a race condition in MPTCP "delegated actions"
infrastructure. Affects v5.19 and later.
Patch 2 removes an unnecessary restriction that did not allow additional
outgoing subflows using the local address of the initial MPTCP subflow.
v5.16 and later.
Patch 3 updates Matthieu's email address.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Use my kernel.org account instead.
The other one will bounce by the end of the year.
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This patch drops id 0 limitation in mptcp_nl_cmd_sf_create() to allow
creating additional subflows with the local addr ID 0.
There is no reason not to allow additional subflows from this local
address: we should be able to create new subflows from the initial
endpoint. This limitation was breaking fullmesh support from userspace.
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391
Cc: [email protected]
Suggested-by: Matthieu Baerts <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The delegated action infrastructure is prone to the following
race: different CPUs can try to schedule different delegated
actions on the same subflow at the same time.
Each of them will check different bits via mptcp_subflow_delegate(),
and will try to schedule the action on the related per-cpu napi
instance.
Depending on the timing, both can observe an empty delegated list
node, causing the same entry to be added simultaneously on two different
lists.
The root cause is that the delegated actions infra does not provide
a single synchronization point. Address the issue reserving an additional
bit to mark the subflow as scheduled for delegation. Acquiring such bit
guarantee the caller to own the delegated list node, and being able to
safely schedule the subflow.
Clear such bit only when the subflow scheduling is completed, ensuring
proper barrier in place.
Additionally swap the meaning of the delegated_action bitmask, to allow
the usage of the existing helper to set multiple bit at once.
Fixes: bcd97734318d ("mptcp: use delegate action to schedule 3rd ack retrans")
Cc: [email protected]
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Using netconsole netpoll_poll_dev could be called from interrupt
context, thus using disable_irq() would cause the following kernel
warning with CONFIG_DEBUG_ATOMIC_SLEEP enabled:
BUG: sleeping function called from invalid context at kernel/irq/manage.c:137
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 10, name: ksoftirqd/0
CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G W 5.15.42-00075-g816b502b2298-dirty #117
Hardware name: aml (r1) (DT)
Call trace:
dump_backtrace+0x0/0x270
show_stack+0x14/0x20
dump_stack_lvl+0x8c/0xac
dump_stack+0x18/0x30
___might_sleep+0x150/0x194
__might_sleep+0x64/0xbc
synchronize_irq+0x8c/0x150
disable_irq+0x2c/0x40
stmmac_poll_controller+0x140/0x1a0
netpoll_poll_dev+0x6c/0x220
netpoll_send_skb+0x308/0x390
netpoll_send_udp+0x418/0x760
write_msg+0x118/0x140 [netconsole]
console_unlock+0x404/0x500
vprintk_emit+0x118/0x250
dev_vprintk_emit+0x19c/0x1cc
dev_printk_emit+0x90/0xa8
__dev_printk+0x78/0x9c
_dev_warn+0xa4/0xbc
ath10k_warn+0xe8/0xf0 [ath10k_core]
ath10k_htt_txrx_compl_task+0x790/0x7fc [ath10k_core]
ath10k_pci_napi_poll+0x98/0x1f4 [ath10k_pci]
__napi_poll+0x58/0x1f4
net_rx_action+0x504/0x590
_stext+0x1b8/0x418
run_ksoftirqd+0x74/0xa4
smpboot_thread_fn+0x210/0x3c0
kthread+0x1fc/0x210
ret_from_fork+0x10/0x20
Since [0] .ndo_poll_controller is only needed if driver doesn't or
partially use NAPI. Because stmmac does so, stmmac_poll_controller
can be removed fixing the above warning.
[0] commit ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional")
Cc: <[email protected]> # 5.15.x
Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers")
Signed-off-by: Remi Pommarel <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/1c156a6d8c9170bd6a17825f2277115525b4d50f.1696429960.git.repk@triplefau.lt
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Since FIXED_PHY depends on PHYLIB, PHYLIB needs to be set to avoid
a kconfig warning:
WARNING: unmet direct dependencies detected for FIXED_PHY
Depends on [n]: NETDEVICES [=y] && PHYLIB [=n]
Selected by [y]:
- LAN743X [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && PCI [=y] && PTP_1588_CLOCK_OPTIONAL [=y]
Fixes: 73c4d1b307ae ("net: lan743x: select FIXED_PHY")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: lore.kernel.org/r/[email protected]
Cc: Bryan Whitehead <[email protected]>
Cc: [email protected]
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Simon Horman <[email protected]> # build-tested
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
While searching for possible refactor of napi_schedule_prep and
__napi_schedule it was notice that the mtk eth driver disable the
interrupt for rx and tx AFTER napi is scheduled.
While this is a very hard to repro case it might happen to have
situation where the interrupt is disabled and never enabled again as the
napi completes and the interrupt is enabled before.
This is caused by the fact that a napi driven by interrupt expect a
logic with:
1. interrupt received. napi prepared -> interrupt disabled -> napi
scheduled
2. napi triggered. ring cleared -> interrupt enabled -> wait for new
interrupt
To prevent this case, disable the interrupt BEFORE the napi is
scheduled.
Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet")
Cc: [email protected]
Signed-off-by: Christian Marangi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Like various other devices using similar hardware, this model reports a
perpetually empty battery (0-1%).
Join the others and apply HID_BATTERY_QUIRK_IGNORE.
Signed-off-by: Fabian Vogt <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
The EHL (Elkhart Lake) based platforms provide a OOB (Out of band)
service, which allows to wakup device when the system is in S5 (Soft-Off
state). This OOB service can be enabled/disabled from BIOS settings. When
enabled, the ISH device gets PME wake capability. To enable PME wakeup,
driver also needs to enable ACPI GPE bit.
On resume, BIOS will clear the wakeup bit. So driver need to re-enable it
in resume function to keep the next wakeup capability. But this BIOS
clearing of wakeup bit doesn't decrement internal OS GPE reference count,
so this reenabling on every resume will cause reference count to overflow.
So first disable and reenable ACPI GPE bit using acpi_disable_gpe().
Fixes: 2e23a70edabe ("HID: intel-ish-hid: ipc: finish power flow for EHL OOB")
Reported-by: Kai-Heng Feng <[email protected]>
Closes: https://lore.kernel.org/lkml/CAAd53p4=oLYiH2YbVSmrPNj1zpMcfp=Wxbasb5vhMXOWCArLCg@mail.gmail.com/T/
Tested-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Srinivas Pandruvada <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
usb_free_urb() does the NULL check itself, so there is no need to duplicate
it prior to calling.
Reported-by: kernel test robot <[email protected]>
Fixes: e1cd4004cde7c9 ("HID: sony: Fix a potential memory leak in sony_probe()")
Signed-off-by: Jiri Kosina <[email protected]>
|
|
When suspending the computer, a Switch Pro Controller connected via USB will
lose its internal status. However, because the USB connection was technically
never lost, when resuming the computer, the driver will attempt to communicate
with the controller as if nothing happened (and fail).
Because of this, the user was forced to manually disconnect the controller
(or to press the sync button on the controller to power it off), so that it
can be re-initialized.
With this patch, the controller will be automatically re-initialized after
resuming from suspend.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216233
Signed-off-by: Martino Fontana <[email protected]>
Reviewed-by: Daniel J. Ogorchock <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
handling path
The commit in Fixes updated the error handling path of
thunderstrike_create() and the remove function but not the error handling
path of shield_probe(), should an error occur after a successful
thunderstrike_create() call.
Add the missing calls.
Fixes: 3ab196f88237 ("HID: nvidia-shield: Add battery support for Thunderstrike")
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
error handling path
The commit in Fixes updated the error handling path of
thunderstrike_create() and the remove function but not the error handling
path of shield_probe(), should an error occur after a successful
thunderstrike_create() call.
Add the missing call. Make sure it is safe to call in the probe error
handling path by preventing the led_classdev from attempting to set the LED
brightness to the off state on unregister.
Fixes: f88af60e74a5 ("HID: nvidia-shield: Support LED functionality for Thunderstrike")
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
Register the Synaptics device as a special multitouch device with certain
quirks that may improve usability of the touchpad device.
Reported-by: Rain <[email protected]>
Closes: https://lore.kernel.org/linux-input/[email protected]/
Signed-off-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
Haiyang Zhang says:
====================
net: mana: Fix some TX processing bugs
Fix TX processing bugs on error handling, tso_bytes calculation,
and sge0 size.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Handle the case when GSO SKB linear length is too large.
MANA NIC requires GSO packets to put only the header part to SGE0,
otherwise the TX queue may stop at the HW level.
So, use 2 SGEs for the skb linear part which contains more than the
packet header.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Shradha Gupta <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove
the subtraction from header size.
Cc: [email protected]
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Shradha Gupta <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
For an unknown TX CQE error type (probably from a newer hardware),
still free the SKB, update the queue tail, etc., otherwise the
accounting will be wrong.
Also, TX errors can be triggered by injecting corrupted packets, so
replace the WARN_ONCE to ratelimited error logging.
Cc: [email protected]
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Shradha Gupta <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Need to set the private data with encoder device, or will access
NULL pointer in encoder handler.
Fixes: 1972e32431ed ("media: mediatek: vcodec: Fix possible invalid memory access for encoder")
Signed-off-by: Irui Wang <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
|
|
pinctrl_gpio_set_config() expects the GPIO number from the global GPIO
numberspace, not the controller-relative offset, which needs to be added
to the chip base.
Fixes: 5ae4cb94b313 ("gpio: aspeed: Add debounce support")
Signed-off-by: Bartosz Golaszewski <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Reviewed-by: Andrew Jeffery <[email protected]>
|
|
if thread A in smb2_write is using work-tcon, other thread B use
smb2_tree_disconnect free the tcon, then thread A will use free'd tcon.
Time
+
Thread A | Thread A
smb2_write | smb2_tree_disconnect
|
|
| kfree(tree_conn)
|
// UAF! |
work->tcon->share_conf |
+
This patch add state, reference count and lock for tree conn to fix race
condition issue.
Reported-by: luosili <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
pipes only
[Why]
The edge-case DISPCLK WDIVIDER changes call stream_enc functions.
But with MPC pipes, downstream pipes have null stream_enc and will
cause crash.
[How]
Only call stream_enc functions for pipes that are OTG master.
Reviewed-by: Alvin Lee <[email protected]>
Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Acked-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Samson Tam <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On some systems with Navi3x dGPU will attempt to use BACO for runtime
PM but fails to resume properly. This is because on these systems
the root port goes into D3cold which is incompatible with BACO.
This happens because in this case dGPU is connected to a bridge between
root port which causes BOCO detection logic to fail. Fix the intent of
the logic by looking at root port, not the immediate upstream bridge for
_PR3.
Cc: [email protected]
Suggested-by: Jun Ma <[email protected]>
Tested-by: David Perry <[email protected]>
Fixes: b10c1c5b3a4e ("drm/amdgpu: add check for ACPI power resources")
Signed-off-by: Mario Limonciello <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
While aligning SMU11 with SMU13 implementation an assumption was made that
`dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization
like in SMU13 but it isn't.
So restore some of the original logic and instead just check for
amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode
values; erring on the side of performance.
Cc: [email protected] # 6.1+
Reported-and-tested-by: Umio Yasuno <[email protected]>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382
Fixes: e701156ccc6c ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13")
Signed-off-by: Mario Limonciello <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix a memory leak in amdgpu_fru_get_product_info().
Cc: Alex Deucher <[email protected]>
Reported-by: Yang Wang <[email protected]>
Fixes: 0dbf2c562625 ("drm/amdgpu: Interpret IPMI data for product information (v2)")
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add unique_id for gc 11.0.3
Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
There is a race condition issue between parallel smb2 lock request.
Time
+
Thread A | Thread A
smb2_lock | smb2_lock
|
insert smb_lock to lock_list |
spin_unlock(&work->conn->llist_lock) |
|
| spin_lock(&conn->llist_lock);
| kfree(cmp_lock);
|
// UAF! |
list_add(&smb_lock->llist, &rollback_list) +
This patch swaps the line for adding the smb lock to the rollback list and
adding the lock list of connection to fix the race issue.
Reported-by: luosili <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
If parallel smb2 logoff requests come in before closing door, running
request count becomes more than 1 even though connection status is set to
KSMBD_SESS_NEED_RECONNECT. It can't get condition true, and sleep forever.
This patch fix race condition problem by returning error if connection
status was already set to KSMBD_SESS_NEED_RECONNECT.
Reported-by: luosili <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
drop reference after use opinfo.
Signed-off-by: luosili <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
fp can used in each command. If smb2_close command is coming at the
same time, UAF issue can happen by race condition.
Time
+
Thread A | Thread B1 B2 .... B5
smb2_open | smb2_close
|
__open_id |
insert fp to file_table |
|
| atomic_dec_and_test(&fp->refcount)
| if fp->refcount == 0, free fp by kfree.
// UAF! |
use fp |
+
This patch add f_state not to use freed fp is used and not to free fp in
use.
Reported-by: luosili <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Thread A + Thread B
ksmbd_session_lookup | smb2_sess_setup
sess = xa_load |
|
| xa_erase(&conn->sessions, sess->id);
|
| ksmbd_session_destroy(sess) --> kfree(sess)
|
// UAF! |
sess->last_active = jiffies |
+
This patch add rwsem to fix race condition between ksmbd_session_lookup
and ksmbd_expire_session.
Reported-by: luosili <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux
Pull rtla fixes from Daniel Bristot de Oliveira:
"rtla (Real-Time Linux Analysis) tool fixes.
Timerlat auto-analysis:
- Timerlat is reporting thread interference time without thread noise
events occurrence. It was caused because the thread interference
variable was not reset after the analysis of a timerlat activation
that did not hit the threshold.
- The IRQ handler delay is estimated from the delta of the IRQ
latency reported by timerlat, and the timestamp from IRQ handler
start event. If the delta is near-zero, the drift from the external
clock and the trace event and/or the overhead can cause the value
to be negative. If the value is negative, print a zero-delay.
- IRQ handlers happening after the timerlat thread event but before
the stop tracing were being reported as IRQ that happened before
the *current* IRQ occurrence. Ignore Previous IRQ noise in this
condition because they are valid only for the *next* timerlat
activation.
Timerlat user-space:
- Timerlat is stopping all user-space thread if a CPU becomes
offline. Do not stop the entire tool if a CPU is/become offline,
but only the thread of the unavailable CPU. Stop the tool only, if
all threads leave because the CPUs become/are offline.
man-pages:
- Fix command line example in timerlat hist man page"
* tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux:
rtla: fix a example in rtla-timerlat-hist.rst
rtla/timerlat: Do not stop user-space if a cpu is offline
rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample
rtla/timerlat_aa: Fix negative IRQ delay
rtla/timerlat_aa: Zero thread sum after every sample analysis
|
|
syzbot caught another data-race in netlink when
setting sk->sk_err.
Annotate all of them for good measure.
BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg
write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0:
netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
__sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
__do_sys_recvfrom net/socket.c:2247 [inline]
__se_sys_recvfrom net/socket.c:2243 [inline]
__x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1:
netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
__sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
__do_sys_recvfrom net/socket.c:2247 [inline]
__se_sys_recvfrom net/socket.c:2243 [inline]
__x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x00000000 -> 0x00000016
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, when hb_interval is changed by users, it won't take effect
until the next expiry of hb timer. As the default value is 30s, users
have to wait up to 30s to wait its hb_interval update to work.
This becomes pretty bad in containers where a much smaller value is
usually set on hb_interval. This patch improves it by resetting the
hb timer immediately once the value of hb_interval is updated by users.
Note that we don't address the already existing 'problem' when sending
a heartbeat 'on demand' if one hb has just been sent(from the timer)
mentioned in:
https://www.mail-archive.com/[email protected]/msg590224.html
Signed-off-by: Xin Long <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.1696172660.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
During the 4-way handshake, the transport's state is set to ACTIVE in
sctp_process_init() when processing INIT_ACK chunk on client or
COOKIE_ECHO chunk on server.
In the collision scenario below:
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021]
when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state,
sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook() where it
creates a new association and sets its transport to ACTIVE then updates
to the old association in sctp_assoc_update().
However, in sctp_assoc_update(), it will skip the transport update if it
finds a transport with the same ipaddr already existing in the old asoc,
and this causes the old asoc's transport state not to move to ACTIVE
after the handshake.
This means if DATA retransmission happens at this moment, it won't be able
to enter PF state because of the check 'transport->state == SCTP_ACTIVE'
in sctp_do_8_2_transport_strike().
This patch fixes it by updating the transport in sctp_assoc_update() with
sctp_assoc_add_peer() where it updates the transport state if there is
already a transport with the same ipaddr exists in the old asoc.
Signed-off-by: Xin Long <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.1696172325.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This commit fixes poor delayed ACK behavior that can cause poor TCP
latency in a particular boundary condition: when an application makes
a TCP socket write that is an exact multiple of the MSS size.
The problem is that there is painful boundary discontinuity in the
current delayed ACK behavior. With the current delayed ACK behavior,
we have:
(1) If an app reads data when > 1*MSS is unacknowledged, then
tcp_cleanup_rbuf() ACKs immediately because of:
tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||
(2) If an app reads all received data, and the packets were < 1*MSS,
and either (a) the app is not ping-pong or (b) we received two
packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately beecause
of:
((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
!inet_csk_in_pingpong_mode(sk))) &&
(3) *However*: if an app reads exactly 1*MSS of data,
tcp_cleanup_rbuf() does not send an immediate ACK. This is true
even if the app is not ping-pong and the 1*MSS of data had the PSH
bit set, suggesting the sending application completed an
application write.
Thus if the app is not ping-pong, we have this painful case where
>1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a
write whose last skb is an exact multiple of 1*MSS can get a 40ms
delayed ACK. This means that any app that transfers data in one
direction and takes care to align write size or packet size with MSS
can suffer this problem. With receive zero copy making 4KB MSS values
more common, it is becoming more common to have application writes
naturally align with MSS, and more applications are likely to
encounter this delayed ACK problem.
The fix in this commit is to refine the delayed ACK heuristics with a
simple check: immediately ACK a received 1*MSS skb with PSH bit set if
the app reads all data. Why? If an skb has a len of exactly 1*MSS and
has the PSH bit set then it is likely the end of an application
write. So more data may not be arriving soon, and yet the data sender
may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we
set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send
an ACK immediately if the app reads all of the data and is not
ping-pong. Note that this logic is also executed for the case where
len > MSS, but in that case this logic does not matter (and does not
hurt) because tcp_cleanup_rbuf() will always ACK immediately if the
app reads data and there is more than an MSS of unACKed data.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Neal Cardwell <[email protected]>
Reviewed-by: Yuchung Cheng <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Cc: Xin Guo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This commit fixes quick-ack counting so that it only considers that a
quick-ack has been provided if we are sending an ACK that newly
acknowledges data.
The code was erroneously using the number of data segments in outgoing
skbs when deciding how many quick-ack credits to remove. This logic
does not make sense, and could cause poor performance in
request-response workloads, like RPC traffic, where requests or
responses can be multi-segment skbs.
When a TCP connection decides to send N quick-acks, that is to
accelerate the cwnd growth of the congestion control module
controlling the remote endpoint of the TCP connection. That quick-ack
decision is purely about the incoming data and outgoing ACKs. It has
nothing to do with the outgoing data or the size of outgoing data.
And in particular, an ACK only serves the intended purpose of allowing
the remote congestion control to grow the congestion window quickly if
the ACK is ACKing or SACKing new data.
The fix is simple: only count packets as serving the goal of the
quickack mechanism if they are ACKing/SACKing new data. We can tell
whether this is the case by checking inet_csk_ack_scheduled(), since
we schedule an ACK exactly when we are ACKing/SACKing new data.
Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.")
Signed-off-by: Neal Cardwell <[email protected]>
Reviewed-by: Yuchung Cheng <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter patches for net
First patch resolves a regression with vlan header matching, this was
broken since 6.5 release. From myself.
Second patch fixes an ancient problem with sctp connection tracking in
case INIT_ACK packets are delayed. This comes with a selftest, both
patches from Xin Long.
Patch 4 extends the existing nftables audit selftest, from
Phil Sutter.
Patch 5, also from Phil, avoids a situation where nftables
would emit an audit record twice. This was broken since 5.13 days.
Patch 6, from myself, avoids spurious insertion failure if we encounter an
overlapping but expired range during element insertion with the
'nft_set_rbtree' backend. This problem exists since 6.2.
* tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
netfilter: nf_tables: Deduplicate nft_register_obj audit logs
selftests: netfilter: Extend nft_audit.sh
selftests: netfilter: test for sctp collision processing in nf_conntrack
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
netfilter: nft_payload: rebuild vlan header on h_proto access
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Correct grammar for better readability.
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Acked-by: Ilias Apalodimas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Honor 'nohandlecache' mount option by not starting laundromat thread
even when SMB server supports directory leases. Do not waste system
resources by having laundromat thread running with no directory
caching at all.
Fixes: 2da338ff752a ("smb3: do not start laundromat thread when dir leases disabled")
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Recent changes to kernel_connect() and kernel_bind() ensure that
callers are insulated from changes to the address parameter made by BPF
SOCK_ADDR hooks. This patch wraps direct calls to ops->connect() and
ops->bind() with kernel_connect() and kernel_bind() to ensure that SMB
mounts do not see their mount address overwritten in such cases.
Link: https://lore.kernel.org/netdev/[email protected]/
Cc: <[email protected]> # 6.0+
Signed-off-by: Jordan Rife <[email protected]>
Acked-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
It seems that tipc_crypto_key_revoke() could be be invoked by
wokequeue tipc_crypto_work_rx() under process context and
timer/rx callback under softirq context, thus the lock acquisition
on &tx->lock seems better use spin_lock_bh() to prevent possible
deadlock.
This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.
tipc_crypto_work_rx() <workqueue>
--> tipc_crypto_key_distr()
--> tipc_bcast_xmit()
--> tipc_bcbase_xmit()
--> tipc_bearer_bc_xmit()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock)
<timer interrupt>
--> tipc_disc_timeout()
--> tipc_bearer_xmit_skb()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock) <deadlock here>
Signed-off-by: Chengfeng Ye <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Acked-by: Jon Maloy <[email protected]>
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The STM32MP1 keeps clk_rx enabled during suspend, and therefore the
driver does not enable the clock in stm32_dwmac_init() if the device was
suspended. The problem is that this same code runs on STM32 MCUs, which
do disable clk_rx during suspend, causing the clock to never be
re-enabled on resume.
This patch adds a variant flag to indicate that clk_rx remains enabled
during suspend, and uses this to decide whether to enable the clock in
stm32_dwmac_init() if the device was suspended.
This approach fixes this specific bug with limited opportunity for
unintended side-effects, but I have a follow up patch that will refactor
the clock configuration and hopefully make it less error prone.
Fixes: 6528e02cc9ff ("net: ethernet: stmmac: add adaptation for stm32mp157c.")
Signed-off-by: Ben Wolsieffer <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Battery information reported by the driver depends on the power supply
subsystem. Select the required subsystem when the HID_NVIDIA_SHIELD Kconfig
option is enabled.
Fixes: 3ab196f88237 ("HID: nvidia-shield: Add battery support for Thunderstrike")
Signed-off-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
The following crash is observed 100% of the time during resume from
the hibernation on a x86 QEMU system.
[ 12.931887] ? __die_body+0x1a/0x60
[ 12.932324] ? page_fault_oops+0x156/0x420
[ 12.932824] ? search_exception_tables+0x37/0x50
[ 12.933389] ? fixup_exception+0x21/0x300
[ 12.933889] ? exc_page_fault+0x69/0x150
[ 12.934371] ? asm_exc_page_fault+0x26/0x30
[ 12.934869] ? get_buffer.constprop.0+0xac/0x100
[ 12.935428] snapshot_write_next+0x7c/0x9f0
[ 12.935929] ? submit_bio_noacct_nocheck+0x2c2/0x370
[ 12.936530] ? submit_bio_noacct+0x44/0x2c0
[ 12.937035] ? hib_submit_io+0xa5/0x110
[ 12.937501] load_image+0x83/0x1a0
[ 12.937919] swsusp_read+0x17f/0x1d0
[ 12.938355] ? create_basic_memory_bitmaps+0x1b7/0x240
[ 12.938967] load_image_and_restore+0x45/0xc0
[ 12.939494] software_resume+0x13c/0x180
[ 12.939994] resume_store+0xa3/0x1d0
The commit being fixed introduced a bug in copying the zero bitmap
to safe pages. A temporary bitmap is allocated with PG_ANY flag in
prepare_image() to make a copy of zero bitmap after the unsafe pages
are marked. Freeing this temporary bitmap with PG_UNSAFE_KEEP later
results in an inconsistent state of unsafe pages. Since free bit is
left as is for this temporary bitmap after free, these pages are
treated as unsafe pages when they are allocated again. This results
in incorrect calculation of the number of pages pre-allocated for the
image.
nr_pages = (nr_zero_pages + nr_copy_pages) - nr_highmem - allocated_unsafe_pages;
The allocate_unsafe_pages is estimated to be higher than the actual
which results in running short of pages in safe_pages_list. Hence the
crash is observed in get_buffer() due to NULL pointer access of
safe_pages_list.
Fix this issue by creating the temporary zero bitmap from safe pages
(free bit not set) so that the corresponding free bits can be cleared
while freeing this bitmap.
Fixes: 005e8dddd497 ("PM: hibernate: don't store zero pages in the image file")
Suggested-by:: Brian Geffon <[email protected]>
Signed-off-by: Pavankumar Kondeti <[email protected]>
Reviewed-by: Brian Geffon <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|