Age | Commit message (Collapse) | Author | Files | Lines |
|
The MadCatz variant of the MMO7 mouse has the ID 0738:1713 and the same
quirks as the Saitek variant.
Signed-off-by: Samuel Bailey <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
Add documentation for user recovery feature of ublk subsystem.
Signed-off-by: ZiyangZhang <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
NMI interrupts should exit with EXCEPTION_RESTORE_REGS not with
interrupt_return_srr, which is what the perf NMI handler currently does.
This breaks if a PMI hits after interrupt_exit_user_prepare_main() has
switched the context tracking to user mode, then the CT_WARN_ON() in
interrupt_exit_kernel_prepare() fires because it returns to kernel with
context set to user.
This could possibly be solved by soft-disabling PMIs in the exit path,
but that reduces our ability to profile that code. The warning could be
removed, but it's potentially useful.
All other NMIs and soft-NMIs return using EXCEPTION_RESTORE_REGS, so
this makes perf interrupts consistent with that and seems like the best
fix.
Signed-off-by: Nicholas Piggin <[email protected]>
[mpe: Squash in fixups from Nick]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
NMI PMIs really should not return using the normal interrupt_return
function. If such a PMI hits in code returning to user with the context
switched to user mode, this warning can fire. This was enough to cause
crashes when reproducing on 64s, because another perf interrupt would
hit while reporting bug, and that would cause another bug, and so on
until smashing the stack.
Work around that particular crash for now by just disabling that context
warning for PMIs. This is a hack and not a complete fix, there could be
other such problems lurking in corners. But it does fix the known crash.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The context tracking code in PR-KVM and BookE implementations is not
complete, and can cause host crashes if context tracking is enabled.
Make these implementations depend on !CONTEXT_TRACKING_USER.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
schedule must not be explicitly called while KUAP is unlocked, because
the AMR register will not be saved across the context switch on
64s (preemption is allowed because that is driven by interrupts which do
save the AMR).
exit_vmx_usercopy() runs inside an unlocked user access region, and it
calls preempt_enable() which will call schedule() if need_resched() was
set while non-preemptible. This can cause tasks to run unprotected when
the should not, and can cause the user copy to be improperly blocked
when scheduling back to it.
Fix this by avoiding the explicit resched for preempt kernels by
generating an interrupt to reschedule the context if need_resched() got
set.
Reported-by: Samuel Holland <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
stop_machine_cpuslocked takes a mutex so it must be called in a
preemptible context, so it can't simply be fixed by disabling
preemption.
This is not a bug, because CPU hotplug is locked, so this processor will
call in to the stop machine function. So raw_smp_processor_id() could be
used. This leaves a small chance that this thread will be migrated to
another CPU, so the master work would be done by a CPU from a different
context. Better for test coverage to make that a common case by just
having the first CPU to call in become the master.
Signed-off-by: Nicholas Piggin <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
apply_to_page_range on kernel pages does not disable preemption, which
is a requirement for hash's lazy mmu mode, which keeps track of the
TLBs to flush with a per-cpu array.
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This lock is taken while the raw kfence_freelist_lock is held, so it
must also be a raw spinlock, as reported by lockdep when raw lock
nesting checking is enabled.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
With kfence enabled, there are several cases where HPTE and TLBIE locks
are called from softirq context, for example:
WARNING: inconsistent lock state
6.0.0-11845-g0cbbc95b12ac #1 Tainted: G N
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
swapper/0/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
c000000002734de8 (native_tlbie_lock){+.?.}-{2:2}, at: .native_hpte_updateboltedpp+0x1a4/0x600
{IN-SOFTIRQ-W} state was registered at:
.lock_acquire+0x20c/0x520
._raw_spin_lock+0x4c/0x70
.native_hpte_invalidate+0x62c/0x840
.hash__kernel_map_pages+0x450/0x640
.kfence_protect+0x58/0xc0
.kfence_guarded_free+0x374/0x5a0
.__slab_free+0x3d0/0x630
.put_cred_rcu+0xcc/0x120
.rcu_core+0x3c4/0x14e0
.__do_softirq+0x1dc/0x7dc
.do_softirq_own_stack+0x40/0x60
Fix this by consistently disabling irqs while taking either of these
locks. Don't just disable bh because several of the more common cases
already disable irqs, so this just makes the locks always irq-safe.
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add lockdep annotation for the HPTE bit-spinlock. Modern systems don't
take the tlbie lock, so this shows up some of the same lockdep warnings
that were being reported by the ppc970. And they're not taken in exactly
the same places so this is nice to have in its own right.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The hypervisor assigns VAS (Virtual Accelerator Switchboard)
windows depends on cores configured in LPAR. The kernel uses
OF reconfig notifier to reconfig VAS windows for DLPAR CPU event.
In the case of shared CPU mode partition, the hypervisor assigns
VAS windows depends on CPU entitled capacity, not based on vcpus.
When the user changes CPU entitled capacity for the partition,
drmgr uses /proc/ppc64/lparcfg interface to notify the kernel.
This patch adds the following changes to update VAS resources
for shared mode:
- Call vas reconfig windows from lparcfg_write()
- Ignore reconfig changes in the VAS notifier
Signed-off-by: Haren Myneni <[email protected]>
[mpe: Rework error handling, report any errors as EIO]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
irq_default_primary_handler() can be used only with IRQF_ONESHOT
flag, but the flag disables IRQ before executing the thread handler
and enables it after the interrupt is handled. But this IRQ disable
sets the VAS IRQ OFF state in the hypervisor. In case if NX faults
during this window, the hypervisor will not deliver the fault
interrupt to the partition and the user space may wait continuously
for the CSB update. So use VAS specific IRQ handler instead of
calling the default primary handler.
Increment pending_faults counter in IRQ handler and the bottom
thread handler will process all faults based on this counter.
In case if the another interrupt is received while the thread is
running, it will be processed using this counter. The synchronization
of top and bottom handlers will be done with IRQTF_RUNTHREAD flag
and will re-enter to bottom half if this flag is set.
Signed-off-by: Haren Myneni <[email protected]>
Reviewed-by: Frederic Barrat <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
If SOUNDWIRE is enabled, then SND_SOC_SC7180 should depend on
SOUNDWIRE to prevent SOUNDWIRE=m and SND_SOC_SC7180=y, which causes
build errors:
s390-linux-ld: sound/soc/qcom/common.o: in function `qcom_snd_sdw_prepare':
common.c:(.text+0x140): undefined reference to `sdw_disable_stream'
s390-linux-ld: common.c:(.text+0x14a): undefined reference to `sdw_deprepare_stream'
s390-linux-ld: common.c:(.text+0x158): undefined reference to `sdw_prepare_stream'
s390-linux-ld: common.c:(.text+0x16a): undefined reference to `sdw_enable_stream'
s390-linux-ld: common.c:(.text+0x17c): undefined reference to `sdw_deprepare_stream'
s390-linux-ld: sound/soc/qcom/common.o: in function `qcom_snd_sdw_hw_free':
common.c:(.text+0x344): undefined reference to `sdw_disable_stream'
s390-linux-ld: common.c:(.text+0x34e): undefined reference to `sdw_deprepare_stream'
Fixes: 3bd975f3ae0a ("ASoC: qcom: sm8250: move some code to common")
Fixes: 9e3ecb5b1681 ("ASoC: qcom: sc7180: Add machine driver for sound card registration")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: kernel test robot <[email protected]>
Cc: Srinivas Kandagatla <[email protected]>
Cc: Banajit Goswami <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Liam Girdwood <[email protected]>
Cc: Ajit Pandey <[email protected]>
Cc: Cheng-Yi Chiang <[email protected]>
Cc: Jaroslav Kysela <[email protected]>
Cc: Takashi Iwai <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
If CONFIG_SND_SOC_TLV320ADC3XXX=y:
`.exit.text' referenced in section `.data' of sound/soc/codecs/tlv320adc3xxx.o: defined in discarded section `.exit.text' of sound/soc/codecs/tlv320adc3xxx.o
Fix this by wrapping the adc3xxx_i2c_remove() pointer in __exit_p().
Fixes: e9a3b57efd28fe88 ("ASoC: codec: tlv320adc3xxx: New codec driver")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/3225ba4cfe558d9380155e75385954dd21d4e7eb.1665909132.git.geert@linux-m68k.org
Signed-off-by: Mark Brown <[email protected]>
|
|
In the probe path, convert pr_err() to dev_err_probe() which will
check if error code is -EPROBE_DEFER and prints the error name.
It also sets the defer probe reason which can be checked later
through debugfs. It's more simple in error path.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
In the probe path, dev_err() can be replaced with dev_err_probe()
which will check if error code is -EPROBE_DEFER and prints the
error name. It also sets the defer probe reason which can be
checked later through debugfs. It's more simple in error path.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
In the probe path, dev_err() can be replaced with dev_err_probe()
which will check if error code is -EPROBE_DEFER and prints the
error name. It also sets the defer probe reason which can be
checked later through debugfs. It's more simple in error path.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
In the probe path, dev_err() can be replaced with dev_err_probe()
which will check if error code is -EPROBE_DEFER and prints the
error name. It also sets the defer probe reason which can be
checked later through debugfs. It's more simple in error path.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
The speedbin_nvmem parameter is not used for
get_krait_bin_format_{a,b}. Let's remove the parameter to make the code
cleaner.
Signed-off-by: Fabien Parent <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
This commit fixes a kernel oops because of a write in some read-only memory:
[ 9.068287] Unable to handle kernel write to read-only memory at virtual address ffff800009240ad8
..snip..
[ 9.138790] Internal error: Oops: 9600004f [#1] PREEMPT SMP
..snip..
[ 9.269161] Call trace:
[ 9.276271] __memcpy+0x5c/0x230
[ 9.278531] snprintf+0x58/0x80
[ 9.282002] qcom_cpufreq_msm8939_name_version+0xb4/0x190
[ 9.284869] qcom_cpufreq_probe+0xc8/0x39c
..snip..
The following line defines a pointer that point to a char buffer stored
in read-only memory:
char *pvs_name = "speedXX-pvsXX-vXX";
This pointer is meant to hold a template "speedXX-pvsXX-vXX" where the
XX values get overridden by the qcom_cpufreq_krait_name_version function. Since
the template is actually stored in read-only memory, when the function
executes the following call we get an oops:
snprintf(*pvs_name, sizeof("speedXX-pvsXX-vXX"), "speed%d-pvs%d-v%d",
speed, pvs, pvs_ver);
To fix this issue, we instead store the template name onto the stack by
using the following syntax:
char pvs_name_buffer[] = "speedXX-pvsXX-vXX";
Because the `pvs_name` needs to be able to be assigned to NULL, the
template buffer is stored in the pvs_name_buffer and not under the
pvs_name variable.
Cc: v5.7+ <[email protected]> # v5.7+
Fixes: a8811ec764f9 ("cpufreq: qcom: Add support for krait based socs")
Signed-off-by: Fabien Parent <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
If for some reason the speedbin length is incorrect, then there is a
memory leak in the error path because we never free the speedbin buffer.
This commit fixes the error path to always free the speedbin buffer.
Cc: v5.7+ <[email protected]> # v5.7+
Fixes: a8811ec764f9 ("cpufreq: qcom: Add support for krait based socs")
Signed-off-by: Fabien Parent <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
When the Tegra194 CPUFREQ driver is built as a module it is not
automatically loaded as expected on Tegra194 devices. Populate the
MODULE_DEVICE_TABLE to fix this.
Cc: v5.9+ <[email protected]> # v5.9+
Fixes: df320f89359c ("cpufreq: Add Tegra194 cpufreq driver")
Signed-off-by: Jon Hunter <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
There is some code in the parser that tries to read 0x8000
bytes into a block to "read in the middle" of the block. Well
that only works if the block is also 0x10000 bytes all the time,
else we get these parse errors as we reach the end of the flash:
spi-nor spi0.0: mx25l1606e (2048 Kbytes)
mtd_read error while parsing (offset: 0x200000): -22
mtd_read error while parsing (offset: 0x201000): -22
(...)
Fix the code to do what I think was intended.
Cc: [email protected]
Fixes: f0501e81fbaa ("mtd: bcm47xxpart: alternative MAGIC for board_data partition")
Cc: Rafał Miłecki <[email protected]>
Cc: Florian Fainelli <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
|
|
If the initialization fails in calling addrconf_init_net(), devconf_all is
the pointer that has been released. Then ip6mr_sk_done() is called to
release the net, accessing devconf->mc_forwarding directly causes invalid
pointer access.
The process is as follows:
setup_net()
ops_init()
addrconf_init_net()
all = kmemdup(...) ---> alloc "all"
...
net->ipv6.devconf_all = all;
__addrconf_sysctl_register() ---> failed
...
kfree(all); ---> ipv6.devconf_all invalid
...
ops_exit_list()
...
ip6mr_sk_done()
devconf = net->ipv6.devconf_all;
//devconf is invalid pointer
if (!devconf || !atomic_read(&devconf->mc_forwarding))
The following is the Call Trace information:
BUG: KASAN: use-after-free in ip6mr_sk_done+0x112/0x3a0
Read of size 4 at addr ffff888075508e88 by task ip/14554
Call Trace:
<TASK>
dump_stack_lvl+0x8e/0xd1
print_report+0x155/0x454
kasan_report+0xba/0x1f0
kasan_check_range+0x35/0x1b0
ip6mr_sk_done+0x112/0x3a0
rawv6_close+0x48/0x70
inet_release+0x109/0x230
inet6_release+0x4c/0x70
sock_release+0x87/0x1b0
igmp6_net_exit+0x6b/0x170
ops_exit_list+0xb0/0x170
setup_net+0x7ac/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f7963322547
</TASK>
Allocated by task 14554:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0xa1/0xb0
__kmalloc_node_track_caller+0x4a/0xb0
kmemdup+0x28/0x60
addrconf_init_net+0x1be/0x840
ops_init+0xa5/0x410
setup_net+0x5aa/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Freed by task 14554:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
____kasan_slab_free+0x155/0x1b0
slab_free_freelist_hook+0x11b/0x220
__kmem_cache_free+0xa4/0x360
addrconf_init_net+0x623/0x840
ops_init+0xa5/0x410
setup_net+0x5aa/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fixes: 7d9b1b578d67 ("ip6mr: fix use-after-free in ip6mr_sk_done()")
Signed-off-by: Zhengchao Shao <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Currently, the patch application logic checks whether the revision
needs to be applied on each logical CPU (SMT thread). Therefore, on SMT
designs where the microcode engine is shared between the two threads,
the application happens only on one of them as that is enough to update
the shared microcode engine.
However, there are microcode patches which do per-thread modification,
see Link tag below.
Therefore, drop the revision check and try applying on each thread. This
is what the BIOS does too so this method is very much tested.
Btw, change only the early paths. On the late loading paths, there's no
point in doing per-thread modification because if is it some case like
in the bugzilla below - removing a CPUID flag - the kernel cannot go and
un-use features it has detected are there early. For that, one should
use early loading anyway.
[ bp: Fixes does not contain the oldest commit which did check for
equality but that is good enough. ]
Fixes: 8801b3fcb574 ("x86/microcode/AMD: Rework container parsing")
Reported-by: Ștefan Talpalaru <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Tested-by: Ștefan Talpalaru <[email protected]>
Cc: <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211
|
|
The interrupt controller can detect only link changes. So in case an
external device generated a level based interrupt, then the interrupt
controller detected correctly the first edge. But the problem was that
the interrupt controller was detecting also the edge when the interrupt
was cleared. So it would generate another interrupt.
The fix for this is to clear the second interrupt but still check the
interrupt line status.
Fixes: c297561bc98a ("pinctrl: ocelot: Fix interrupt controller")
Signed-off-by: Horatiu Vultur <[email protected]>
Tested-by: Michael Walle <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
Originally the absence of the marvell,nand-keep-config property caused
the setup_data_interface function to be provided. However when
setup_data_interface was moved into nand_controller_ops the logic was
unintentionally inverted. Update the logic so that only if the
marvell,nand-keep-config property is present the bootloader NAND config
kept.
Cc: [email protected]
Fixes: 7a08dbaedd36 ("mtd: rawnand: Move ->setup_data_interface() to nand_controller_ops")
Signed-off-by: Tony O'Brien <[email protected]>
Signed-off-by: Chris Packham <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
|
|
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context.
Cc: [email protected]
Fixes: d7d9f8ec77fe9 ("mtd: rawnand: add NVIDIA Tegra NAND Flash controller driver")
Signed-off-by: Zhang Qilong <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
|
|
Follow the advice of the Documentation/filesystems/sysfs.rst
and show() should only use sysfs_emit() or sysfs_emit_at()
when formatting the value to be returned to user space.
Signed-off-by: Xuezhi Zhang <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
|
|
The 'chip_np' returned by of_get_next_child() with refcount decremented,
of_node_put() need be called in error path to decrease the refcount.
Fixes: bfc618fcc3f1 ("mtd: rawnand: intel: Read the chip-select line from the correct OF node")
Signed-off-by: Yang Yingliang <[email protected]>
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
|
|
Follow the advice of the Documentation/filesystems/sysfs.rst
and show() should only use sysfs_emit() or sysfs_emit_at()
when formatting the value to be returned to user space.
Signed-off-by: Xuezhi Zhang <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
|
|
When the text console is scrolling text upwards it calls the fillrect()
function to empty the new line. The current implementation doesn't seem
to work correctly on HCRX cards in 32-bit mode and leave garbage in that
line instead. Fix it by falling back to standard cfb_fillrect() in that
case.
Signed-off-by: Helge Deller <[email protected]>
Cc: <[email protected]>
|
|
Even in the presence of problems (here: regulator_disable() might fail),
it's important to unregister all resources acquired during .probe() and
disable the device (i.e. DMA activity) because even if .remove() returns
an error code, the device is removed and the .remove() callback is never
called again later to catch up.
This is a preparation for making platform remove callbacks return void.
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
Fixes: 611097d5daea ("fbdev: da8xx: add support for a regulator")
|
|
When we call connect() for a UDP socket in a reuseport group, we have
to update sk->sk_reuseport_cb->has_conns to 1. Otherwise, the kernel
could select a unconnected socket wrongly for packets sent to the
connected socket.
However, the current way to set has_conns is illegal and possible to
trigger that problem. reuseport_has_conns() changes has_conns under
rcu_read_lock(), which upgrades the RCU reader to the updater. Then,
it must do the update under the updater's lock, reuseport_lock, but
it doesn't for now.
For this reason, there is a race below where we fail to set has_conns
resulting in the wrong socket selection. To avoid the race, let's split
the reader and updater with proper locking.
cpu1 cpu2
+----+ +----+
__ip[46]_datagram_connect() reuseport_grow()
. .
|- reuseport_has_conns(sk, true) |- more_reuse = __reuseport_alloc(more_socks_size)
| . |
| |- rcu_read_lock()
| |- reuse = rcu_dereference(sk->sk_reuseport_cb)
| |
| | | /* reuse->has_conns == 0 here */
| | |- more_reuse->has_conns = reuse->has_conns
| |- reuse->has_conns = 1 | /* more_reuse->has_conns SHOULD BE 1 HERE */
| | |
| | |- rcu_assign_pointer(reuse->socks[i]->sk_reuseport_cb,
| | | more_reuse)
| `- rcu_read_unlock() `- kfree_rcu(reuse, rcu)
|
|- sk->sk_state = TCP_ESTABLISHED
Note the likely(reuse) in reuseport_has_conns_set() is always true,
but we put the test there for ease of review. [0]
For the record, usually, sk_reuseport_cb is changed under lock_sock().
The only exception is reuseport_grow() & TCP reqsk migration case.
1) shutdown() TCP listener, which is moved into the latter part of
reuse->socks[] to migrate reqsk.
2) New listen() overflows reuse->socks[] and call reuseport_grow().
3) reuse->max_socks overflows u16 with the new listener.
4) reuseport_grow() pops the old shutdown()ed listener from the array
and update its sk->sk_reuseport_cb as NULL without lock_sock().
shutdown()ed TCP sk->sk_reuseport_cb can be changed without lock_sock(),
but, reuseport_has_conns_set() is called only for UDP under lock_sock(),
so likely(reuse) never be false in reuseport_has_conns_set().
[0]: https://lore.kernel.org/netdev/CANn89iLja=eQHbsM_Ta2sQF0tOGU8vAGrh_izRuuHjuO1ouUag@mail.gmail.com/
Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
This fixes unbalanced of_node_put():
[ 1.078910] 6 cmdlinepart partitions found on MTD device gpmi-nand
[ 1.085116] Creating 6 MTD partitions on "gpmi-nand":
[ 1.090181] 0x000000000000-0x000008000000 : "nandboot"
[ 1.096952] 0x000008000000-0x000009000000 : "nandfit"
[ 1.103547] 0x000009000000-0x00000b000000 : "nandkernel"
[ 1.110317] 0x00000b000000-0x00000c000000 : "nanddtb"
[ 1.115525] ------------[ cut here ]------------
[ 1.120141] refcount_t: addition on 0; use-after-free.
[ 1.125328] WARNING: CPU: 0 PID: 1 at lib/refcount.c:25 refcount_warn_saturate+0xdc/0x148
[ 1.133528] Modules linked in:
[ 1.136589] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc7-next-20220930-04543-g8cf3f7
[ 1.146342] Hardware name: Freescale i.MX8DXL DDR3L EVK (DT)
[ 1.151999] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1.158965] pc : refcount_warn_saturate+0xdc/0x148
[ 1.163760] lr : refcount_warn_saturate+0xdc/0x148
[ 1.168556] sp : ffff800009ddb080
[ 1.171866] x29: ffff800009ddb080 x28: ffff800009ddb35a x27: 0000000000000002
[ 1.179015] x26: ffff8000098b06ad x25: ffffffffffffffff x24: ffff0a00ffffff05
[ 1.186165] x23: ffff00001fdf6470 x22: ffff800009ddb367 x21: 0000000000000000
[ 1.193314] x20: ffff00001fdfebe8 x19: ffff00001fdfec50 x18: ffffffffffffffff
[ 1.200464] x17: 0000000000000000 x16: 0000000000000118 x15: 0000000000000004
[ 1.207614] x14: 0000000000000fff x13: ffff800009bca248 x12: 0000000000000003
[ 1.214764] x11: 00000000ffffefff x10: c0000000ffffefff x9 : 4762cb2ccb52de00
[ 1.221914] x8 : 4762cb2ccb52de00 x7 : 205d313431303231 x6 : 312e31202020205b
[ 1.229063] x5 : ffff800009d55c1f x4 : 0000000000000001 x3 : 0000000000000000
[ 1.236213] x2 : 0000000000000000 x1 : ffff800009954be6 x0 : 000000000000002a
[ 1.243365] Call trace:
[ 1.245806] refcount_warn_saturate+0xdc/0x148
[ 1.250253] kobject_get+0x98/0x9c
[ 1.253658] of_node_get+0x20/0x34
[ 1.257072] of_fwnode_get+0x3c/0x54
[ 1.260652] fwnode_get_nth_parent+0xd8/0xf4
[ 1.264926] fwnode_full_name_string+0x3c/0xb4
[ 1.269373] device_node_string+0x498/0x5b4
[ 1.273561] pointer+0x41c/0x5d0
[ 1.276793] vsnprintf+0x4d8/0x694
[ 1.280198] vprintk_store+0x164/0x528
[ 1.283951] vprintk_emit+0x98/0x164
[ 1.287530] vprintk_default+0x44/0x6c
[ 1.291284] vprintk+0xf0/0x134
[ 1.294428] _printk+0x54/0x7c
[ 1.297486] of_node_release+0xe8/0x128
[ 1.301326] kobject_put+0x98/0xfc
[ 1.304732] of_node_put+0x1c/0x28
[ 1.308137] add_mtd_device+0x484/0x6d4
[ 1.311977] add_mtd_partitions+0xf0/0x1d0
[ 1.316078] parse_mtd_partitions+0x45c/0x518
[ 1.320439] mtd_device_parse_register+0xb0/0x274
[ 1.325147] gpmi_nand_probe+0x51c/0x650
[ 1.329074] platform_probe+0xa8/0xd0
[ 1.332740] really_probe+0x130/0x334
[ 1.336406] __driver_probe_device+0xb4/0xe0
[ 1.340681] driver_probe_device+0x3c/0x1f8
[ 1.344869] __driver_attach+0xdc/0x1a4
[ 1.348708] bus_for_each_dev+0x80/0xcc
[ 1.352548] driver_attach+0x24/0x30
[ 1.356127] bus_add_driver+0x108/0x1f4
[ 1.359967] driver_register+0x78/0x114
[ 1.363807] __platform_driver_register+0x24/0x30
[ 1.368515] gpmi_nand_driver_init+0x1c/0x28
[ 1.372798] do_one_initcall+0xbc/0x238
[ 1.376638] do_initcall_level+0x94/0xb4
[ 1.380565] do_initcalls+0x54/0x94
[ 1.384058] do_basic_setup+0x1c/0x28
[ 1.387724] kernel_init_freeable+0x110/0x188
[ 1.392084] kernel_init+0x20/0x1a0
[ 1.395578] ret_from_fork+0x10/0x20
[ 1.399157] ---[ end trace 0000000000000000 ]---
[ 1.403782] ------------[ cut here ]------------
Reported-by: Han Xu <[email protected]>
Fixes: ad9b10d1eaada169 ("mtd: core: introduce of support for dynamic partitions")
Signed-off-by: Rafał Miłecki <[email protected]>
Tested-by: Han Xu <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
|
|
The Intel SPI-NOR controller does not support the 4-byte address opcode
so ->set_4byte_addr_mode() ends up returning -ENOTSUPP and the SPI flash
chip probe fail like this:
[ 12.291082] spi-nor: probe of spi0.0 failed with error -524
Whereas previously before commit 08412e72afba ("mtd: spi-nor: core:
Return error code from set_4byte_addr_mode()") it worked just fine.
Fix this by ignoring -ENOTSUPP in spi_nor_init().
Fixes: 08412e72afba ("mtd: spi-nor: core: Return error code from set_4byte_addr_mode()")
Cc: [email protected]
Reported-by: Hongyu Ning <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Michael Walle <[email protected]>
Acked-by: Tudor Ambarus <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
|
|
This reverts commit 133ad0d9af99bdca90705dadd8d31c20bfc9919f.
On systems with older PMUFW (Xilinx ZynqMP Platform Management Firmware)
using these pinctrl properties can cause system hang because there is
missing feature autodetection.
When this feature is implemented, support for these two properties should
bring back.
Cc: [email protected]
Signed-off-by: Sai Krishna Potthuri <[email protected]>
Acked-by: Michal Simek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
bias-high-impedance"
This reverts commit ad2bea79ef0144043721d4893eef719c907e2e63.
On systems with older PMUFW (Xilinx ZynqMP Platform Management Firmware)
using these pinctrl properties can cause system hang because there is
missing feature autodetection.
When this feature is implemented in the PMUFW, support for these two
properties should bring back.
Cc: [email protected]
Signed-off-by: Sai Krishna Potthuri <[email protected]>
Acked-by: Michal Simek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
Commit 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox
command") introduced allocations for the VMID resources in
lpfc_create_port() after the call to scsi_host_alloc(). Upon failure on the
VMID allocations, the new code would branch to the 'out' label, which
returns NULL without unwinding anything, thus skipping the call to
scsi_host_put().
Fix the problem by creating a separate label 'out_free_vmid' to unwind the
VMID resources and make the 'out_put_shost' label call only
scsi_host_put(), as was done before the introduction of allocations for
VMID.
Fixes: 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command")
Signed-off-by: Rafael Mendonca <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: James Smart <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Userspace can currently write to sysfs to transition sdev_state to RUNNING
or OFFLINE from any source state. This causes issues because proper
transitioning out of some states involves steps besides just changing
sdev_state, so allowing userspace to change sdev_state regardless of the
source state can result in inconsistencies; e.g. with ISCSI we can end up
with sdev_state == SDEV_RUNNING while the device queue is quiesced. Any
task attempting I/O on the device will then hang, and in more recent
kernels, iscsid will hang as well.
More detail about this bug is provided in my first attempt:
https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Uday Shankar <[email protected]>
Suggested-by: Mike Christie <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix a recent regression where a sleeping kernfs function is called
with css_set_lock (spinlock) held
- Revert the commit to enable cgroup1 support for cgroup_get_from_fd/file()
Multiple users assume that the lookup only works for cgroup2 and
breaks when fed a cgroup1 file. Instead, introduce a separate set of
functions to lookup both v1 and v2 and use them where the user
explicitly wants to support both versions.
- Compat update for tools/perf/util/bpf_skel/bperf_cgroup.bpf.c.
- Add Josef Bacik as a blkcg maintainer.
* tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
blkcg: Update MAINTAINERS entry
mm: cgroup: fix comments for get from fd/file helpers
perf stat: Support old kernels for bperf cgroup counting
bpf: cgroup_iter: support cgroup1 using cgroup fd
cgroup: add cgroup_v1v2_get_from_[fd/file]()
Revert "cgroup: enable cgroup_get_from_file() on cgroup1"
cgroup: Reorganize css_set_lock and kernfs path processing
|
|
Since commit d9820ff ("ARC: mm: switch pgtable_t back to struct page *")
a memory leakage problem occurs. Memory allocated for page table entries
not released during process termination. This issue can be reproduced by
a small program that allocates a large amount of memory. After several
runs, you'll see that the amount of free memory has reduced and will
continue to reduce after each run. All ARC CPUs are effected by this
issue. The issue was introduced since the kernel stable release v5.15-rc1.
As described in commit d9820ff after switch pgtable_t back to struct
page *, a pointer to "struct page" and appropriate functions are used to
allocate and free a memory page for PTEs, but the pmd_pgtable macro hasn't
changed and returns the direct virtual address from the PMD (PGD) entry.
Than this address used as a parameter in the __pte_free() and as a result
this function couldn't release memory page allocated for PTEs.
Fix this issue by changing the pmd_pgtable macro and returning pointer to
struct page.
Fixes: d9820ff76f95 ("ARC: mm: switch pgtable_t back to struct page *")
Cc: Mike Rapoport <[email protected]>
Cc: <[email protected]> # 5.15.x
Signed-off-by: Pavel Kozlov <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
Clean up config files by:
- removing configs that were deleted in the past
- removing configs not in tree and without recently pending patches
- adding new configs that are replacements for old configs in the file
For some detailed information, see Link.
Link: https://lore.kernel.org/kernel-janitors/[email protected]/
Signed-off-by: Lukas Bulwahn <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
Add 'volatile' to iounmap()'s argument to prevent build warnings.
This make it the same as other major architectures.
Placates these warnings: (12 such warnings)
../drivers/video/fbdev/riva/fbdev.c: In function 'rivafb_probe':
../drivers/video/fbdev/riva/fbdev.c:2067:42: error: passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type [-Werror=discarded-qualifiers]
2067 | iounmap(default_par->riva.PRAMIN);
Fixes: 1162b0701b14b ("ARC: I/O and DMA Mappings")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: [email protected]
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
In accordance with the Generic EHCI/OHCI bindings the corresponding node
name is suppose to comply with the Generic USB HCD DT schema, which
requires the USB nodes to have the name acceptable by the regexp:
"^usb(@.*)?" . Make sure the "generic-ehci" and "generic-ohci"-compatible
nodes are correctly named.
Signed-off-by: Serge Semin <[email protected]>
Acked-by: Alexey Brodkin <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
As per asm-generic definition and other architectures __fls should
return unsigned long.
No functional change is expected as return value should fit in unsigned
long.
Reviewed-by: Cezary Rojewski <[email protected]>
Signed-off-by: Amadeusz Sławiński <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
Change 'seperate' to 'separate'.
Signed-off-by: Zhang Jiaming <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
- Remove one of the repeated 'call' in comment line 396.
- Delete the redundant word 'to', 'since'
Signed-off-by: Jilin Yuan <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
When compiling with clang and W=1, the following warning is generated:
drivers/ata/ahci_qoriq.c:283:22: error: cast to smaller integer type
'enum ahci_qoriq_type' from 'const void *'
[-Werror,-Wvoid-pointer-to-enum-cast]
qoriq_priv->type = (enum ahci_qoriq_type)of_id->data;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by using a cast to unsigned long to match the "void *" type
size of of_id->data.
Signed-off-by: Damien Le Moal <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
|