Age | Commit message (Collapse) | Author | Files | Lines |
|
This testcase
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <pthread.h>
#include <assert.h>
void *tf(void *arg)
{
return NULL;
}
int main(void)
{
int pid = fork();
if (!pid) {
kill(getpid(), SIGSTOP);
pthread_t th;
pthread_create(&th, NULL, tf, NULL);
return 0;
}
waitpid(pid, NULL, WSTOPPED);
ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE);
waitpid(pid, NULL, 0);
ptrace(PTRACE_CONT, pid, 0,0);
waitpid(pid, NULL, 0);
int status;
int thread = waitpid(-1, &status, 0);
assert(thread > 0 && thread != pid);
assert(status == 0x80137f);
return 0;
}
fails and triggers WARN_ON_ONCE(!signr) in do_jobctl_trap().
This is because task_join_group_stop() has 2 problems when current is traced:
1. We can't rely on the "JOBCTL_STOP_PENDING" check, a stopped tracee
can be woken up by debugger and it can clone another thread which
should join the group-stop.
We need to check group_stop_count || SIGNAL_STOP_STOPPED.
2. If SIGNAL_STOP_STOPPED is already set, we should not increment
sig->group_stop_count and add JOBCTL_STOP_CONSUME. The new thread
should stop without another do_notify_parent_cldstop() report.
To clarify, the problem is very old and we should blame
ptrace_init_task(). But now that we have task_join_group_stop() it makes
more sense to fix this helper to avoid the code duplication.
Reported-by: [email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: Zhiqiang Liu <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <[email protected]>
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Feilong Lin <[email protected]>
Cc: Shijie Luo <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Now that we have KASAN-KUNIT tests integration, it's easy to see that
some KASAN tests are not adopted to the SW_TAGS mode and are failing.
Adjust the allocation size for kasan_memchr() and kasan_memcmp() by
roung it up to OOB_TAG_OFF so the bad access ends up in a separate
memory granule.
Add a new kmalloc_uaf_16() tests that relies on UAF, and a new
kasan_bitops_tags() test that is tailored to tag-based mode, as it's
hard to adopt the existing kmalloc_oob_16() and kasan_bitops_generic()
(renamed from kasan_bitops()) without losing the precision.
Add new kmalloc_uaf_16() and kasan_bitops_uaf() tests that rely on UAFs,
as it's hard to adopt the existing kmalloc_oob_16() and
kasan_bitops_oob() (rename from kasan_bitops()) without losing the
precision.
Disable kasan_global_oob() and kasan_alloca_oob_left/right() as SW_TAGS
mode doesn't instrument globals nor dynamic allocas.
Signed-off-by: Andrey Konovalov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: David Gow <[email protected]>
Link: https://lkml.kernel.org/r/76eee17b6531ca8b3ca92b240cb2fd23204aaff7.1603129942.git.andreyknvl@google.com
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Richard reported a warning which can be reproduced by running the LTP
madvise6 test (cgroup v1 in the non-hierarchical mode should be used):
WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
Modules linked in:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014
Workqueue: events drain_local_stock
RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
Call Trace:
__memcg_kmem_uncharge (mm/memcontrol.c:3022)
drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114)
drain_local_stock (mm/memcontrol.c:2255)
process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274)
worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416)
kthread (kernel/kthread.c:292)
ret_from_fork (arch/x86/entry/entry_64.S:300)
The problem occurs because in the non-hierarchical mode non-root page
counters are not linked to root page counters, so the charge is not
propagated to the root memory cgroup.
After the removal of the original memory cgroup and reparenting of the
object cgroup, the root cgroup might be uncharged by draining a objcg
stock, for example. It leads to an eventual underflow of the charge and
triggers a warning.
Fix it by linking all page counters to corresponding root page counters
in the non-hierarchical mode.
Please note, that in the non-hierarchical mode all objcgs are always
reparented to the root memory cgroup, even if the hierarchy has more
than 1 level. This patch doesn't change it.
The patch also doesn't affect how the hierarchical mode is working,
which is the only sane and truly supported mode now.
Thanks to Richard for reporting, debugging and providing an alternative
version of the fix!
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Reported-by: <[email protected]>
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Reviewed-by: Michal Koutný <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Debugged-by: Richard Palethorpe <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
memcg_page_state will get the specified number in hierarchical memcg, It
should multiply by HPAGE_PMD_NR rather than an page if the item is
NR_ANON_THPS.
[[email protected]: fix printk warning]
[[email protected]: use u64 cast, per Michal]
Fixes: 468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter")
Signed-off-by: zhongjiang-ali <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: https://lkml.kernel.org/r/1603722395-72443-1-git-send-email-zhongjiang-ali@linux.alibaba.com
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Michal Privoznik was using "free page reporting" in QEMU/virtio-balloon
with hugetlbfs and hit the warning below. QEMU with free page hinting
uses fallocate(FALLOC_FL_PUNCH_HOLE) to discard pages that are reported
as free by a VM. The reporting granularity is in pageblock granularity.
So when the guest reports 2M chunks, we fallocate(FALLOC_FL_PUNCH_HOLE)
one huge page in QEMU.
WARNING: CPU: 7 PID: 6636 at mm/page_counter.c:57 page_counter_uncharge+0x4b/0x50
Modules linked in: ...
CPU: 7 PID: 6636 Comm: qemu-system-x86 Not tainted 5.9.0 #137
Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO/X570 AORUS PRO, BIOS F21 07/31/2020
RIP: 0010:page_counter_uncharge+0x4b/0x50
...
Call Trace:
hugetlb_cgroup_uncharge_file_region+0x4b/0x80
region_del+0x1d3/0x300
hugetlb_unreserve_pages+0x39/0xb0
remove_inode_hugepages+0x1a8/0x3d0
hugetlbfs_fallocate+0x3c4/0x5c0
vfs_fallocate+0x146/0x290
__x64_sys_fallocate+0x3e/0x70
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Investigation of the issue uncovered bugs in hugetlb cgroup reservation
accounting. This patch addresses the found issues.
Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings")
Reported-by: Michal Privoznik <[email protected]>
Co-developed-by: David Hildenbrand <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Mike Kravetz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Michal Privoznik <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Cc: <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Tejun Heo <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
commit 6f42193fd86e ("memremap: don't use a separate devm action for
devmap_managed_enable_get") changed the static key updates such that we
now call devmap_managed_enable_put() without doing the equivalent
devmap_managed_enable_get().
devmap_managed_enable_get() is only called for MEMORY_DEVICE_PRIVATE and
MEMORY_DEVICE_FS_DAX, But memunmap_pages() get called for other pgmap
types too. This results in the below warning when switching between
system-ram and devdax mode for devdax namespace.
jump label: negative count!
WARNING: CPU: 52 PID: 1335 at kernel/jump_label.c:235 static_key_slow_try_dec+0x88/0xa0
Modules linked in:
....
NIP static_key_slow_try_dec+0x88/0xa0
LR static_key_slow_try_dec+0x84/0xa0
Call Trace:
static_key_slow_try_dec+0x84/0xa0
__static_key_slow_dec_cpuslocked+0x34/0xd0
static_key_slow_dec+0x54/0xf0
memunmap_pages+0x36c/0x500
devm_action_release+0x30/0x50
release_nodes+0x2f4/0x3e0
device_release_driver_internal+0x17c/0x280
bus_remove_device+0x124/0x210
device_del+0x1d4/0x530
unregister_dev_dax+0x48/0xe0
devm_action_release+0x30/0x50
release_nodes+0x2f4/0x3e0
device_release_driver_internal+0x17c/0x280
unbind_store+0x130/0x170
drv_attr_store+0x40/0x60
sysfs_kf_write+0x6c/0xb0
kernfs_fop_write+0x118/0x280
vfs_write+0xe8/0x2a0
ksys_write+0x84/0x140
system_call_exception+0x120/0x270
system_call_common+0xf0/0x27c
Reported-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
ARC HSDK platform stopped booting on released v5.10-rc1, getting stuck
in startup of non master SMP cores.
This was bisected to upstream commit 7fef431be9c9ac25
"(mm/page_alloc: place pages to tail in __free_pages_core())"
That commit itself is harmless, it just exposed a subtle assumption in
our platform code (hence CC'ing linux-mm just as FYI in case some other
arches / platforms trip on it).
The upstream commit is semantically disruptive as it reverses the order
of page allocations (actually it can be good test for hardware
verification to exercise different memory patterns altogether).
For ARC HSDK platform that meant a remapped memory region (pertaining to
unused Closely Coupled Memory) started getting used early for dynamice
allocations, while not effectively remapped on all the cores, triggering
memory error exception on those cores.
The fix is to move the CCM remapping from early platform code to to early core
boot code. And while it is undesirable to riddle common boot code with
platform quirks, there is no other way to do this since the faltering code
involves setting up stack itself so even function calls are not allowed at
that point.
If anyone is interested, all the gory details can be found at Link below.
Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/32
Cc: David Hildenbrand <[email protected]>
Cc: [email protected]
Signed-off-by: Vineet Gupta <[email protected]>
|
|
Currently stack unwinder is a while(1) loop which relies on the dwarf
unwinder to signal termination, which in turn relies on dwarf info to do
so. This in theory could cause an infinite loop if the dwarf info was
somehow messed up or the register contents were etc.
This fix thus detects the excessive looping and breaks the loop.
| Mem: 26184K used, 1009136K free, 0K shrd, 0K buff, 14416K cached
| CPU: 0.0% usr 72.8% sys 0.0% nic 27.1% idle 0.0% io 0.0% irq 0.0% sirq
| Load average: 4.33 2.60 1.11 2/74 139
| PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
| 133 2 root SWN 0 0.0 3 22.9 [rcu_torture_rea]
| 132 2 root SWN 0 0.0 0 22.0 [rcu_torture_rea]
| 131 2 root SWN 0 0.0 3 21.5 [rcu_torture_rea]
| 126 2 root RW 0 0.0 2 5.4 [rcu_torture_wri]
| 129 2 root SWN 0 0.0 0 0.2 [rcu_torture_fak]
| 137 2 root SW 0 0.0 0 0.2 [rcu_torture_cbf]
| 127 2 root SWN 0 0.0 0 0.1 [rcu_torture_fak]
| 138 115 root R 1464 0.1 2 0.1 top
| 130 2 root SWN 0 0.0 0 0.1 [rcu_torture_fak]
| 128 2 root SWN 0 0.0 0 0.1 [rcu_torture_fak]
| 115 1 root S 1472 0.1 1 0.0 -/bin/sh
| 104 1 root S 1464 0.1 0 0.0 inetd
| 1 0 root S 1456 0.1 2 0.0 init
| 78 1 root S 1456 0.1 0 0.0 syslogd -O /var/log/messages
| 134 2 root SW 0 0.0 2 0.0 [rcu_torture_sta]
| 10 2 root IW 0 0.0 1 0.0 [rcu_preempt]
| 88 2 root IW 0 0.0 1 0.0 [kworker/1:1-eve]
| 66 2 root IW 0 0.0 2 0.0 [kworker/2:2-eve]
| 39 2 root IW 0 0.0 2 0.0 [kworker/2:1-eve]
| unwinder looping too long, aborting !
Cc: <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
Failure in srpt_refresh_port() for the second port will leave MAD
registered for the first one, however, the srpt_add_one() will be marked
as "failed" and SRPT will leak resources for that registered but not used
and released first port.
Unregister the MAD agent for all ports in case of failure.
Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Maor Gottlieb <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Camelia Groza says:
====================
dpaa_eth: buffer layout fixes
The patches are related to the software workaround for the A050385 erratum.
The first patch ensures optimal buffer usage for non-erratum scenarios. The
second patch fixes a currently inconsequential discrepancy between the
FMan and Ethernet drivers.
Changes in v3:
- refactor defines for clarity in 1/2
- add more details on the user impact in 1/2
- remove unnecessary inline identifier in 2/2
Changes in v2:
- make the returned value for TX ports explicit in 2/2
- simplify the buf_layout reference in 2/2
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The headroom reserved for received frames needs to be aligned to an
RX specific value. There is currently a discrepancy between the values
used in the Ethernet driver and the values passed to the FMan.
Coincidentally, the resulting aligned values are identical.
Fixes: 3c68b8fffb48 ("dpaa_eth: FMan erratum A050385 workaround")
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: Camelia Groza <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Impose a larger RX private data area only when the A050385 erratum is
present on the hardware. A smaller buffer size is sufficient in all
other scenarios. This enables a wider range of linear Jumbo frame
sizes in non-erratum scenarios, instead of turning to multi
buffer Scatter/Gather frames. The maximum linear frame size is
increased by 128 bytes for non-erratum arm64 platforms.
Cleanup the hardware annotations header defines in the process.
Fixes: 3c68b8fffb48 ("dpaa_eth: FMan erratum A050385 workaround")
Signed-off-by: Camelia Groza <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The commit f959dcd6ddfd ("dma-direct: Fix potential NULL pointer
dereference") made dma_mask as mandetory field to be setup even for
dma_virt_ops based dma devices. The commit in the fixes tag omitted
setting up the dma_mask on virtual devices triggering the below trace when
they were combined during the merge window.
Fix it by setting empty DMA MASK for software based RDMA devices.
WARNING: CPU: 1 PID: 8488 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x493/0x700
CPU: 1 PID: 8488 Comm: syz-executor144 Not tainted 5.9.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:dma_map_page_attrs+0x493/0x700 kernel/dma/mapping.c:149
Trace:
dma_map_single_attrs include/linux/dma-mapping.h:279 [inline]
ib_dma_map_single include/rdma/ib_verbs.h:3967 [inline]
ib_mad_post_receive_mads+0x23f/0xd60 drivers/infiniband/core/mad.c:2715
ib_mad_port_start drivers/infiniband/core/mad.c:2862 [inline]
ib_mad_port_open drivers/infiniband/core/mad.c:3016 [inline]
ib_mad_init_device+0x72b/0x1400 drivers/infiniband/core/mad.c:3092
add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:680
enable_device_and_get+0x1d5/0x3c0 drivers/infiniband/core/device.c:1301
ib_register_device drivers/infiniband/core/device.c:1376 [inline]
ib_register_device+0x7a7/0xa40 drivers/infiniband/core/device.c:1335
rxe_register_device+0x46d/0x570 drivers/infiniband/sw/rxe/rxe_verbs.c:1182
rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247
rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:507
rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline]
rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250
nldev_newlink+0x30e/0x540 drivers/infiniband/core/nldev.c:1555
rdma_nl_rcv_msg+0x367/0x690 drivers/infiniband/core/netlink.c:195
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x2f2/0x440 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:671
____sys_sendmsg+0x6e8/0x810 net/socket.c:2353
___sys_sendmsg+0xf3/0x170 net/socket.c:2407
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2440
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x443699
Link: https://lore.kernel.org/r/[email protected]
Reported-by: [email protected]
Fixes: e0477b34d9d1 ("RDMA: Explicitly pass in the dma_device to ib_register_device")
Signed-off-by: Parav Pandit <[email protected]>
Tested-by: Guoqing Jiang <[email protected]>
Tested-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Zhu Yanjun <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A couple of fixes, for
* HE on 2.4 GHz
* a few issues syzbot found, but we have many more reports :-(
* a regression in nl80211-transported EAPOL frames which had
affected a number of users, from Mathy
* kernel-doc markings in mac80211, from Mauro
* a format argument in reg.c, from Ye Bin
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There is no need to specify a "ULL" suffix for "all bits set": "~0" is
sufficient, and works regardless of type. In fact adding the suffix
makes the code more fragile.
Fixes: 48ab6d5d1f09 ("dma-mapping: fix 32-bit overflow with CONFIG_ARM_LPAE=n")
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Since the device is resumed from runtime-suspend in
__device_release_driver() anyway, it is better to do that before
looking for busy managed device links from it to consumers, because
if there are any, device_links_unbind_consumers() will be called
and it will cause the consumer devices' drivers to unbind, so the
consumer devices will be runtime-resumed. In turn, resuming each
consumer device will cause the supplier to be resumed and when the
runtime PM references from the given consumer to it are dropped, it
may be suspended. Then, the runtime-resume of the next consumer
will cause the supplier to resume again and so on.
Update the code accordingly.
Signed-off-by: Rafael J. Wysocki <[email protected]>
Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")
Cc: All applicable <[email protected]> # All applicable
Tested-by: Xiang Chen <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
|
|
After commit d12544fb2aa9 ("PM: runtime: Remove link state checks in
rpm_get/put_supplier()") nothing prevents the consumer device's
runtime PM from acquiring additional references to the supplier
device after pm_runtime_clean_up_links() has run (or even while it
is running), so calling this function from __device_release_driver()
may be pointless (or even harmful).
Moreover, it ignores stateless device links, so the runtime PM
handling of managed and stateless device links is inconsistent
because of it, so better get rid of it entirely.
Fixes: d12544fb2aa9 ("PM: runtime: Remove link state checks in rpm_get/put_supplier()")
Signed-off-by: Rafael J. Wysocki <[email protected]>
Cc: 5.1+ <[email protected]> # 5.1+
Tested-by: Xiang Chen <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
|
|
While removing a device link, drop the supplier device's runtime PM
usage counter as many times as needed to drop all of the runtime PM
references to it from the consumer in addition to dropping the
consumer's link count.
Fixes: baa8809f6097 ("PM / runtime: Optimize the use of device links")
Signed-off-by: Rafael J. Wysocki <[email protected]>
Cc: 5.1+ <[email protected]> # 5.1+
Tested-by: Xiang Chen <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
|
|
A semicolon is not needed after a switch statement.
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
cpu/ is needed before cpu<N>/
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
cerainly -> certainly
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
The cpufreq policy's frequency limits (min/max) can get changed at any
point of time, while schedutil is trying to update the next frequency.
Though the schedutil governor has necessary locking and support in place
to make sure we don't miss any of those updates, there is a corner case
where the governor will find that the CPU is already running at the
desired frequency and so may skip an update.
For example, consider that the CPU can run at 1 GHz, 1.2 GHz and 1.4 GHz
and is running at 1 GHz currently. Schedutil tries to update the
frequency to 1.2 GHz, during this time the policy limits get changed as
policy->min = 1.4 GHz. As schedutil (and cpufreq core) does clamp the
frequency at various instances, we will eventually set the frequency to
1.4 GHz, while we will save 1.2 GHz in sg_policy->next_freq.
Now lets say the policy limits get changed back at this time with
policy->min as 1 GHz. The next time schedutil is invoked by the
scheduler, we will reevaluate the next frequency (because
need_freq_update will get set due to limits change event) and lets say
we want to set the frequency to 1.2 GHz again. At this point
sugov_update_next_freq() will find the next_freq == current_freq and
will abort the update, while the CPU actually runs at 1.4 GHz.
Until now need_freq_update was used as a flag to indicate that the
policy's frequency limits have changed, and that we should consider the
new limits while reevaluating the next frequency.
This patch fixes the above mentioned issue by extending the purpose of
the need_freq_update flag. If this flag is set now, the schedutil
governor will not try to abort a frequency change even if next_freq ==
current_freq.
As similar behavior is required in the case of
CPUFREQ_NEED_UPDATE_LIMITS flag as well, need_freq_update will never be
set to false if that flag is set for the driver.
We also don't need to consider the need_freq_update flag in
sugov_update_single() anymore to handle the special case of busy CPU, as
we won't abort a frequency update anymore.
Reported-by: zhuguangqing <[email protected]>
Suggested-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
[ rjw: Rearrange code to avoid a branch ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
The array size is FTRACE_KSTACK_NESTING, so the index FTRACE_KSTACK_NESTING
is illegal too. And fix two typos by the way.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Qiujun Huang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
When an interrupt or NMI comes in and switches the context, there's a delay
from when the preempt_count() shows the update. As the preempt_count() is
used to detect recursion having each context have its own bit get set when
tracing starts, and if that bit is already set, it is considered a recursion
and the function exits. But if this happens in that section where context
has changed but preempt_count() has not been updated, this will be
incorrectly flagged as a recursion.
To handle this case, create another bit call TRANSITION and test it if the
current context bit is already set. Flag the call as a recursion if the
TRANSITION bit is already set, and if not, set it and continue. The
TRANSITION bit will be cleared normally on the return of the function that
set it, or if the current context bit is clear, set it and clear the
TRANSITION bit to allow for another transition between the current context
and an even higher one.
Cc: [email protected]
Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
The code that checks recursion will work to only do the recursion check once
if there's nested checks. The top one will do the check, the other nested
checks will see recursion was already checked and return zero for its "bit".
On the return side, nothing will be done if the "bit" is zero.
The problem is that zero is returned for the "good" bit when in NMI context.
This will set the bit for NMIs making it look like *all* NMI tracing is
recursing, and prevent tracing of anything in NMI context!
The simple fix is to return "bit + 1" and subtract that bit on the end to
get the real bit.
Cc: [email protected]
Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
The nesting count of trace_printk allows for 4 levels of nesting. The
nesting counter starts at zero and is incremented before being used to
retrieve the current context's buffer. But the index to the buffer uses the
nesting counter after it was incremented, and not its original number,
which in needs to do.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 3d9622c12c887 ("tracing: Add barrier to trace_printk() buffer nesting modification")
Signed-off-by: Qiujun Huang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
In order to make the vc4_kms_load code and error path a bit easier to
read and extend, add functions to create state objects, and use managed
actions to cleanup if needed.
Signed-off-by: Maxime Ripard <[email protected]>
Acked-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We can simplify a bit the bind code, its error path and unbind by using
the managed devm_drm_dev_alloc function.
Acked-by: Daniel Vetter <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We have a helper to retrieve the vc4_dev structure from the drm_device one
when needed, so let's use it consistently.
Acked-by: Daniel Vetter <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The job queue needs to be cleaned up using vc4_gem_destroy, but it's
not used consistently (vc4_drv's bind calls it in its error path, but
doesn't in unbind), and we can make that automatic through a managed
action. Let's remove the requirement to call vc4_gem_destroy.
Fixes: d5b1a78a772f ("drm/vc4: Add support for drawing 3D frames.")
Acked-by: Daniel Vetter <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Using drmm_mode_config_init instead of drm_mode_config_init allows us to
cleanup a bit the error path.
Signed-off-by: Maxime Ripard <[email protected]>
Acked-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The BO cache needs to be cleaned up using vc4_bo_cache_destroy, but it's
not used consistently (vc4_drv's bind calls it in its error path, but
doesn't in unbind), and we can make that automatic through a managed
action. Let's remove the requirement to call vc4_bo_cache_destroy.
Fixes: c826a6e10644 ("drm/vc4: Add a BO cache.")
Acked-by: Daniel Vetter <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Three fixes all related to #DB:
- Handle the BTF bit correctly so it doesn't get lost due to a kernel
#DB
- Only clear and set the virtual DR6 value used by ptrace on user
space triggered #DB. A kernel #DB must leave it alone to ensure
data consistency for ptrace.
- Make the bitmasking of the virtual DR6 storage correct so it does
not lose DR_STEP"
* tag 'x86-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/debug: Fix DR_STEP vs ptrace_get_debugreg(6)
x86/debug: Only clear/set ->virtual_dr6 for userspace #DB
x86/debug: Fix BTF handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A few fixes for timers/timekeeping:
- Prevent undefined behaviour in the timespec64_to_ns() conversion
which is used for converting user supplied time input to
nanoseconds. It lacked overflow protection.
- Mark sched_clock_read_begin/retry() to prevent recursion in the
tracer
- Remove unused debug functions in the hrtimer and timerlist code"
* tag 'timers-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Prevent undefined behaviour in timespec64_to_ns()
timers: Remove unused inline funtion debug_timer_free()
hrtimer: Remove unused inline function debug_hrtimer_free()
time/sched_clock: Mark sched_clock_read_begin/retry() as notrace
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp fix from Thomas Gleixner:
"A single fix for stop machine.
Mark functions no trace to prevent a crash caused by recursion when
enabling or disabling a tracer on RISC-V (probably all architectures
which patch through stop machine)"
* tag 'smp-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stop_machine, rcu: Mark functions as notrace
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A couple of locking fixes:
- Fix incorrect failure injection handling in the fuxtex code
- Prevent a preemption warning in lockdep when tracking
local_irq_enable() and interrupts are already enabled
- Remove more raw_cpu_read() usage from lockdep which causes state
corruption on !X86 architectures.
- Make the nr_unused_locks accounting in lockdep correct again"
* tag 'locking-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Fix nr_unused_locks accounting
locking/lockdep: Remove more raw_cpu_read() usage
futex: Fix incorrect should_fail_futex() handling
lockdep: Fix preemption WARN for spurious IRQ-enable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes/removals from Greg KH:
"Here's some small fixes for 5.10-rc2 and a big driver removal.
The fixes are for some reported issues in the interconnect and
coresight drivers, nothing major.
The "big" driver removal is the MIC drivers have been asked to be
removed as the hardware never shipped and Intel no longer wants to
maintain something that no one can use. This is welcomed by many as
the DMA usage of these drivers was "interesting" and the security
people were starting to question some issues that were starting to be
found in the codebase.
Note, one of the subsystems for this driver, the "VOP" code, will
probably come back in future kernel versions as it was looking to
potentially solve some PCIe virtualization issues that a number of
other vendors were wanting to solve. But as-is, this codebase didn't
work for anyone else so no actual functionality is being removed.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
coresight: cti: Initialize dynamic sysfs attributes
coresight: Fix uninitialised pointer bug in etm_setup_aux()
coresight: add module license
misc: mic: remove the MIC drivers
interconnect: qcom: use icc_sync state for sm8[12]50
interconnect: qcom: Ensure that the floor bandwidth value is enforced
interconnect: qcom: sc7180: Init BCMs before creating the nodes
interconnect: qcom: sdm845: Init BCMs before creating the nodes
interconnect: Aggregate before setting initial bandwidth
interconnect: qcom: sdm845: Enable keepalive for the MM1 BCM
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and documentation fixes from Greg KH:
"Here is one tiny debugfs change to fix up an API where the last user
was successfully fixed up in 5.10-rc1 (so it couldn't be merged
earlier), and a much larger Documentation/ABI/ update to the files so
they can be automatically parsed by our tools.
The Documentation/ABI/ updates are just formatting issues, small ones
to bring the files into parsable format, and have been acked by
numerous subsystem maintainers and the documentation maintainer. I
figured it was good to get this into 5.10-rc2 to help wih the merge
issues that would arise if these were to stick in linux-next until
5.11-rc1.
The debugfs change has been in linux-next for a long time, and the
Documentation updates only for the last linux-next release"
* tag 'driver-core-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (40 commits)
scripts: get_abi.pl: assume ReST format by default
docs: ABI: sysfs-class-led-trigger-pattern: remove hw_pattern duplication
docs: ABI: sysfs-class-backlight: unify ABI documentation
docs: ABI: sysfs-c2port: remove a duplicated entry
docs: ABI: sysfs-class-power: unify duplicated properties
docs: ABI: unify /sys/class/leds/<led>/brightness documentation
docs: ABI: stable: remove a duplicated documentation
docs: ABI: change read/write attributes
docs: ABI: cleanup several ABI documents
docs: ABI: sysfs-bus-nvdimm: use the right format for ABI
docs: ABI: vdso: use the right format for ABI
docs: ABI: fix syntax to be parsed using ReST notation
docs: ABI: convert testing/configfs-acpi to ReST
docs: Kconfig/Makefile: add a check for broken ABI files
docs: abi-testing.rst: enable --rst-sources when building docs
docs: ABI: don't escape ReST-incompatible chars from obsolete and removed
docs: ABI: create a 2-depth index for ABI
docs: ABI: make it parse ABI/stable as ReST-compatible files
docs: ABI: sysfs-uevent: make it compatible with ReST output
docs: ABI: testing: make the files compatible with ReST output
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for issues that have been
reported in 5.10-rc1:
- octeon driver fixes
- wfx driver fixes
- memory leak fix in vchiq driver
- fieldbus driver bugfix
- comedi driver bugfix
All of these have been in linux-next with no reported issues"
* tag 'staging-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: fieldbus: anybuss: jump to correct label in an error path
staging: wfx: fix test on return value of gpiod_get_value()
staging: wfx: fix use of uninitialized pointer
staging: mmal-vchiq: Fix memory leak for vchiq_instance
staging: comedi: cb_pcidas: Allow 2-channel commands for AO subdevice
staging: octeon: Drop on uncorrectable alignment or FCS error
staging: octeon: repair "fixed-link" support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small TTY and Serial driver fixes for reported issues
for 5.10-rc2. They include:
- vt ioctl bugfix for reported problems
- fsl_lpuart serial driver fix
- 21285 serial driver bugfix
All have been in linux-next with no reported issues"
* tag 'tty-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt_ioctl: fix GIO_UNIMAP regression
vt: keyboard, extend func_buf_lock to readers
vt: keyboard, simplify vt_kdgkbsent
tty: serial: fsl_lpuart: LS1021A has a FIFO size of 16 words, like LS1028A
tty: serial: 21285: fix lockup on open
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are a number of small bugfixes for reported issues in some USB
drivers. They include:
- typec bugfixes
- xhci bugfixes and lockdep warning fixes
- cdc-acm driver regression fix
- kernel doc fixes
- cdns3 driver bugfixes for a bunch of reported issues
- other tiny USB driver fixes
All have been in linux-next with no reported issues"
* tag 'usb-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: cdns3: gadget: own the lock wrongly at the suspend routine
usb: cdns3: Fix on-chip memory overflow issue
usb: cdns3: gadget: suspicious implicit sign extension
xhci: Don't create stream debugfs files with spinlock held.
usb: xhci: Workaround for S3 issue on AMD SNPS 3.0 xHC
xhci: Fix sizeof() mismatch
usb: typec: stusb160x: fix signedness comparison issue with enum variables
usb: typec: add missing MODULE_DEVICE_TABLE() to stusb160x
USB: apple-mfi-fastcharge: don't probe unhandled devices
usbcore: Check both id_table and match() when both available
usb: host: ehci-tegra: Fix error handling in tegra_ehci_probe()
usb: typec: stusb160x: fix an IS_ERR() vs NULL check in probe
usb: typec: tcpm: reset hard_reset_count for any disconnect
usb: cdc-acm: fix cooldown mechanism
usb: host: fsl-mph-dr-of: check return of dma_set_mask()
usb: fix kernel-doc markups
usb: typec: stusb160x: fix some signedness bugs
usb: cdns3: Variable 'length' set but not used
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- selftest fix
- force PTE mapping on device pages provided via VFIO
- fix detection of cacheable mapping at S2
- fallback to PMD/PTE mappings for composite huge pages
- fix accounting of Stage-2 PGD allocation
- fix AArch32 handling of some of the debug registers
- simplify host HYP entry
- fix stray pointer conversion on nVHE TLB invalidation
- fix initialization of the nVHE code
- simplify handling of capabilities exposed to HYP
- nuke VCPUs caught using a forbidden AArch32 EL0
x86:
- new nested virtualization selftest
- miscellaneous fixes
- make W=1 fixes
- reserve new CPUID bit in the KVM leaves"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: vmx: remove unused variable
KVM: selftests: Don't require THP to run tests
KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
KVM: selftests: test behavior of unmapped L2 APIC-access address
KVM: x86: Fix NULL dereference at kvm_msr_ignored_check()
KVM: x86: replace static const variables with macros
KVM: arm64: Handle Asymmetric AArch32 systems
arm64: cpufeature: upgrade hyp caps to final
arm64: cpufeature: reorder cpus_have_{const, final}_cap()
KVM: arm64: Factor out is_{vhe,nvhe}_hyp_code()
KVM: arm64: Force PTE mapping on fault resulting in a device mapping
KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
KVM: arm64: Fix masks in stage2_pte_cacheable()
KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition
KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation
KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call
x86/kvm: Reserve KVM_FEATURE_MSI_EXT_DEST_ID
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Incorrect netlink report logic in flowtable and genID.
2) Add a selftest to check that wireguard passes the right sk
to ip_route_me_harder, from Jason A. Donenfeld.
3) Pass the actual sk to ip_route_me_harder(), also from Jason.
4) Missing expression validation of updates via nft --check.
5) Update byte and packet counters regardless of whether they
match, from Stefano Brivio.
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The tunnel device such as vxlan, bareudp and geneve in the lwt mode set
the outer df only based TUNNEL_DONT_FRAGMENT.
And this was also the behavior for gre device before switching to use
ip_md_tunnel_xmit in commit 962924fa2b7a ("ip_gre: Refactor collect
metatdata mode tunnel xmit to ip_md_tunnel_xmit")
When the ip_gre in lwt mode xmit with ip_md_tunnel_xmi changed the rule and
make the discrepancy between handling of DF by different tunnels. So in the
ip_md_tunnel_xmit should follow the same rule like other tunnels.
Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Signed-off-by: wenxu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In my test setup, I had a SAMA5D27 device configured with ip forwarding, and
second device with usb ethernet (r8152) sending ICMP packets. If the packet
was larger than about 220 bytes, the SAMA5 device would "oops" with the
following trace:
kernel BUG at net/core/skbuff.c:1863!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in: xt_MASQUERADE ppp_async ppp_generic slhc iptable_nat xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 can_raw can bridge stp llc ipt_REJECT nf_reject_ipv4 sd_mod cdc_ether usbnet usb_storage r8152 scsi_mod mii o
ption usb_wwan usbserial micrel macb at91_sama5d2_adc phylink gpio_sama5d2_piobu m_can_platform m_can industrialio_triggered_buffer kfifo_buf of_mdio can_dev fixed_phy sdhci_of_at91 sdhci_pltfm libphy sdhci mmc_core ohci_at91 ehci_atmel o
hci_hcd iio_rescale industrialio sch_fq_codel spidev prox2_hal(O)
CPU: 0 PID: 0 Comm: swapper Tainted: G O 5.9.1-prox2+ #1
Hardware name: Atmel SAMA5
PC is at skb_put+0x3c/0x50
LR is at macb_start_xmit+0x134/0xad0 [macb]
pc : [<c05258cc>] lr : [<bf0ea5b8>] psr: 20070113
sp : c0d01a60 ip : c07232c0 fp : c4250000
r10: c0d03cc8 r9 : 00000000 r8 : c0d038c0
r7 : 00000000 r6 : 00000008 r5 : c59b66c0 r4 : 0000002a
r3 : 8f659eff r2 : c59e9eea r1 : 00000001 r0 : c59b66c0
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c53c7d Table: 2640c059 DAC: 00000051
Process swapper (pid: 0, stack limit = 0x75002d81)
<snipped stack>
[<c05258cc>] (skb_put) from [<bf0ea5b8>] (macb_start_xmit+0x134/0xad0 [macb])
[<bf0ea5b8>] (macb_start_xmit [macb]) from [<c053e504>] (dev_hard_start_xmit+0x90/0x11c)
[<c053e504>] (dev_hard_start_xmit) from [<c0571180>] (sch_direct_xmit+0x124/0x260)
[<c0571180>] (sch_direct_xmit) from [<c053eae4>] (__dev_queue_xmit+0x4b0/0x6d0)
[<c053eae4>] (__dev_queue_xmit) from [<c05a5650>] (ip_finish_output2+0x350/0x580)
[<c05a5650>] (ip_finish_output2) from [<c05a7e24>] (ip_output+0xb4/0x13c)
[<c05a7e24>] (ip_output) from [<c05a39d0>] (ip_forward+0x474/0x500)
[<c05a39d0>] (ip_forward) from [<c05a13d8>] (ip_sublist_rcv_finish+0x3c/0x50)
[<c05a13d8>] (ip_sublist_rcv_finish) from [<c05a19b8>] (ip_sublist_rcv+0x11c/0x188)
[<c05a19b8>] (ip_sublist_rcv) from [<c05a2494>] (ip_list_rcv+0xf8/0x124)
[<c05a2494>] (ip_list_rcv) from [<c05403c4>] (__netif_receive_skb_list_core+0x1a0/0x20c)
[<c05403c4>] (__netif_receive_skb_list_core) from [<c05405c4>] (netif_receive_skb_list_internal+0x194/0x230)
[<c05405c4>] (netif_receive_skb_list_internal) from [<c0540684>] (gro_normal_list.part.0+0x14/0x28)
[<c0540684>] (gro_normal_list.part.0) from [<c0541280>] (napi_complete_done+0x16c/0x210)
[<c0541280>] (napi_complete_done) from [<bf14c1c0>] (r8152_poll+0x684/0x708 [r8152])
[<bf14c1c0>] (r8152_poll [r8152]) from [<c0541424>] (net_rx_action+0x100/0x328)
[<c0541424>] (net_rx_action) from [<c01012ec>] (__do_softirq+0xec/0x274)
[<c01012ec>] (__do_softirq) from [<c012d6d4>] (irq_exit+0xcc/0xd0)
[<c012d6d4>] (irq_exit) from [<c0160960>] (__handle_domain_irq+0x58/0xa4)
[<c0160960>] (__handle_domain_irq) from [<c0100b0c>] (__irq_svc+0x6c/0x90)
Exception stack(0xc0d01ef0 to 0xc0d01f38)
1ee0: 00000000 0000003d 0c31f383 c0d0fa00
1f00: c0d2eb80 00000000 c0d2e630 4dad8c49 4da967b0 0000003d 0000003d 00000000
1f20: fffffff5 c0d01f40 c04e0f88 c04e0f8c 30070013 ffffffff
[<c0100b0c>] (__irq_svc) from [<c04e0f8c>] (cpuidle_enter_state+0x7c/0x378)
[<c04e0f8c>] (cpuidle_enter_state) from [<c04e12c4>] (cpuidle_enter+0x28/0x38)
[<c04e12c4>] (cpuidle_enter) from [<c014f710>] (do_idle+0x194/0x214)
[<c014f710>] (do_idle) from [<c014fa50>] (cpu_startup_entry+0xc/0x14)
[<c014fa50>] (cpu_startup_entry) from [<c0a00dc8>] (start_kernel+0x46c/0x4a0)
Code: e580c054 8a000002 e1a00002 e8bd8070 (e7f001f2)
---[ end trace 146c8a334115490c ]---
The solution was to force nonlinear buffers to be cloned. This was previously
reported by Klaus Doth (https://www.spinics.net/lists/netdev/msg556937.html)
but never formally submitted as a patch.
This is the third revision, hopefully the formatting is correct this time!
Suggested-by: Klaus Doth <[email protected]>
Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation")
Signed-off-by: Mark Deneen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Pull vhost fixes from Michael Tsirkin:
"Fixes all over the place.
A new UAPI is borderline: can also be considered a new feature but
also seems to be the only way we could come up with to fix addressing
for userspace - and it seems important to switch to it now before
userspace making assumptions about addressing ability of devices is
set in stone"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpasim: allow to assign a MAC address
vdpasim: fix MAC address configuration
vdpa: handle irq bypass register failure case
vdpa_sim: Fix DMA mask
Revert "vhost-vdpa: fix page pinning leakage in error path"
vdpa/mlx5: Fix error return in map_direct_mr()
vhost_vdpa: Return -EFAULT if copy_from_user() fails
vdpa_sim: implement get_iova_range()
vhost: vdpa: report iova range
vdpa: introduce config op to get valid iova range
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull more flexible-array member conversions from Gustavo A. R. Silva:
"Replace zero-length arrays with flexible-array members"
* tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
printk: ringbuffer: Replace zero-length array with flexible-array member
net/smc: Replace zero-length array with flexible-array member
net/mlx5: Replace zero-length array with flexible-array member
mei: hw: Replace zero-length array with flexible-array member
gve: Replace zero-length array with flexible-array member
Bluetooth: btintel: Replace zero-length array with flexible-array member
scsi: target: tcmu: Replace zero-length array with flexible-array member
ima: Replace zero-length array with flexible-array member
enetc: Replace zero-length array with flexible-array member
fs: Replace zero-length array with flexible-array member
Bluetooth: Replace zero-length array with flexible-array member
params: Replace zero-length array with flexible-array member
tracepoint: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member
mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member
dmaengine: ti-cppi5: Replace zero-length array with flexible-array member
|
|
Hangbin Liu says:
====================
IPv6: reply ICMP error if fragment doesn't contain all headers
When our Engineer run latest IPv6 Core Conformance test, test v6LC.1.3.6:
First Fragment Doesn’t Contain All Headers[1] failed. The test purpose is to
verify that the node (Linux for example) should properly process IPv6 packets
that don’t include all the headers through the Upper-Layer header.
Based on RFC 8200, Section 4.5 Fragment Header
- If the first fragment does not include all headers through an
Upper-Layer header, then that fragment should be discarded and
an ICMP Parameter Problem, Code 3, message should be sent to
the source of the fragment, with the Pointer field set to zero.
The first patch add a definition for ICMPv6 Parameter Problem, code 3.
The second patch add a check for the 1st fragment packet to make sure
Upper-Layer header exist.
[1] Page 68, v6LC.1.3.6: First Fragment Doesn’t Contain All Headers part A, B,
C and D at https://ipv6ready.org/docs/Core_Conformance_5_0_0.pdf
[2] My reproducer:
import sys, os
from scapy.all import *
def send_frag_dst_opt(src_ip6, dst_ip6):
ip6 = IPv6(src = src_ip6, dst = dst_ip6, nh = 44)
frag_1 = IPv6ExtHdrFragment(nh = 60, m = 1)
dst_opt = IPv6ExtHdrDestOpt(nh = 58)
frag_2 = IPv6ExtHdrFragment(nh = 58, offset = 4, m = 1)
icmp_echo = ICMPv6EchoRequest(seq = 1)
pkt_1 = ip6/frag_1/dst_opt
pkt_2 = ip6/frag_2/icmp_echo
send(pkt_1)
send(pkt_2)
def send_frag_route_opt(src_ip6, dst_ip6):
ip6 = IPv6(src = src_ip6, dst = dst_ip6, nh = 44)
frag_1 = IPv6ExtHdrFragment(nh = 43, m = 1)
route_opt = IPv6ExtHdrRouting(nh = 58)
frag_2 = IPv6ExtHdrFragment(nh = 58, offset = 4, m = 1)
icmp_echo = ICMPv6EchoRequest(seq = 2)
pkt_1 = ip6/frag_1/route_opt
pkt_2 = ip6/frag_2/icmp_echo
send(pkt_1)
send(pkt_2)
if __name__ == '__main__':
src = sys.argv[1]
dst = sys.argv[2]
conf.iface = sys.argv[3]
send_frag_dst_opt(src, dst)
send_frag_route_opt(src, dst)
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|