Age | Commit message (Collapse) | Author | Files | Lines |
|
Let's move the clearing to balloon_retrieve(). In
bp_state increase_reservation(), we now clear the flag a little earlier
than before, however, this should not matter for XEN.
Suggested-by: Boris Ostrovsky <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
|
|
Let's move the __SetPageOffline() call which all callers perform into
balloon_append().
In bp_state decrease_reservation(), pages are now marked PG_offline a
little later than before, however, this should not matter for XEN.
Suggested-by: Boris Ostrovsky <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
|
|
Let's simply use balloon_append() directly.
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
|
|
We are missing a __SetPageOffline(), which is why we can get
!PageOffline() pages onto the balloon list, where
alloc_xenballooned_pages() will complain:
page:ffffea0003e7ffc0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0xffffe00001000(reserved)
raw: 000ffffe00001000 dead000000000100 dead000000000200 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(!PageOffline(page))
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:744!
invalid opcode: 0000 [#1] SMP NOPTI
Reported-by: Marek Marczykowski-Górecki <[email protected]>
Tested-by: Marek Marczykowski-Górecki <[email protected]>
Fixes: 77c4adf6a6df ("xen/balloon: mark inflated pages PG_offline")
Cc: [email protected] # v5.1+
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
|
|
HYPERVISOR_platform_op() is an inline function and should not
be exported. Since commit 15bfc2348d54 ("modpost: check for
static EXPORT_SYMBOL* functions"), this causes a warning:
WARNING: "HYPERVISOR_platform_op" [vmlinux] is a static EXPORT_SYMBOL_GPL
Instead, export the underlying function called by the static inline:
HYPERVISOR_platform_op_raw.
Fixes: 15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL* functions")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>
|
|
Commit a745f7af3cbd ("selftests/harness: Add 30 second timeout per
test") solves the problem of kselftest_harness.h-using binary tests
possibly hanging forever. However, scripts and other binaries can still
hang forever. This adds a global timeout to each test script run.
To make this configurable (e.g. as needed in the "rtc" test case),
include a new per-test-directory "settings" file (similar to "config")
that can contain kselftest-specific settings. The first recognized field
is "timeout".
Additionally, this splits the reporting for timeouts into a specific
"TIMEOUT" not-ok (and adds exit code reporting in the remaining case).
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
A TARGET which failed to be built/installed should not be included in the
runlist generated inside the run_kselftest.sh script.
Signed-off-by: Cristian Marussi <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
The following commit:
227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load")
got fat fingered by me when merging it with other patches. It meant to move
the RCU section out of the for loop but ended up doing it partially, leaving
a superfluous rcu_read_lock() inside, causing havok.
Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Let the user specify an optional TARGETS skiplist through the new optional
SKIP_TARGETS Makefile variable.
It is easier to skip at will using a reduced and well defined list of
possibly problematic targets with SKIP_TARGETS than to provide a partially
stripped down list of good targets using the usual TARGETS variable.
Signed-off-by: Cristian Marussi <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
If CONFIG_PM_SLEEP is not set, we can comment out these functions to avoid
the below warnings:
drivers/hv/vmbus_drv.c:2208:12: warning: ‘vmbus_bus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:2128:12: warning: ‘vmbus_bus_suspend’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:937:12: warning: ‘vmbus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:918:12: warning: ‘vmbus_suspend’ defined but not used [-Wunused-function]
Fixes: 271b2224d42f ("Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation")
Fixes: f53335e3289f ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Dexuan Cui <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
|
|
Simplify the ring buffer handling with the in-place API.
Also avoid the dynamic allocation and the memory leak in the channel
callback function.
Signed-off-by: Dexuan Cui <[email protected]>
Acked-by: Jiri Kosina <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
|
|
A user reported a lockdep splat
======================================================
WARNING: possible circular locking dependency detected
5.2.11-gentoo #2 Not tainted
------------------------------------------------------
kswapd0/711 is trying to acquire lock:
000000007777a663 (sb_internal){.+.+}, at: start_transaction+0x3a8/0x500
but task is already holding lock:
000000000ba86300 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}:
kmem_cache_alloc+0x1f/0x1c0
btrfs_alloc_inode+0x1f/0x260
alloc_inode+0x16/0xa0
new_inode+0xe/0xb0
btrfs_new_inode+0x70/0x610
btrfs_symlink+0xd0/0x420
vfs_symlink+0x9c/0x100
do_symlinkat+0x66/0xe0
do_syscall_64+0x55/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (sb_internal){.+.+}:
__sb_start_write+0xf6/0x150
start_transaction+0x3a8/0x500
btrfs_commit_inode_delayed_inode+0x59/0x110
btrfs_evict_inode+0x19e/0x4c0
evict+0xbc/0x1f0
inode_lru_isolate+0x113/0x190
__list_lru_walk_one.isra.4+0x5c/0x100
list_lru_walk_one+0x32/0x50
prune_icache_sb+0x36/0x80
super_cache_scan+0x14a/0x1d0
do_shrink_slab+0x131/0x320
shrink_node+0xf7/0x380
balance_pgdat+0x2d5/0x640
kswapd+0x2ba/0x5e0
kthread+0x147/0x160
ret_from_fork+0x24/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(sb_internal);
lock(fs_reclaim);
lock(sb_internal);
*** DEADLOCK ***
3 locks held by kswapd0/711:
#0: 000000000ba86300 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
#1: 000000004a5100f8 (shrinker_rwsem){++++}, at: shrink_node+0x9a/0x380
#2: 00000000f956fa46 (&type->s_umount_key#30){++++}, at: super_cache_scan+0x35/0x1d0
stack backtrace:
CPU: 7 PID: 711 Comm: kswapd0 Not tainted 5.2.11-gentoo #2
Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.4.2 09/29/2017
Call Trace:
dump_stack+0x85/0xc7
print_circular_bug.cold.40+0x1d9/0x235
__lock_acquire+0x18b1/0x1f00
lock_acquire+0xa6/0x170
? start_transaction+0x3a8/0x500
__sb_start_write+0xf6/0x150
? start_transaction+0x3a8/0x500
start_transaction+0x3a8/0x500
btrfs_commit_inode_delayed_inode+0x59/0x110
btrfs_evict_inode+0x19e/0x4c0
? var_wake_function+0x20/0x20
evict+0xbc/0x1f0
inode_lru_isolate+0x113/0x190
? discard_new_inode+0xc0/0xc0
__list_lru_walk_one.isra.4+0x5c/0x100
? discard_new_inode+0xc0/0xc0
list_lru_walk_one+0x32/0x50
prune_icache_sb+0x36/0x80
super_cache_scan+0x14a/0x1d0
do_shrink_slab+0x131/0x320
shrink_node+0xf7/0x380
balance_pgdat+0x2d5/0x640
kswapd+0x2ba/0x5e0
? __wake_up_common_lock+0x90/0x90
kthread+0x147/0x160
? balance_pgdat+0x640/0x640
? __kthread_create_on_node+0x160/0x160
ret_from_fork+0x24/0x30
This is because btrfs_new_inode() calls new_inode() under the
transaction. We could probably move the new_inode() outside of this but
for now just wrap it in memalloc_nofs_save().
Reported-by: Zdenek Sojka <[email protected]>
Fixes: 712e36c5f2a7 ("btrfs: use GFP_KERNEL in btrfs_alloc_inode")
CC: [email protected] # 4.16+
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This fixes up the default ux500 CPU thermal zone:
- Set polling delay to 0 and explain why
- Set passive polling delay to 250
- Remove restrictions from the CPU cooling device,
we should use all cpufreq steps to cool down if
needed.
Link: https://lore.kernel.org/r/[email protected]
Fixes: b786a05f6ce4 ("ARM: dts: ux500: Update thermal zone")
Suggested-by: Daniel Lezcano <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Reviewed-by: Daniel Lezcano <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>
|
|
Currently, the command:
btrfs balance start -dconvert=single,soft .
on a Raspberry Pi produces the following kernel message:
BTRFS error (device mmcblk0p2): balance: invalid convert data profile single
This fails because we use is_power_of_2(unsigned long) to validate
the new data profile, the constant for 'single' profile uses bit 48,
and there are only 32 bits in a long on ARM.
Fix by open-coding the check using u64 variables.
Tested by completing the original balance command on several Raspberry
Pis.
Fixes: 818255feece6 ("btrfs: use common helper instead of open coding a bit test")
CC: [email protected] # 4.20+
Signed-off-by: Zygo Blaxell <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This patch is syncing driver with actual devicetree documentation:
Documentation/devicetree/bindings/net/qca,ar71xx.txt
|Optional subnodes:
|- mdio : specifies the mdio bus, used as a container for phy nodes
| according to phy.txt in the same directory
The driver was working with fixed phy without any noticeable issues. This bug
was uncovered by introducing dsa ar9331-switch driver.
Since no one reported this bug until now, I assume no body is using it
and this patch should not brake existing system.
Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jose Abreu says:
====================
net: stmmac: Fixes for -net
Misc fixes for -net tree. More info in commit logs.
v2 is just a rebase of v1 against -net and we added a new patch (09/09) to
fix RSS feature.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit b6b6cc9acd7b, changed the call to dwxgmac2_rss_write_reg()
passing it the variable cfg->key[i].
As key is an u8 but we write 32 bits at a time we need to cast it into
an u32 so that the correct key values are written. Notice that the for
loop already takes this into account so we don't try to write past the
keys size.
Fixes: b6b6cc9acd7b ("net: stmmac: selftest: avoid large stack usage")
Signed-off-by: Jose Abreu <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The sizeof(cfg->key) is != ARRAY_SIZE(cfg->key). Fix it. This warning is
triggered when running with cc flag -Wsizeof-array-div.
Reported-by: kbuild test robot <[email protected]>
Reported-by: Nick Desaulniers <[email protected]>
Reported-by: Nathan Chancellor <[email protected]>
Fixes: 76067459c686 ("net: stmmac: Implement RSS and enable it in XGMAC core")
Reviewed-by: Nick Desaulniers <[email protected]>
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We don't use it anyway as XGMAC only supports polling for timestamp (in
current SW implementation). This greatly reduces the system load by
reducing the number of interrupts.
Fixes: 2142754f8b9c ("net: stmmac: Add MAC related callbacks for XGMAC2")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If WoL is enabled we can't really stop the PHY, otherwise we will not
receive the WoL packet. Fix this by telling phylink that only the MAC is
down and only stop the PHY if WoL is not enabled.
Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The case for PTPV2_EVENT requires event packets to be captured so add
this setting to the list of enabled captures.
Fixes: 891434b18ec0 ("stmmac: add IEEE PTPv1 and PTPv2 support.")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We need to always update the MAC Hash Filter so that previous entries
are invalidated.
Found out while running stmmac selftests.
Fixes: b8ef7020d6e5 ("net: stmmac: add support for hash table size 128/256 in dwmac4")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Although some XGMAC setups support frames larger than DMA size, some of
them may not. As we can't know before-hand which ones support let's use
the maximum DMA buffer size in the Jumbo Tests.
User can always reconfigure the MTU to achieve larger frames.
Fixes: 427849e8c37f ("net: stmmac: selftests: Add Jumbo Frame tests")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Since commit b8ef7020d6e5 ("net: stmmac: add support for hash table size
128/256 in dwmac4"), we can detect the Hash Table dinamically.
Let's implement this feature in XGMAC cores and fix possible setups that
don't support the maximum size for Hash Table.
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Some setups may not have all Unicast addresses filters available. Let's
check this before trying to setup filters.
Fixes: 0efedbf11f07 ("net: stmmac: xgmac: Fix XGMAC selftests")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If any of the param or info_get op returns error, dumpit cb is
skipping to dump remaining params or info_get ops for all the
drivers.
Fix to not return if any of the param/info_get op returns error
as not supported and continue to dump remaining information.
v2: Modify the patch to return error, except for params/info_get
op that return -EOPNOTSUPP as suggested by Andrew Lunn. Also, modify
commit message to reflect the same.
Cc: Andrew Lunn <[email protected]>
Cc: Jiri Pirko <[email protected]>
Cc: Michael Chan <[email protected]>
Signed-off-by: Vasundhara Volam <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
of_node_put needs to be called when the device node which is got
from of_get_child_by_name finished using.
irq_domain_add_linear() also calls of_node_get() to increase refcount,
so irq_domain will not be affected when it is released.
Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Wen Yang <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Vivien Didelot <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
of_node_put needs to be called when the device node which is got
from of_get_child_by_name finished using.
In both cases of success and failure, we need to release 'ports',
so clean up the code using goto.
fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Wen Yang <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Cc: Microchip Linux Driver Support <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
As explained in the "net: sched: taprio: Avoid division by zero on
invalid link speed" commit, it is legal for the ethtool API to return
zero as a link speed. So guard against it to ensure we don't perform a
division by zero in kernel.
Fixes: e0a7683d30e9 ("net/sched: cbs: fix port_rate miscalculation")
Signed-off-by: Vladimir Oltean <[email protected]>
Acked-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The check in taprio_set_picos_per_byte is currently not robust enough
and will trigger this division by zero, due to e.g. PHYLINK not setting
kset->base.speed when there is no PHY connected:
[ 27.109992] Division by zero in kernel.
[ 27.113842] CPU: 1 PID: 198 Comm: tc Not tainted 5.3.0-rc5-01246-gc4006b8c2637-dirty #212
[ 27.121974] Hardware name: Freescale LS1021A
[ 27.126234] [<c03132e0>] (unwind_backtrace) from [<c030d8b8>] (show_stack+0x10/0x14)
[ 27.133938] [<c030d8b8>] (show_stack) from [<c10b21b0>] (dump_stack+0xb0/0xc4)
[ 27.141124] [<c10b21b0>] (dump_stack) from [<c10af97c>] (Ldiv0_64+0x8/0x18)
[ 27.148052] [<c10af97c>] (Ldiv0_64) from [<c0700260>] (div64_u64+0xcc/0xf0)
[ 27.154978] [<c0700260>] (div64_u64) from [<c07002d0>] (div64_s64+0x4c/0x68)
[ 27.161993] [<c07002d0>] (div64_s64) from [<c0f3d890>] (taprio_set_picos_per_byte+0xe8/0xf4)
[ 27.170388] [<c0f3d890>] (taprio_set_picos_per_byte) from [<c0f3f614>] (taprio_change+0x668/0xcec)
[ 27.179302] [<c0f3f614>] (taprio_change) from [<c0f2bc24>] (qdisc_create+0x1fc/0x4f4)
[ 27.187091] [<c0f2bc24>] (qdisc_create) from [<c0f2c0c8>] (tc_modify_qdisc+0x1ac/0x6f8)
[ 27.195055] [<c0f2c0c8>] (tc_modify_qdisc) from [<c0ee9604>] (rtnetlink_rcv_msg+0x268/0x2dc)
[ 27.203449] [<c0ee9604>] (rtnetlink_rcv_msg) from [<c0f4fef0>] (netlink_rcv_skb+0xe0/0x114)
[ 27.211756] [<c0f4fef0>] (netlink_rcv_skb) from [<c0f4f6cc>] (netlink_unicast+0x1b4/0x22c)
[ 27.219977] [<c0f4f6cc>] (netlink_unicast) from [<c0f4fa84>] (netlink_sendmsg+0x284/0x340)
[ 27.228198] [<c0f4fa84>] (netlink_sendmsg) from [<c0eae5fc>] (sock_sendmsg+0x14/0x24)
[ 27.235988] [<c0eae5fc>] (sock_sendmsg) from [<c0eaedf8>] (___sys_sendmsg+0x214/0x228)
[ 27.243863] [<c0eaedf8>] (___sys_sendmsg) from [<c0eb015c>] (__sys_sendmsg+0x50/0x8c)
[ 27.251652] [<c0eb015c>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
[ 27.259524] Exception stack(0xe8045fa8 to 0xe8045ff0)
[ 27.264546] 5fa0: b6f608c8 000000f8 00000003 bed7e2f0 00000000 00000000
[ 27.272681] 5fc0: b6f608c8 000000f8 004ce54c 00000128 5d3ce8c7 00000000 00000026 00505c9c
[ 27.280812] 5fe0: 00000070 bed7e298 004ddd64 b6dd1e64
Russell King points out that the ethtool API says zero is a valid return
value of __ethtool_get_link_ksettings:
* If it is enabled then they are read-only; if the link
* is up they represent the negotiated link mode; if the link is down,
* the speed is 0, %SPEED_UNKNOWN or the highest enabled speed and
* @duplex is %DUPLEX_UNKNOWN or the best enabled duplex mode.
So, it seems that taprio is not following the API... I'd suggest either
fixing taprio, or getting agreement to change the ethtool API.
The chosen path was to fix taprio.
Fixes: 7b9eba7ba0c1 ("net/sched: taprio: fix picos_per_byte miscalculation")
Signed-off-by: Vladimir Oltean <[email protected]>
Acked-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
BH must be disabled when invoking nf_conncount_gc_list() to perform
garbage collection, otherwise deadlock might happen.
nf_conncount_add+0x1f/0x50 [nf_conncount]
nft_connlimit_eval+0x4c/0xe0 [nft_connlimit]
nft_dynset_eval+0xb5/0x100 [nf_tables]
nft_do_chain+0xea/0x420 [nf_tables]
? sch_direct_xmit+0x111/0x360
? noqueue_init+0x10/0x10
? __qdisc_run+0x84/0x510
? tcp_packet+0x655/0x1610 [nf_conntrack]
? ip_finish_output2+0x1a7/0x430
? tcp_error+0x130/0x150 [nf_conntrack]
? nf_conntrack_in+0x1fc/0x4c0 [nf_conntrack]
nft_do_chain_ipv4+0x66/0x80 [nf_tables]
nf_hook_slow+0x44/0xc0
ip_rcv+0xb5/0xd0
? ip_rcv_finish_core.isra.19+0x360/0x360
__netif_receive_skb_one_core+0x52/0x70
netif_receive_skb_internal+0x34/0xe0
napi_gro_receive+0xba/0xe0
e1000_clean_rx_irq+0x1e9/0x420 [e1000e]
e1000e_poll+0xbe/0x290 [e1000e]
net_rx_action+0x149/0x3b0
__do_softirq+0xde/0x2d8
irq_exit+0xba/0xc0
do_IRQ+0x85/0xd0
common_interrupt+0xf/0xf
</IRQ>
RIP: 0010:nf_conncount_gc_list+0x3b/0x130 [nf_conncount]
Fixes: 2f971a8f4255 ("netfilter: nf_conncount: move all list iterations under spinlock")
Reported-by: Laura Garcia Liebana <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
commit 174e23810cd31
("sk_buff: drop all skb extensions on free and skb scrubbing") made napi
recycle always drop skb extensions. The additional skb_ext_del() that is
performed via nf_reset on napi skb recycle is not needed anymore.
Most nf_reset() calls in the stack are there so queued skb won't block
'rmmod nf_conntrack' indefinitely.
This removes the skb_ext_del from nf_reset, and renames it to a more
fitting nf_reset_ct().
In a few selected places, add a call to skb_ext_reset to make sure that
no active extensions remain.
I am submitting this for "net", because we're still early in the release
cycle. The patch applies to net-next too, but I think the rename causes
needless divergence between those trees.
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
We've historically had reports of being unable to mount file systems
because the tree log root couldn't be read. Usually this is the "parent
transid failure", but could be any of the related errors, including
"fsid mismatch" or "bad tree block", depending on which block got
allocated.
The modification of the individual log root items are serialized on the
per-log root root_mutex. This means that any modification to the
per-subvol log root_item is completely protected.
However we update the root item in the log root tree outside of the log
root tree log_mutex. We do this in order to allow multiple subvolumes
to be updated in each log transaction.
This is problematic however because when we are writing the log root
tree out we update the super block with the _current_ log root node
information. Since these two operations happen independently of each
other, you can end up updating the log root tree in between writing out
the dirty blocks and setting the super block to point at the current
root.
This means we'll point at the new root node that hasn't been written
out, instead of the one we should be pointing at. Thus whatever garbage
or old block we end up pointing at complains when we mount the file
system later and try to replay the log.
Fix this by copying the log's root item into a local root item copy.
Then once we're safely under the log_root_tree->log_mutex we update the
root item in the log_root_tree. This way we do not modify the
log_root_tree while we're committing it, fixing the problem.
CC: [email protected] # 4.4+
Reviewed-by: Chris Mason <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
When we have a buffered write that starts at an offset greater than or
equals to the file's size happening concurrently with a full ranged
fiemap, we can end up leaking an extent state structure.
Suppose we have a file with a size of 1Mb, and before the buffered write
and fiemap are performed, it has a single extent state in its io tree
representing the range from 0 to 1Mb, with the EXTENT_DELALLOC bit set.
The following sequence diagram shows how the memory leak happens if a
fiemap a buffered write, starting at offset 1Mb and with a length of
4Kb, are performed concurrently.
CPU 1 CPU 2
extent_fiemap()
--> it's a full ranged fiemap
range from 0 to LLONG_MAX - 1
(9223372036854775807)
--> locks range in the inode's
io tree
--> after this we have 2 extent
states in the io tree:
--> 1 for range [0, 1Mb[ with
the bits EXTENT_LOCKED and
EXTENT_DELALLOC_BITS set
--> 1 for the range
[1Mb, LLONG_MAX[ with
the EXTENT_LOCKED bit set
--> start buffered write at offset
1Mb with a length of 4Kb
btrfs_file_write_iter()
btrfs_buffered_write()
--> cached_state is NULL
lock_and_cleanup_extent_if_need()
--> returns 0 and does not lock
range because it starts
at current i_size / eof
--> cached_state remains NULL
btrfs_dirty_pages()
btrfs_set_extent_delalloc()
(...)
__set_extent_bit()
--> splits extent state for range
[1Mb, LLONG_MAX[ and now we
have 2 extent states:
--> one for the range
[1Mb, 1Mb + 4Kb[ with
EXTENT_LOCKED set
--> another one for the range
[1Mb + 4Kb, LLONG_MAX[ with
EXTENT_LOCKED set as well
--> sets EXTENT_DELALLOC on the
extent state for the range
[1Mb, 1Mb + 4Kb[
--> caches extent state
[1Mb, 1Mb + 4Kb[ into
@cached_state because it has
the bit EXTENT_LOCKED set
--> btrfs_buffered_write() ends up
with a non-NULL cached_state and
never calls anything to release its
reference on it, resulting in a
memory leak
Fix this by calling free_extent_state() on cached_state if the range was
not locked by lock_and_cleanup_extent_if_need().
The same issue can happen if anything else other than fiemap locks a range
that covers eof and beyond.
This could be triggered, sporadically, by test case generic/561 from the
fstests suite, which makes duperemove run concurrently with fsstress, and
duperemove does plenty of calls to fiemap. When CONFIG_BTRFS_DEBUG is set
the leak is reported in dmesg/syslog when removing the btrfs module with
a message like the following:
[77100.039461] BTRFS: state leak: start 6574080 end 6582271 state 16402 in tree 0 refs 1
Otherwise (CONFIG_BTRFS_DEBUG not set) detectable with kmemleak.
CC: [email protected] # 4.16+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A small list of fixes this time:
* two null pointer dereference fixes
* a fix for preempt-enabled/BHs-enabled (lockdep) splats
(that correctly pointed out a bug)
* a fix for multi-BSSID ordering assumptions
* a fix for the EDMG support, on-stack chandefs need to
be initialized properly (now that they're bigger)
* beacon (head) data from userspace should be validated
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When no other driver selects the devlink library code, ionic
produces a link failure:
drivers/net/ethernet/pensando/ionic/ionic_devlink.o: In function `ionic_devlink_alloc':
ionic_devlink.c:(.text+0xd): undefined reference to `devlink_alloc'
drivers/net/ethernet/pensando/ionic/ionic_devlink.o: In function `ionic_devlink_register':
ionic_devlink.c:(.text+0x71): undefined reference to `devlink_register'
Add the same 'select' statement that the other drivers use here.
Fixes: fbfb8031533c ("ionic: Add hardware init and device commands")
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Shannon Nelson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Resolving a couple of Sphinx documentation warnings
that are generated in the networking section.
- WARNING: document isn't included in any toctree
- WARNING: Title underline too short.
Signed-off-by: Adam Zerella <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add kselftest-all target to build tests from the top level
Makefile. This is to simplify kselftest use-cases for CI and
distributions where build and test systems are different.
Current kselftest target builds and runs tests on a development
system which is a developer use-case.
Add kselftest-install target to install tests from the top level
Makefile. This is to simplify kselftest use-cases for CI and
distributions where build and test systems are different.
This change addresses requests from developers and testers to add
support for installing kselftest from the main Makefile.
In addition, make the install directory the same when install is
run using "make kselftest-install" or by running kselftest_install.sh.
Also fix the INSTALL_PATH variable conflict between main Makefile and
selftests Makefile.
Signed-off-by: Shuah Khan <[email protected]>
Acked-by: Masahiro Yamada <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Always acquire tx descriptor spinlock even if a xdp program is not loaded
on the netsec device since ndo_xdp_xmit can run concurrently with
netsec_netdev_start_xmit and netsec_clean_tx_dring. This can happen
loading a xdp program on a different device (e.g virtio-net) and
xdp_do_redirect_map/xdp_do_redirect_slow can redirect to netsec even if
we do not have a xdp program on it.
Fixes: ba2b232108d3 ("net: netsec: add XDP support")
Tested-by: Ilias Apalodimas <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Reviewed-by: Ilias Apalodimas <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Drivers typically expect this, as it's the case for almost all cases
where this is called (i.e. from the TX path). Also, the code in mac80211
itself (if the driver calls ieee80211_tx_dequeue()) expects this as it
uses this_cpu_ptr() without additional protection.
This should fix various reports of the problem:
https://bugzilla.kernel.org/show_bug.cgi?id=204127
https://lore.kernel.org/linux-wireless/CAN5HydrWb3o_FE6A1XDnP1E+xS66d5kiEuhHfiGKkLNQokx13Q@mail.gmail.com/
https://lore.kernel.org/lkml/[email protected]/
Cc: [email protected]
Reported-and-tested-by: Jiri Kosina <[email protected]>
Reported-by: Aaron Hill <[email protected]>
Reported-by: Lukas Redlinger <[email protected]>
Reported-by: Oleksii Shevchuk <[email protected]>
Fixes: 21a5d4c3a45c ("mac80211: add stop/start logic for software TXQs")
Link: https://lore.kernel.org/r/1569928763-I3e8838c5ecad878e59d4a94eb069a90f6641461a@changeid
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
|
|
If the interface type is P2P_DEVICE or NAN, read the file of
'/sys/kernel/debug/ieee80211/phyx/netdev:wlanx/aqm' will get a
NULL pointer dereference. As for those interface type, the
pointer sdata->vif.txq is NULL.
Unable to handle kernel NULL pointer dereference at virtual address 00000011
CPU: 1 PID: 30936 Comm: cat Not tainted 4.14.104 #1
task: ffffffc0337e4880 task.stack: ffffff800cd20000
PC is at ieee80211_if_fmt_aqm+0x34/0xa0 [mac80211]
LR is at ieee80211_if_fmt_aqm+0x34/0xa0 [mac80211]
[...]
Process cat (pid: 30936, stack limit = 0xffffff800cd20000)
[...]
[<ffffff8000b7cd00>] ieee80211_if_fmt_aqm+0x34/0xa0 [mac80211]
[<ffffff8000b7c414>] ieee80211_if_read+0x60/0xbc [mac80211]
[<ffffff8000b7ccc4>] ieee80211_if_read_aqm+0x28/0x30 [mac80211]
[<ffffff80082eff94>] full_proxy_read+0x2c/0x48
[<ffffff80081eef00>] __vfs_read+0x2c/0xd4
[<ffffff80081ef084>] vfs_read+0x8c/0x108
[<ffffff80081ef494>] SyS_read+0x40/0x7c
Signed-off-by: Miaoqing Pan <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[trim useless data from commit message]
Signed-off-by: Johannes Berg <[email protected]>
|
|
If the interface is not in MESH mode, the command 'iw wlanx mpath del'
will cause kernel panic.
The root cause is null pointer access in mpp_flush_by_proxy(), as the
pointer 'sdata->u.mesh.mpp_paths' is NULL for non MESH interface.
Unable to handle kernel NULL pointer dereference at virtual address 00000068
[...]
PC is at _raw_spin_lock_bh+0x20/0x5c
LR is at mesh_path_del+0x1c/0x17c [mac80211]
[...]
Process iw (pid: 4537, stack limit = 0xd83e0238)
[...]
[<c021211c>] (_raw_spin_lock_bh) from [<bf8c7648>] (mesh_path_del+0x1c/0x17c [mac80211])
[<bf8c7648>] (mesh_path_del [mac80211]) from [<bf6cdb7c>] (extack_doit+0x20/0x68 [compat])
[<bf6cdb7c>] (extack_doit [compat]) from [<c05c309c>] (genl_rcv_msg+0x274/0x30c)
[<c05c309c>] (genl_rcv_msg) from [<c05c25d8>] (netlink_rcv_skb+0x58/0xac)
[<c05c25d8>] (netlink_rcv_skb) from [<c05c2e14>] (genl_rcv+0x20/0x34)
[<c05c2e14>] (genl_rcv) from [<c05c1f90>] (netlink_unicast+0x11c/0x204)
[<c05c1f90>] (netlink_unicast) from [<c05c2420>] (netlink_sendmsg+0x30c/0x370)
[<c05c2420>] (netlink_sendmsg) from [<c05886d0>] (sock_sendmsg+0x70/0x84)
[<c05886d0>] (sock_sendmsg) from [<c0589f4c>] (___sys_sendmsg.part.3+0x188/0x228)
[<c0589f4c>] (___sys_sendmsg.part.3) from [<c058add4>] (__sys_sendmsg+0x4c/0x70)
[<c058add4>] (__sys_sendmsg) from [<c0208c80>] (ret_fast_syscall+0x0/0x44)
Code: e2822c02 e2822001 e5832004 f590f000 (e1902f9f)
---[ end trace bbd717600f8f884d ]---
Signed-off-by: Miaoqing Pan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[trim useless data from commit message]
Signed-off-by: Johannes Berg <[email protected]>
|
|
In a few places we don't properly initialize on-stack chandefs,
resulting in EDMG data to be non-zero, which broke things.
Additionally, in a few places we rely on the driver to init the
data completely, but perhaps we shouldn't as non-EDMG drivers
may not initialize the EDMG data, also initialize it there.
Cc: [email protected]
Fixes: 2a38075cd0be ("nl80211: Add support for EDMG channels")
Reported-by: Dmitry Osipenko <[email protected]>
Tested-by: Dmitry Osipenko <[email protected]>
Link: https://lore.kernel.org/r/1569239475-I2dcce394ecf873376c386a78f31c2ec8b538fa25@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
The code copying the data assumes that the SSID element is
before the MBSSID element, but since the data is untrusted
from the AP, this cannot be guaranteed.
Validate that this is indeed the case and ignore the MBSSID
otherwise, to avoid having to deal with both cases for the
copy of data that should be between them.
Cc: [email protected]
Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning")
Link: https://lore.kernel.org/r/1569009255-I1673911f5eae02964e21bdc11b2bf58e5e207e59@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
We currently don't validate the beacon head, i.e. the header,
fixed part and elements that are to go in front of the TIM
element. This means that the variable elements there can be
malformed, e.g. have a length exceeding the buffer size, but
most downstream code from this assumes that this has already
been checked.
Add the necessary checks to the netlink policy.
Cc: [email protected]
Fixes: ed1b6cc7f80f ("cfg80211/nl80211: add beacon settings")
Link: https://lore.kernel.org/r/1569009255-I7ac7fbe9436e9d8733439eab8acbbd35e55c74ef@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
This reverts commit 7e64db1597fe114b83fe17d0ba96c6aa5fca419a.
The thin provisioning feature introduces an IOCTL and the discard support
to allow userspace tools and filesystems to release unused and previously
allocated space respectively.
During some internal performance improvements and further tests, the
release of allocated space revealed some issues that may lead to data
corruption in some configurations when filesystems are mounted with
discard support enabled.
While we're working on a fix and trying to clarify the situation,
this commit reverts the discard support for ESE volumes to prevent
potential data corruption.
Cc: <[email protected]> # 5.3
Signed-off-by: Stefan Haberland <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
It is possible that the CCW commands for reading volume and extent pool
information are not supported, either by the storage server (for
dedicated DASDs) or by z/VM (for virtual devices, such as MDISKs).
As a command reject will occur in such a case, the current error
handling leads to a failing online processing and thus the DASD can't be
used at all.
Since the data being read is not essential for an fully operational
DASD, the error handling can be removed. Information about the failing
command is sent to the s390dbf debug feature.
Fixes: c729696bcf8b ("s390/dasd: Recognise data for ESE volumes")
Cc: <[email protected]> # 5.3
Reported-by: Frank Heimes <[email protected]>
Signed-off-by: Jan Höppner <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
All system calls use struct __kernel_timespec instead of the old struct
timespec, but this one was just added with the old-style ABI. Change it
now to enforce the use of __kernel_timespec, avoiding ABI confusion and
the need for compat handlers on 32-bit architectures.
Any user space caller will have to use __kernel_timespec now, but this
is unambiguous and works for any C library regardless of the time_t
definition. A nicer way to specify the timeout would have been a less
ambiguous 64-bit nanosecond value, but I suppose it's too late now to
change that as this would impact both 32-bit and 64-bit users.
Fixes: 5262f567987d ("io_uring: IORING_OP_TIMEOUT support")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The loop driver assumes that if the passed in fd is opened with
O_DIRECT, the caller wants to use direct I/O on the loop device.
However, if the underlying block device has a different block size than
the loop block queue, direct I/O can't be enabled. Instead of requiring
userspace to manually change the blocksize and re-enable direct I/O,
just change the queue block sizes to match, as well as the io_min size.
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Martijn Coenen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch fixes the lock inversion complaint:
============================================
WARNING: possible recursive locking detected
5.3.0-rc7-dbg+ #1 Not tainted
--------------------------------------------
kworker/u16:6/171 is trying to acquire lock:
00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm]
but task is already holding lock:
00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&id_priv->handler_mutex);
lock(&id_priv->handler_mutex);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u16:6/171:
#0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0
#1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0
#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
stack backtrace:
CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: iw_cm_wq cm_work_handler [iw_cm]
Call Trace:
dump_stack+0x8a/0xd6
__lock_acquire.cold+0xe1/0x24d
lock_acquire+0x106/0x240
__mutex_lock+0x12e/0xcb0
mutex_lock_nested+0x1f/0x30
rdma_destroy_id+0x78/0x4a0 [rdma_cm]
iw_conn_req_handler+0x5c9/0x680 [rdma_cm]
cm_work_handler+0xe62/0x1100 [iw_cm]
process_one_work+0x56d/0xac0
worker_thread+0x7a/0x5d0
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30
This is not a bug as there are actually two lock classes here.
Link: https://lore.kernel.org/r/[email protected]
Fixes: de910bd92137 ("RDMA/cma: Simplify locking needed for serialization of callbacks")
Signed-off-by: Bart Van Assche <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|