Age | Commit message (Collapse) | Author | Files | Lines |
|
Avoid processing bogus interrupt statuses when the HW is runtime suspended and
the M_CAN_IR register read may get all bits 1's. Handler can be called if the
interrupt request is shared with other peripherals or at the end of free_irq().
Therefore check the runtime suspended status before processing.
Fixes: cdf8259d6573 ("can: m_can: Add PM Support")
Signed-off-by: Jarkko Nikula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Patch 541656d3a513 ("gfs2: freeze should work on read-only mounts") changed
the check for glock state in function freeze_go_sync() from "gl->gl_state
== LM_ST_SHARED" to "gl->gl_req == LM_ST_EXCLUSIVE". That's wrong and it
regressed gfs2's freeze/thaw mechanism because it caused only the freezing
node (which requests the glock in EX) to queue freeze work.
All nodes go through this go_sync code path during the freeze to drop their
SHared hold on the freeze glock, allowing the freezing node to acquire it
in EXclusive mode. But all the nodes must freeze access to the file system
locally, so they ALL must queue freeze work. The freeze_work calls
freeze_func, which makes a request to reacquire the freeze glock in SH,
effectively blocking until the thaw from the EX holder. Once thawed, the
freezing node drops its EX hold on the freeze glock, then the (blocked)
freeze_func reacquires the freeze glock in SH again (on all nodes, including
the freezer) so all nodes go back to a thawed state.
This patch changes the check back to gl_state == LM_ST_SHARED like it was
prior to 541656d3a513.
Fixes: 541656d3a513 ("gfs2: freeze should work on read-only mounts")
Cc: [email protected] # v5.8+
Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
|
|
flexcan_transceiver_enable() during bus-off recovery
If the CAN controller goes into bus off, the do_set_mode() callback with
CAN_MODE_START can be used to recover the controller, which then calls
flexcan_chip_start(). If configured, this is done automatically by the
framework or manually by the user.
In flexcan_chip_start() there is an explicit call to
flexcan_transceiver_enable(), which does a regulator_enable() on the
transceiver regulator. This results in a net usage counter increase, as there
is no corresponding flexcan_transceiver_disable() in the bus off code path.
This further leads to the transceiver stuck enabled, even if the CAN interface
is shut down.
To fix this problem the
flexcan_transceiver_enable()/flexcan_transceiver_disable() are moved out of
flexcan_chip_start()/flexcan_chip_stop() into flexcan_open()/flexcan_close().
Fixes: e955cead0311 ("CAN: Add Flexcan CAN controller driver")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Use correct bittiming limits for the KCAN CAN controller.
Fixes: aec5fb2268b7 ("can: kvaser_usb: Add support for Kvaser USB hydra family")
Signed-off-by: Jimmy Assarsson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Use correct bittiming limits for the KCAN CAN controller.
Fixes: 26ad340e582d ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices")
Signed-off-by: Jimmy Assarsson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Checking for ifdef CONFIG_x fails if CONFIG_x=m.
Use IS_ENABLED instead, which is true for both built-ins and modules.
Otherwise, a
> ip -4 route add 1.2.3.4/32 via inet6 fe80::2 dev eth1
fails with the message "Error: IPv6 support not enabled in kernel." if
CONFIG_IPV6 is `m`.
In the spirit of b8127113d01e53adba15b41aefd37b90ed83d631.
Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes")
Cc: Kim Phillips <[email protected]>
Signed-off-by: Florian Klink <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The code refactoring of ILT configuration was not complete, the old
unused variables were used for the SRC block. That could lead to the memory
corruption by HW when rx filters are configured.
This patch completes that refactoring.
Fixes: 8a52bbab39c9 (qed: Debug feature: ilt and mdump)
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Dmitry Bogdanov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
nlmsg_cancel() needs to be called in the error path of
inet_req_diag_fill to cancel the message.
Fixes: d545caca827b ("net: inet: diag: expose the socket mark to privileged processes.")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add missing define of ALIGN_DOWN to make the test build and run. In
addition, __sg_alloc_table_from_pages now support unaligned maximum
segment, so adapt the test result accordingly.
Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation of SG table from pages")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Maor Gottlieb <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
When skb has a frag_list its possible for skb_to_sgvec() to fail. This
happens when the scatterlist has fewer elements to store pages than would
be needed for the initial skb plus any of its frags.
This case appears rare, but is possible when running an RX parser/verdict
programs exposed to the internet. Currently, when this happens we throw
an error, break the pipe, and kfree the msg. This effectively breaks the
application or forces it to do a retry.
Lets catch this case and handle it by doing an skb_linearize() on any
skb we receive with frags. At this point skb_to_sgvec should not fail
because the failing conditions would require frags to be in place.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/160556576837.73229.14800682790808797635.stgit@john-XPS-13-9370
|
|
If the skb_verdict_prog redirects an skb knowingly to itself, fix your
BPF program this is not optimal and an abuse of the API please use
SK_PASS. That said there may be cases, such as socket load balancing,
where picking the socket is hashed based or otherwise picks the same
socket it was received on in some rare cases. If this happens we don't
want to confuse userspace giving them an EAGAIN error if we can avoid
it.
To avoid double accounting in these cases. At the moment even if the
skb has already been charged against the sockets rcvbuf and forward
alloc we check it again and do set_owner_r() causing it to be orphaned
and recharged. For one this is useless work, but more importantly we
can have a case where the skb could be put on the ingress queue, but
because we are under memory pressure we return EAGAIN. The trouble
here is the skb has already been accounted for so any rcvbuf checks
include the memory associated with the packet already. This rolls
up and can result in unnecessary EAGAIN errors in userspace read()
calls.
Fix by doing an unlikely check and skipping checks if skb->sk == sk.
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/160556574804.73229.11328201020039674147.stgit@john-XPS-13-9370
|
|
If a socket redirects to itself and it is under memory pressure it is
possible to get a socket stuck so that recv() returns EAGAIN and the
socket can not advance for some time. This happens because when
redirecting a skb to the same socket we received the skb on we first
check if it is OK to enqueue the skb on the receiving socket by checking
memory limits. But, if the skb is itself the object holding the memory
needed to enqueue the skb we will keep retrying from kernel side
and always fail with EAGAIN. Then userspace will get a recv() EAGAIN
error if there are no skbs in the psock ingress queue. This will continue
until either some skbs get kfree'd causing the memory pressure to
reduce far enough that we can enqueue the pending packet or the
socket is destroyed. In some cases its possible to get a socket
stuck for a noticeable amount of time if the socket is only receiving
skbs from sk_skb verdict programs. To reproduce I make the socket
memory limits ridiculously low so sockets are always under memory
pressure. More often though if under memory pressure it looks like
a spurious EAGAIN error on user space side causing userspace to retry
and typically enough has moved on the memory side that it works.
To fix skip memory checks and skb_orphan if receiving on the same
sock as already assigned.
For SK_PASS cases this is easy, its always the same socket so we
can just omit the orphan/set_owner pair.
For backlog cases we need to check skb->sk and decide if the orphan
and set_owner pair are needed.
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/160556572660.73229.12566203819812939627.stgit@john-XPS-13-9370
|
|
We use skb->size with sk_rmem_scheduled() which is not correct. Instead
use truesize to align with socket and tcp stack usage of sk_rmem_schedule.
Suggested-by: Daniel Borkman <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/160556570616.73229.17003722112077507863.stgit@john-XPS-13-9370
|
|
Fix sockmap sk_skb programs so that they observe sk_rcvbuf limits. This
allows users to tune SO_RCVBUF and sockmap will honor them.
We can refactor the if(charge) case out in later patches. But, keep this
fix to the point.
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Suggested-by: Jakub Sitnicki <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/160556568657.73229.8404601585878439060.stgit@john-XPS-13-9370
|
|
If copy_page_to_iter() fails or even partially completes, but with fewer
bytes copied than expected we currently reset sg.start and return EFAULT.
This proves problematic if we already copied data into the user buffer
before we return an error. Because we leave the copied data in the user
buffer and fail to unwind the scatterlist so kernel side believes data
has been copied and user side believes data has _not_ been received.
Expected behavior should be to return number of bytes copied and then
on the next read we need to return the error assuming its still there. This
can happen if we have a copy length spanning multiple scatterlist elements
and one or more complete before the error is hit.
The error is rare enough though that my normal testing with server side
programs, such as nginx, httpd, envoy, etc., I have never seen this. The
only reliable way to reproduce that I've found is to stream movies over
my browser for a day or so and wait for it to hang. Not very scientific,
but with a few extra WARN_ON()s in the code the bug was obvious.
When we review the errors from copy_page_to_iter() it seems we are hitting
a page fault from copy_page_to_iter_iovec() where the code checks
fault_in_pages_writeable(buf, copy) where buf is the user buffer. It
also seems typical server applications don't hit this case.
The other way to try and reproduce this is run the sockmap selftest tool
test_sockmap with data verification enabled, but it doesn't reproduce the
fault. Perhaps we can trigger this case artificially somehow from the
test tools. I haven't sorted out a way to do that yet though.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/160556566659.73229.15694973114605301063.stgit@john-XPS-13-9370
|
|
In async_resync mode, we log the TCP seq of records until the async request
is completed. Later, in case one of the logged seqs matches the resync
request, we return it, together with its record serial number. Before this
fix, we mistakenly returned the serial number of the current record
instead.
Fixes: ed9b7646b06a ("net/tls: Add asynchronous resync")
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Boris Pismenny <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If THIS_MODULE is not set, the module would be removed while debugfs is
being used.
It eventually makes kernel panic.
Fixes: 82c93a87bf8b ("netdevsim: implement couple of testing devlink health reporters")
Fixes: 424be63ad831 ("netdevsim: add UDP tunnel port offload support")
Fixes: 4418f862d675 ("netdevsim: implement support for devlink region and snapshots")
Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata")
Signed-off-by: Taehee Yoo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Due to a hardware issue, an access to MDIO registers
that is concurrent with other ENETC register accesses
may lead to the MDIO access being dropped or corrupted.
The workaround introduces locking for all register accesses
to the ENETC register space. To reduce performance impact,
a readers-writers locking scheme has been implemented.
The writer in this case is the MDIO access code (irrelevant
whether that MDIO access is a register read or write), and
the reader is any access code to non-MDIO ENETC registers.
Also, the datapath functions acquire the read lock fewer times
and use _hot accessors. All the rest of the code uses the _wa
accessors which lock every register access.
The commit introducing MDIO support is -
commit ebfcb23d62ab ("enetc: Add ENETC PF level external MDIO support")
but due to subsequent refactoring this patch is applicable on
top of a later commit.
Fixes: 6517798dd343 ("enetc: Make MDIO accessors more generic and export to include/linux/fsl")
Signed-off-by: Alex Marginean <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: Claudiu Manoil <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"A fix for use-after-free in the Sun keyboard driver, a fix to firmware
updates on newer ICs in the Elan touchpad diver, and a couple misc
driver fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - fix firmware update on newer ICs
Input: resistive-adc-touch - fix kconfig dependency on IIO_BUFFER
Input: sunkbd - avoid use-after-free in teardown paths
Input: i8042 - allow insmod to succeed on devices without an i8042 controller
Input: adxl34x - clean up a data type in adxl34x_probe()
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: aedd133d17bc ("net/mlx5e: Support CT offload for tc nic flows")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Avoid calling mlx5_esw_modify_vport_rate() if qos is not enabled and
avoid unnecessary syndrome messages from firmware.
Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Currently when QoS is enabled for VF and any min_rate is configured,
the driver sets bw_share value to at least 1 and doesn’t allow to set
it to 0 to make minimal rate unlimited. It means there is always a
minimal rate configured for every VF, even if user tries to remove it.
In order to make QoS disable possible, check whether all vports have
configured min_rate = 0. If this is true, set their bw_share to 0 to
disable min_rate limitations.
Fixes: c9497c98901c ("net/mlx5: Add support for setting VF min rate")
Signed-off-by: Vladyslav Tarasiuk <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Currently, if user disables VFs with some min and max rates configured,
they are cleared. But QoS data is not cleared and restored upon next VF
enable placing limits on minimal rate for given VF, when user expects
none.
To match cleared vport->info struct with QoS-related min and max rates
upon VF disable, clear vport->qos struct too.
Fixes: 556b9d16d3f5 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
Signed-off-by: Vladyslav Tarasiuk <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Handle destruction of rules with port destination type to enable
full destruction of flow.
Without this handling of TX rules the deletion of these rules fails.
Dmesg of flow destruction failure:
[ 203.714146] mlx5_core 0000:00:0b.0: mlx5_cmd_check:753:(pid 342): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x144b7a)
[ 210.547387] ------------[ cut here ]------------
[ 210.548663] refcount_t: decrement hit 0; leaking memory.
[ 210.550651] WARNING: CPU: 4 PID: 342 at lib/refcount.c:31 refcount_warn_saturate+0x5c/0x110
[ 210.550654] Modules linked in: mlx5_ib mlx5_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core
[ 210.550675] CPU: 4 PID: 342 Comm: test Not tainted 5.8.0-rc2+ #116
[ 210.550678] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 210.550680] RIP: 0010:refcount_warn_saturate+0x5c/0x110
[ 210.550685] Code: c6 d1 1b 01 00 0f 84 ad 00 00 00 5b 5d c3 80 3d b5 d1 1b 01 00 75 f4 48 c7 c7 20 d1 15 82 c6 05 a5 d1 1b 01 01 e8 a7 eb af ff <0f> 0b eb dd 80 3d 99 d1 1b 01 00 75 d4 48 c7 c7 c0 cf 15 82 c6 05
[ 210.550687] RSP: 0018:ffff8881642e77e8 EFLAGS: 00010282
[ 210.550691] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
[ 210.550694] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102c85ceef
[ 210.550696] RBP: ffff888161720428 R08: ffffffff8124c10e R09: ffffed103243beae
[ 210.550698] R10: ffff8881921df56b R11: ffffed103243bead R12: ffff8881841b4180
[ 210.550701] R13: ffff888161720428 R14: ffff8881616d0000 R15: ffff888161720380
[ 210.550704] FS: 00007fc27f025740(0000) GS:ffff888192000000(0000) knlGS:0000000000000000
[ 210.550706] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 210.550708] CR2: 0000557e4b41a6a0 CR3: 0000000002415004 CR4: 0000000000360ea0
[ 210.550711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 210.550713] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 210.550715] Call Trace:
[ 210.550717] mlx5_del_flow_rules+0x484/0x490 [mlx5_core]
[ 210.550720] ? mlx5_cmd_set_fte+0xa80/0xa80 [mlx5_core]
[ 210.550722] mlx5_ib_destroy_flow+0x17f/0x280 [mlx5_ib]
[ 210.550724] uverbs_free_flow+0x4c/0x90 [ib_uverbs]
[ 210.550726] destroy_hw_idr_uobject+0x41/0xb0 [ib_uverbs]
[ 210.550728] uverbs_destroy_uobject+0xaa/0x390 [ib_uverbs]
[ 210.550731] __uverbs_cleanup_ufile+0x129/0x1b0 [ib_uverbs]
[ 210.550733] ? uverbs_destroy_uobject+0x390/0x390 [ib_uverbs]
[ 210.550735] uverbs_destroy_ufile_hw+0x78/0x190 [ib_uverbs]
[ 210.550737] ib_uverbs_close+0x36/0x140 [ib_uverbs]
[ 210.550739] __fput+0x181/0x380
[ 210.550741] task_work_run+0x88/0xd0
[ 210.550743] do_exit+0x5f6/0x13b0
[ 210.550745] ? sched_clock_cpu+0x30/0x140
[ 210.550747] ? is_current_pgrp_orphaned+0x70/0x70
[ 210.550750] ? lock_downgrade+0x360/0x360
[ 210.550752] ? mark_held_locks+0x1d/0x90
[ 210.550754] do_group_exit+0x8a/0x140
[ 210.550756] get_signal+0x20a/0xf50
[ 210.550758] do_signal+0x8c/0xbe0
[ 210.550760] ? hrtimer_nanosleep+0x1d8/0x200
[ 210.550762] ? nanosleep_copyout+0x50/0x50
[ 210.550764] ? restore_sigcontext+0x320/0x320
[ 210.550766] ? __hrtimer_init+0xf0/0xf0
[ 210.550768] ? timespec64_add_safe+0x150/0x150
[ 210.550770] ? mark_held_locks+0x1d/0x90
[ 210.550772] ? lockdep_hardirqs_on_prepare+0x14c/0x240
[ 210.550774] __prepare_exit_to_usermode+0x119/0x170
[ 210.550776] do_syscall_64+0x65/0x300
[ 210.550778] ? trace_hardirqs_off+0x10/0x120
[ 210.550781] ? mark_held_locks+0x1d/0x90
[ 210.550783] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[ 210.550785] ? lockdep_hardirqs_on+0x112/0x190
[ 210.550787] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 210.550789] RIP: 0033:0x7fc27f1cd157
[ 210.550791] Code: Bad RIP value.
[ 210.550793] RSP: 002b:00007ffd4db27ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000023
[ 210.550798] RAX: fffffffffffffdfc RBX: ffffffffffffff80 RCX: 00007fc27f1cd157
[ 210.550800] RDX: 00007fc27f025740 RSI: 00007ffd4db27eb0 RDI: 00007ffd4db27eb0
[ 210.550803] RBP: 0000000000000016 R08: 0000000000000000 R09: 000000000000000e
[ 210.550805] R10: 00007ffd4db27dc7 R11: 0000000000000246 R12: 0000000000400c00
[ 210.550808] R13: 00007ffd4db285f0 R14: 0000000000000000 R15: 0000000000000000
[ 210.550809] irq event stamp: 49399
[ 210.550812] hardirqs last enabled at (49399): [<ffffffff81172d36>] console_unlock+0x556/0x6f0
[ 210.550815] hardirqs last disabled at (49398): [<ffffffff81172897>] console_unlock+0xb7/0x6f0
[ 210.550818] softirqs last enabled at (48706): [<ffffffff81e0037b>] __do_softirq+0x37b/0x60c
[ 210.550820] softirqs last disabled at (48697): [<ffffffff81c00e2f>] asm_call_on_stack+0xf/0x20
[ 210.550822] ---[ end trace ad18c0e6fa846454 ]---
[ 210.581862] mlx5_core 0000:00:0c.0: mlx5_destroy_flow_table:2132:(pid 342): Flow table 262150 wasn't destroyed, refcount > 1
Fixes: a7ee18bdee83 ("RDMA/mlx5: Allow creating a matcher for a NIC TX flow table")
Signed-off-by: Michael Guralnik <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Reviewed-by: Maor Gottlieb <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Bond events handler uses bond_slave_get_rtnl to check if net device
is bond slave. bond_slave_get_rtnl return the rcu rx_handler pointer
from the netdev which exists for bond slaves but also exists for
devices that are attached to linux bridge so using it as indication
for bond slave is wrong.
Fix by using netif_is_lag_port instead.
Fixes: 7e51891a237f ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
Signed-off-by: Maor Dickman <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Reviewed-by: Ariel Levkovich <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Both TC and IPsec crypto offload use metadata_regB to store
private information. Since TC does not use bit 31 of regB, IPsec
will use bit 31 as the IPsec packet marker. The IPsec's regB usage
is changed to:
Bit31: IPsec marker
Bit30-24: IPsec syndrome
Bit23-0: IPsec obj id
Fixes: b2ac7541e377 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
Signed-off-by: Huy Nguyen <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Reviewed-by: Ariel Levkovich <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The IP's checksum partial still requires L4 csum flag on Ethernet WQE.
Make the IPsec WAs only for the IP's non checksum partial case
(for example icmd packet)
Fixes: 5be019040cb7 ("net/mlx5e: IPsec: Add Connect-X IPsec Tx data path offload")
Signed-off-by: Huy Nguyen <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Reviewed-by: Alaa Hleihel <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
On resync, the driver calls inet_lookup_established
(__inet6_lookup_established) that increases sk_refcnt of the socket. To
decrease it, the driver set skb->destructor to sock_edemux. However, it
didn't work well, because the TCP stack also sets this destructor for
early demux, and the refcount gets decreased only once, while increased
two times (in mlx5e and in the TCP stack). It leads to a socket leak, a
TLS context leak, which in the end leads to calling tls_dev_del twice:
on socket close and on driver unload, which in turn leads to a crash.
This commit fixes the refcount leak by calling sock_gen_put right away
after using the socket, thus fixing all the subsequent issues.
Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- fix system call exit path; avoid return to user space with any
TIF/CIF/PIF set
- fix file permission for cpum_sfb_size parameter
- another small defconfig update
* tag 's390-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cpum_sf.c: fix file permission for cpum_sfb_size
s390: update defconfigs
s390: fix system call exit path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix bug preventing booting on several platforms
- fix for build error, when modules need has_transparent_hugepage
- fix for memleak in alchemy clk setup
* tag 'mips_fixes_5.10_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Alchemy: Fix memleak in alchemy_clk_setup_cpu
MIPS: kernel: Fix for_each_memblock conversion
MIPS: export has_transparent_hugepage() for modules
|
|
During loss recovery, retransmitted packets are forced to use TCP
timestamps to calculate the RTT samples, which have a millisecond
granularity. BBR is designed using a microsecond granularity. As a
result, multiple RTT samples could be truncated to the same RTT value
during loss recovery. This is problematic, as BBR will not enter
PROBE_RTT if the RTT sample is <= the current min_rtt sample, meaning
that if there are persistent losses, PROBE_RTT will constantly be
pushed off and potentially never re-entered. This patch makes sure
that BBR enters PROBE_RTT by checking if RTT sample is < the current
min_rtt sample, rather than <=.
The Netflix transport/TCP team discovered this bug in the Linux TCP
BBR code during lab tests.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Ryan Sharpelletti <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When removing the driver we would hit BUG_ON(!list_empty(&dev->ptype_specific))
in net/core/dev.c due to still having the NC-SI packet handler
registered.
# echo 1e660000.ethernet > /sys/bus/platform/drivers/ftgmac100/unbind
------------[ cut here ]------------
kernel BUG at net/core/dev.c:10254!
Internal error: Oops - BUG: 0 [#1] SMP ARM
CPU: 0 PID: 115 Comm: sh Not tainted 5.10.0-rc3-next-20201111-00007-g02e0365710c4 #46
Hardware name: Generic DT based system
PC is at netdev_run_todo+0x314/0x394
LR is at cpumask_next+0x20/0x24
pc : [<806f5830>] lr : [<80863cb0>] psr: 80000153
sp : 855bbd58 ip : 00000001 fp : 855bbdac
r10: 80c03d00 r9 : 80c06228 r8 : 81158c54
r7 : 00000000 r6 : 80c05dec r5 : 80c05d18 r4 : 813b9280
r3 : 813b9054 r2 : 8122c470 r1 : 00000002 r0 : 00000002
Flags: Nzcv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none
Control: 00c5387d Table: 85514008 DAC: 00000051
Process sh (pid: 115, stack limit = 0x7cb5703d)
...
Backtrace:
[<806f551c>] (netdev_run_todo) from [<80707eec>] (rtnl_unlock+0x18/0x1c)
r10:00000051 r9:854ed710 r8:81158c54 r7:80c76bb0 r6:81158c10 r5:8115b410
r4:813b9000
[<80707ed4>] (rtnl_unlock) from [<806f5db8>] (unregister_netdev+0x2c/0x30)
[<806f5d8c>] (unregister_netdev) from [<805a8180>] (ftgmac100_remove+0x20/0xa8)
r5:8115b410 r4:813b9000
[<805a8160>] (ftgmac100_remove) from [<805355e4>] (platform_drv_remove+0x34/0x4c)
Fixes: bd466c3fb5a4 ("net/faraday: Support NCSI mode")
Signed-off-by: Joel Stanley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 39a6f4bce6b4 ("b44: replace the ssb_dma API with the generic DMA API")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Changzhong <[email protected]>
Reviewed-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix file corruption due to event deletion in 'perf inject'.
- Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem
memcpy', silencing perf build warning.
- Avoid an msan warning in a copied stack in 'perf test'.
- Correct tracepoint field name "flags" in ARM's CS-ETM hardware
tracing 'perf test' entry.
- Update branch sample pattern for cs-etm to cope with excluding guest
in userspace counting.
- Don't free "lock_seq_stat" if read_count isn't zero in 'perf lock'.
* tag 'perf-tools-fixes-for-v5.10-2020-11-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf test: Avoid an msan warning in a copied stack.
perf inject: Fix file corruption due to event deletion
perf test: Update branch sample pattern for cs-etm
perf test: Fix a typo in cs-etm testing
tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
perf lock: Don't free "lock_seq_stat" if read_count isn't zero
perf lock: Correct field name "flags"
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 469981b17a4f ("qed: Add unaligned and packed packet processing")
Fixes: fcb39f6c10b2 ("qed: Add mpa buffer descriptors for storing and processing mpa fpdus")
Fixes: 1e28eaad07ea ("qed: Add iWARP support for fpdu spanned over more than two tcp packets")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Changzhong <[email protected]>
Acked-by: Michal Kalderon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"A single commit that fixes a bug that was introduced a couple of merge
windows ago, but which rather more recently converged to an
agreed-upon fix. The bug is that interrupts can be incorrectly enabled
while holding an irq-disabled spinlock. This can of course result in
self-deadlocks.
The bug is a bit difficult to trigger. It requires that a preempted
task be blocking a preemptible-RCU grace period long enough to trigger
an RCU CPU stall warning. In addition, an interrupt must occur at just
the right time, and that interrupt's handler must acquire that same
irq-disabled spinlock. Still, a deadlock is a deadlock.
Furthermore, we do now have a fix, and that fix survives kernel test
robot, -next, and rcutorture testing. It has also been verified by
Sebastian as fixing the bug. Therefore..."
* 'urgent-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled
|
|
If the calls to of_match_device(), of_alias_get_id(),
devm_ioremap_resource(), devm_regmap_init_mmio() or devm_clk_get()
fail on probe of the NPCM FIU SPI driver, the spi_controller struct is
erroneously not freed.
Fix by switching over to the new devm_spi_alloc_master() helper.
Fixes: ace55c411b11 ("spi: npcm-fiu: add NPCM FIU controller driver")
Signed-off-by: Lukas Wunner <[email protected]>
Cc: <[email protected]> # v5.4+: 5e844cc37a5c: spi: Introduce device-managed SPI controller allocation
Cc: <[email protected]> # v5.4+
Cc: Tomer Maimon <[email protected]>
Link: https://lore.kernel.org/r/a420c23a363a3bc9aa684c6e790c32a8af106d17.1605512876.git.lukas@wunner.de
Signed-off-by: Mark Brown <[email protected]>
|
|
It turns out the IRQs most like can be unmasked before the controller is
enabled with no problematic consequences. The manual doesn't explicitly
state that, but the examples perform the controller initialization
procedure in that order. So the commit da8f58909e7e ("spi: dw: Unmask IRQs
after enabling the chip") hasn't been that required as I thought. But
anyway setting the IRQs up after the chip enabling still worth adding
since it has simplified the code a bit. The problem is that it has
introduced a potential bug. The transfer handler pointer is now
initialized after the IRQs are enabled. That may and eventually will cause
an invalid or uninitialized callback invocation. Fix that just by
performing the callback initialization before the IRQ unmask procedure.
Fixes: da8f58909e7e ("spi: dw: Unmask IRQs after enabling the chip")
Signed-off-by: Serge Semin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
When adding __user annotations in commit 2adf5352a34a, the
strncpy_from_user() function declaration for the
CONFIG_GENERIC_STRNCPY_FROM_USER case was missed. Fix it.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Laurent Pinchart <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Max Filippov <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull cpufreq-arm fixes for 5.10-rc5 from Viresh Kumar:
"- tegra186: Fix ->get() callback.
- arm/scmi: Add dummy clock provider to fix failure."
* 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: scmi: Fix OPP addition failure with a dummy clock provider
cpufreq: tegra186: Fix get frequency callback
|
|
If the clk_register fails, we should free h before
function returns to prevent memleak.
Fixes: 474402291a0ad ("MIPS: Alchemy: clock framework integration of onchip clocks")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Qilong <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
|
|
The loop over all memblocks works with PFNs and not physical
addresses, so we need for_each_mem_pfn_range().
Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()")
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
|
|
Commit dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return
-EPROBE_DEFER") handles -EPROBE_DEFER for the clock/interconnects within
_allocate_opp_table() which is called from dev_pm_opp_add and it
now propagates the error back to the caller.
SCMI performance domain re-used clock bindings to keep it simple. However
with the above mentioned change, if clock property is present in a device
node, opps fails to get added with below errors until clk_get succeeds.
cpu0: failed to add opp 450000000Hz
cpu0: failed to add opps to the device
....(errors on cpu1-cpu4)
cpu5: failed to add opp 450000000Hz
cpu5: failed to add opps to the device
So, in order to fix the issue, we need to register dummy clock provider.
With the dummy clock provider, clk_get returns NULL(no errors!), then opp
core proceeds to add OPPs for the CPUs.
Cc: Rafael J. Wysocki <[email protected]>
Fixes: dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER")
Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
Commit b89c01c96051 ("cpufreq: tegra186: Fix initial frequency")
implemented the CPUFREQ 'get' callback to determine the current
operating frequency for each CPU. This implementation used a simple
looked up to determine the current operating frequency. The problem
with this is that frequency table for different Tegra186 devices may
vary and so the default boot frequency for Tegra186 device may or may
not be present in the frequency table. If the default boot frequency is
not present in the frequency table, this causes the function
tegra186_cpufreq_get() to return 0 and in turn causes cpufreq_online()
to fail which prevents CPUFREQ from working.
Fix this by always calculating the CPU frequency based upon the current
'ndiv' setting for the CPU. Note that the CPU frequency for Tegra186 is
calculated by reading the current 'ndiv' setting, multiplying by the
CPU reference clock and dividing by a constant divisor.
Fixes: b89c01c96051 ("cpufreq: tegra186: Fix initial frequency")
Signed-off-by: Jon Hunter <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
Michael Chan says:
====================
bnxt_en: Bug fixes.
This first patch fixes a module eeprom A2h addressing issue. The next
2 patches fix counter related issues. The last one skips an
unsupported firmware call on the VF to avoid the error log.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
VFs do not have access permissions to issue NVM_GET_DEV_INFO
firmware command.
Fixes: 4933f6753b50 ("bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info.")
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
bnxt_add_one_ctr() adds a hardware counter to a software counter and
adjusts for the hardware counter wraparound against the mask. The logic
assumes that the hardware counter is always smaller than or equal to
the mask.
This assumption is mostly correct. But in some cases if the firmware
is older and does not provide the accurate mask, the driver can use
a mask that is smaller than the actual hardware mask. This can cause
some extra carry bits to be added to the software counter, resulting in
counters that far exceed the actual value. Fix it by masking the
hardware counter with the mask passed into bnxt_add_one_ctr().
Fixes: fea6b3335527 ("bnxt_en: Accumulate all counters.")
Reviewed-by: Vasundhara Volam <[email protected]>
Reviewed-by: Pavan Chebbi <[email protected]>
Reviewed-by: Edwin Peer <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Firmware is unable to retain the port counters during any kind of
fatal or non-fatal resets, so we must clear the port counters to
avoid false detection of port counter overflow.
Fixes: fea6b3335527 ("bnxt_en: Accumulate all counters.")
Reviewed-by: Edwin Peer <[email protected]>
Reviewed-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The module eeprom address range returned by bnxt_get_module_eeprom()
should be 256 bytes of A0h address space, the lower half of the A2h
address space, and page 0 for the upper half of the A2h address space.
Fix the firmware call by passing page_number 0 for the A2h slave address
space.
Fixes: 42ee18fe4ca2 ("bnxt_en: Add Support for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPRO")
Signed-off-by: Edwin Peer <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Transactions sit on one of several lists, depending on their state
(allocated, pending, complete, or polled). A spinlock protects
against concurrent access when transactions are moved between these
lists.
Transactions are also reference counted. A newly-allocated
transaction has an initial count of 1; a transaction is released in
gsi_trans_free() only if its decremented reference count reaches 0.
Releasing a transaction includes removing it from the polled (or if
unused, allocated) list, so the spinlock is acquired when we release
a transaction.
The reference count is used to allow a caller to synchronously wait
for a committed transaction to complete. In this case, the waiter
takes an extra reference to the transaction *before* committing it
(so it won't be freed), and releases its reference (calls
gsi_trans_free()) when it is done with it.
Similarly, gsi_channel_update() takes an extra reference to ensure a
transaction isn't released before the function is done operating on
it. Until the transaction is moved to the completed list (by this
function) it won't be freed, so this reference is taken "safely."
But in the quiesce path, we want to wait for the "last" transaction,
which we find in the completed or polled list. Transactions on
these lists can be freed at any time, so we (try to) prevent that
by taking the reference while holding the spinlock.
Currently gsi_trans_free() decrements a transaction's reference
count unconditionally, acquiring the lock to remove the transaction
from its list *only* when the count reaches 0. This does not
protect the quiesce path, which depends on the lock to ensure its
extra reference prevents release of the transaction.
Fix this by only dropping the last reference to a transaction
in gsi_trans_free() while holding the spinlock.
Fixes: 9dd441e4ed575 ("soc: qcom: ipa: GSI transactions")
Reported-by: Stephen Boyd <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|