Age | Commit message (Collapse) | Author | Files | Lines |
|
Commit 0ed881027690 ("scsi: storvsc: setup 1:1 mapping between hardware
queue and CPU queue") introduced a regression for disks attached to
IDE. For these disks the host VSP only offers one VMBUS channel. Setting
multiple queues can overload the VMBUS channel and result in performance
drop for high queue depth workload on system with large number of CPUs.
Fix it by leaving the number of hardware queues to 1 (default value) for
IDE disks.
Fixes: 0ed881027690 ("scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue")
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Long Li <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
gcc -O3 warns that some local variables are not properly initialized:
drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_hang_notify':
drivers/scsi/fnic/vnic_dev.c:511:16: error: 'a0' is used uninitialized in this function [-Werror=uninitialized]
vdev->args[0] = *a0;
~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:691:6: note: 'a0' was declared here
u64 a0, a1;
^~
drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized]
vdev->args[1] = *a1;
~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:691:10: note: 'a1' was declared here
u64 a0, a1;
^~
drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_mac_addr':
drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized]
vdev->args[1] = *a1;
~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:698:10: note: 'a1' was declared here
u64 a0, a1;
^~
Apparently the code relies on the local variables occupying adjacent memory
locations in the same order, but this is of course not guaranteed.
Use an array of two u64 variables where needed to make it work correctly.
I suspect there is also an endianness bug here, but have not digged in deep
enough to be sure.
Fixes: 5df6d737dd4b ("[SCSI] fnic: Add new Cisco PCI-Express FCoE HBA")
Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The code in secondary_park is currently placed in the .init section. The
kernel reclaims and clears this code when it finishes booting. That
causes the cores parked in it to go to somewhere unpredictable, so we
move this function out of init to make sure the cores stay looping there.
The instruction bgeu a0, t0, .Lsecondary_park may have "a relocation
truncated to fit" issue during linking time. It is because that sections
are too far to jump. Let's use tail to jump to the .Lsecondary_park.
Signed-off-by: Greentime Hu <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Cc: Andreas Schwab <[email protected]>
Cc: [email protected]
Fixes: 76d2a0493a17d ("RISC-V: Init and Halt Code")
Signed-off-by: Paul Walmsley <[email protected]>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.5-2020-01-15:
- Update golden settings for renoir
- eDP fix
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
read_current_link_settings_on_detect() on eDP 1.4+ may use the
edp_supported_link_rates table which is set up by
detect_edp_sink_caps(), so that function needs to be called first.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Mario Kleiner <[email protected]>
Cc: Martin Leung <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Update mmSDMA0_UTCL1_WATERMK golden setting for renoir.
Signed-off-by: Aaron Liu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
John Fastabend says:
====================
To date our usage of sockmap/tls has been fairly simple, the BPF programs
did only well-defined pop, push, pull and apply/cork operations.
Now that we started to push more complex programs into sockmap we uncovered
a series of issues addressed here. Further OpenSSL3.0 version should be
released soon with kTLS support so its important to get any remaining
issues on BPF and kTLS support resolved.
Additionally, I have a patch under development to allow sockmap to be
enabled/disabled at runtime for Cilium endpoints. This allows us to stress
the map insert/delete with kTLS more than previously where Cilium only
added the socket to the map when it entered ESTABLISHED state and never
touched it from the control path side again relying on the sockets own
close() hook to remove it.
To test I have a set of test cases in test_sockmap.c that expose these
issues. Once we get fixes here merged and in bpf-next I'll submit the
tests to bpf-next tree to ensure we don't regress again. Also I've run
these patches in the Cilium CI with OpenSSL (master branch) this will
run tools such as netperf, ab, wrk2, curl, etc. to get a broad set of
testing.
I'm aware of two more issues that we are working to resolve in another
couple (probably two) patches. First we see an auth tag corruption in
kTLS when sending small 1byte chunks under stress. I've not pinned this
down yet. But, guessing because its under 1B stress tests it must be
some error path being triggered. And second we need to ensure BPF RX
programs are not skipped when kTLS ULP is loaded. This breaks some of the
sockmap selftests when running with kTLS. I'll send a follow up for this.
v2: I dropped a patch that added !0 size check in tls_push_record
this originated from a panic I caught awhile ago with a trace
in the crypto stack. But I can not reproduce it anymore so will
dig into that and send another patch later if needed. Anyways
after a bit of thought it would be nicer if tls/crypto/bpf didn't
require special case handling for the !0 size.
====================
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
When user returns SK_DROP we need to reset the number of copied bytes
to indicate to the user the bytes were dropped and not sent. If we
don't reset the copied arg sendmsg will return as if those bytes were
copied giving the user a positive return value.
This works as expected today except in the case where the user also
pops bytes. In the pop case the sg.size is reduced but we don't correctly
account for this when copied bytes is reset. The popped bytes are not
accounted for and we return a small positive value potentially confusing
the user.
The reason this happens is due to a typo where we do the wrong comparison
when accounting for pop bytes. In this fix notice the if/else is not
needed and that we have a similar problem if we push data except its not
visible to the user because if delta is larger the sg.size we return a
negative value so it appears as an error regardless.
Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Jonathan Lemon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Its possible through a set of push, pop, apply helper calls to construct
a skmsg, which is just a ring of scatterlist elements, with the start
value larger than the end value. For example,
end start
|_0_|_1_| ... |_n_|_n+1_|
Where end points at 1 and start points and n so that valid elements is
the set {n, n+1, 0, 1}.
Currently, because we don't build the correct chain only {n, n+1} will
be sent. This adds a check and sg_chain call to correctly submit the
above to the crypto and tls send path.
Fixes: d3b18ad31f93d ("tls: add bpf support to sk_msg handling")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Jonathan Lemon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
It is possible to build a plaintext buffer using push helper that is larger
than the allocated encrypt buffer. When this record is pushed to crypto
layers this can result in a NULL pointer dereference because the crypto
API expects the encrypt buffer is large enough to fit the plaintext
buffer. Kernel splat below.
To resolve catch the cases this can happen and split the buffer into two
records to send individually. Unfortunately, there is still one case to
handle where the split creates a zero sized buffer. In this case we merge
the buffers and unmark the split. This happens when apply is zero and user
pushed data beyond encrypt buffer. This fixes the original case as well
because the split allocated an encrypt buffer larger than the plaintext
buffer and the merge simply moves the pointers around so we now have
a reference to the new (larger) encrypt buffer.
Perhaps its not ideal but it seems the best solution for a fixes branch
and avoids handling these two cases, (a) apply that needs split and (b)
non apply case. The are edge cases anyways so optimizing them seems not
necessary unless someone wants later in next branches.
[ 306.719107] BUG: kernel NULL pointer dereference, address: 0000000000000008
[...]
[ 306.747260] RIP: 0010:scatterwalk_copychunks+0x12f/0x1b0
[...]
[ 306.770350] Call Trace:
[ 306.770956] scatterwalk_map_and_copy+0x6c/0x80
[ 306.772026] gcm_enc_copy_hash+0x4b/0x50
[ 306.772925] gcm_hash_crypt_remain_continue+0xef/0x110
[ 306.774138] gcm_hash_crypt_continue+0xa1/0xb0
[ 306.775103] ? gcm_hash_crypt_continue+0xa1/0xb0
[ 306.776103] gcm_hash_assoc_remain_continue+0x94/0xa0
[ 306.777170] gcm_hash_assoc_continue+0x9d/0xb0
[ 306.778239] gcm_hash_init_continue+0x8f/0xa0
[ 306.779121] gcm_hash+0x73/0x80
[ 306.779762] gcm_encrypt_continue+0x6d/0x80
[ 306.780582] crypto_gcm_encrypt+0xcb/0xe0
[ 306.781474] crypto_aead_encrypt+0x1f/0x30
[ 306.782353] tls_push_record+0x3b9/0xb20 [tls]
[ 306.783314] ? sk_psock_msg_verdict+0x199/0x300
[ 306.784287] bpf_exec_tx_verdict+0x3f2/0x680 [tls]
[ 306.785357] tls_sw_sendmsg+0x4a3/0x6a0 [tls]
test_sockmap test signature to trigger bug,
[TEST]: (1, 1, 1, sendmsg, pass,redir,start 1,end 2,pop (1,2),ktls,):
Fixes: d3b18ad31f93d ("tls: add bpf support to sk_msg handling")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Jonathan Lemon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Leaving an incorrect end mark in place when passing to crypto
layer will cause crypto layer to stop processing data before
all data is encrypted. To fix clear the end mark on push
data instead of expecting users of the helper to clear the
mark value after the fact.
This happens when we push data into the middle of a skmsg and
have room for it so we don't do a set of copies that already
clear the end flag.
Fixes: 6fff607e2f14b ("bpf: sk_msg program helper bpf_msg_push_data")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
In the push, pull, and pop helpers operating on skmsg objects to make
data writable or insert/remove data we use this bounds check to ensure
specified data is valid,
/* Bounds checks: start and pop must be inside message */
if (start >= offset + l || last >= msg->sg.size)
return -EINVAL;
The problem here is offset has already included the length of the
current element the 'l' above. So start could be past the end of
the scatterlist element in the case where start also points into an
offset on the last skmsg element.
To fix do the accounting slightly different by adding the length of
the previous entry to offset at the start of the iteration. And
ensure its initialized to zero so that the first iteration does
nothing.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Fixes: 6fff607e2f14b ("bpf: sk_msg program helper bpf_msg_push_data")
Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
When sockmap sock with TLS enabled is removed we cleanup bpf/psock state
and call tcp_update_ulp() to push updates to TLS ULP on top. However, we
don't push the write_space callback up and instead simply overwrite the
op with the psock stored previous op. This may or may not be correct so
to ensure we don't overwrite the TLS write space hook pass this field to
the ULP and have it fixup the ctx.
This completes a previous fix that pushed the ops through to the ULP
but at the time missed doing this for write_space, presumably because
write_space TLS hook was added around the same time.
Fixes: 95fa145479fbc ("bpf: sockmap/tls, close can race with map free")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Acked-by: Jonathan Lemon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The sock_map_free() and sock_hash_free() paths used to delete sockmap
and sockhash maps walk the maps and destroy psock and bpf state associated
with the socks in the map. When done the socks no longer have BPF programs
attached and will function normally. This can happen while the socks in
the map are still "live" meaning data may be sent/received during the walk.
Currently, though we don't take the sock_lock when the psock and bpf state
is removed through this path. Specifically, this means we can be writing
into the ops structure pointers such as sendmsg, sendpage, recvmsg, etc.
while they are also being called from the networking side. This is not
safe, we never used proper READ_ONCE/WRITE_ONCE semantics here if we
believed it was safe. Further its not clear to me its even a good idea
to try and do this on "live" sockets while networking side might also
be using the socket. Instead of trying to reason about using the socks
from both sides lets realize that every use case I'm aware of rarely
deletes maps, in fact kubernetes/Cilium case builds map at init and
never tears it down except on errors. So lets do the simple fix and
grab sock lock.
This patch wraps sock deletes from maps in sock lock and adds some
annotations so we catch any other cases easier.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
When a sockmap is free'd and a socket in the map is enabled with tls
we tear down the bpf context on the socket, the psock struct and state,
and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform
the tls stack it needs to update its saved sock ops so that when the tls
socket is later destroyed it doesn't try to call the now destroyed psock
hooks.
This is about keeping stacked ULPs in good shape so they always have
the right set of stacked ops.
However, recently unhash() hook was removed from TLS side. But, the
sockmap/bpf side is not doing any extra work to update the unhash op
when is torn down instead expecting TLS side to manage it. So both
TLS and sockmap believe the other side is managing the op and instead
no one updates the hook so it continues to point at tcp_bpf_unhash().
When unhash hook is called we call tcp_bpf_unhash() which detects the
psock has already been destroyed and calls sk->sk_prot_unhash() which
calls tcp_bpf_unhash() yet again and so on looping and hanging the core.
To fix have sockmap tear down logic fixup the stale pointer.
Fixes: 5d92e631b8be ("net/tls: partially revert fix transition through disconnect with close")
Reported-by: [email protected]
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Review of the recently added documentation file for the qed driver
noticed a couple of typos. Fix them now.
Noticed-by: Michal Kalderon <[email protected]>
Fixes: 0f261c3ca09e ("devlink: add a driver-specific file for the qed driver")
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
use sk_bound_dev_if for route lookup as already done
in most of the other ip_route_output_ports() calls.
Since most PPPoA providers use 10.0.0.138 as default gateway IP
this will allow connections to multiple PPTP providers with the
same IP address over different interfaces.
Signed-off-by: Ulrich Weber <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jose Abreu says:
====================
net: stmmac: Fix selftests in Synopsys AXS101 board
Set of fixes for sefltests so that they work in Synopsys AXS101 board.
Final output:
$ ethtool -t eth0
The test result is PASS
The test extra info:
1. MAC Loopback 0
2. PHY Loopback -95
3. MMC Counters 0
4. EEE -95
5. Hash Filter MC 0
6. Perfect Filter UC 0
7. MC Filter 0
8. UC Filter 0
9. Flow Control -95
10. RSS -95
11. VLAN Filtering -95
12. VLAN Filtering (perf) -95
13. Double VLAN Filter -95
14. Double VLAN Filter (perf) -95
15. Flexible RX Parser -95
16. SA Insertion (desc) -95
17. SA Replacement (desc) -95
18. SA Insertion (reg) -95
19. SA Replacement (reg) -95
20. VLAN TX Insertion -95
21. SVLAN TX Insertion -95
22. L3 DA Filtering -95
23. L3 SA Filtering -95
24. L4 DA TCP Filtering -95
25. L4 SA TCP Filtering -95
26. L4 DA UDP Filtering -95
27. L4 SA UDP Filtering -95
28. ARP Offload -95
29. Jumbo Frame 0
30. Multichannel Jumbo -95
31. Split Header -95
Description:
1) Fixes the unaligned accesses that caused CPU halt in Synopsys AXS101
boards.
2) Fixes the VLAN tests when filtering failed to work.
3) Fixes the VLAN Perfect tests when filtering is not available in HW.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When HW does not support perfect filtering the feature will not be
enabled in the net_device. Add a check for this to prevent failures.
Fixes: 1b2250a04c1f ("net: stmmac: selftests: Add tests for VLAN Perfect Filtering")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When the VLAN ID does not match the expected one it means filter failed
in HW. Fix it.
Fixes: 94e18382003c ("net: stmmac: selftests: Add selftest for VLAN TX Offload")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Synopsys AXS101 boards do not support unaligned memory loads or stores.
Change the selftests mechanism to explicity:
- Not add extra alignment in TX SKB
- Use the unaligned version of ether_addr_equal()
Fixes: 091810dbded9 ("net: stmmac: Introduce selftests support")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Array utdm_info is declared as an array of MAX_HDLC_NUM (4) elements
however up to UCC_MAX_NUM (8) elements are potentially being written
to it. Currently we have an array out-of-bounds write error on the
last 4 elements. Fix this by making utdm_info UCC_MAX_NUM elements in
size.
Addresses-Coverity: ("Out-of-bounds write")
Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- fix typo and kerneldocs, by Sven Eckelmann
- use WiFi txbitrate for B.A.T.M.A.N. V as fallback, by René Treffer
- silence some endian sparse warnings by adding annotations,
by Sven Eckelmann
- Update copyright years to 2020, by Sven Eckelmann
- Disable deprecated sysfs configuration by default, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
[Why]
Noticed this while testing MST with the 4 ports MST hub from
StarTech.com. Sometimes can't light up monitors normally and get the
error message as 'sideband msg build failed'.
Look into aux transactions, found out that source sometimes will send
out another down request before receiving the down reply of the
previous down request. On the other hand, in drm_dp_get_one_sb_msg(),
current code doesn't handle the interleaved replies case. Hence, source
can't build up message completely and can't light up monitors.
[How]
For good compatibility, enforce source to send out one down request at a
time. Add a flag, is_waiting_for_dwn_reply, to determine if the source
can send out a down request immediately or not.
- Check the flag before calling process_single_down_tx_qlock to send out
a msg
- Set the flag when successfully send out a down request
- Clear the flag when successfully build up a down reply
- Clear the flag when find erros during sending out a down request
- Clear the flag when find errors during building up a down reply
- Clear the flag when timeout occurs during waiting for a down reply
- Use drm_dp_mst_kick_tx() to try to send another down request in queue
at the end of drm_dp_mst_wait_tx_reply() (attempt to send out messages
in queue when errors occur)
Cc: Lyude Paul <[email protected]>
Signed-off-by: Wayne Lin <[email protected]>
Signed-off-by: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- Fix DAT candidate selection on little endian systems,
by Sven Eckelmann
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Anatoly has been fuzzing with kBdysch harness and reported a hang in one
of the outcomes:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (85) call bpf_get_socket_cookie#46
1: R0_w=invP(id=0) R10=fp0
1: (57) r0 &= 808464432
2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
2: (14) w0 -= 810299440
3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
3: (c4) w0 s>>= 1
4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
4: (76) if w0 s>= 0x30303030 goto pc+216
221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
221: (95) exit
processed 6 insns (limit 1000000) [...]
Taking a closer look, the program was xlated as follows:
# ./bpftool p d x i 12
0: (85) call bpf_get_socket_cookie#7800896
1: (bf) r6 = r0
2: (57) r6 &= 808464432
3: (14) w6 -= 810299440
4: (c4) w6 s>>= 1
5: (76) if w6 s>= 0x30303030 goto pc+216
6: (05) goto pc-1
7: (05) goto pc-1
8: (05) goto pc-1
[...]
220: (05) goto pc-1
221: (05) goto pc-1
222: (95) exit
Meaning, the visible effect is very similar to f54c7898ed1c ("bpf: Fix
precision tracking for unbounded scalars"), that is, the fall-through
branch in the instruction 5 is considered to be never taken given the
conclusion from the min/max bounds tracking in w6, and therefore the
dead-code sanitation rewrites it as goto pc-1. However, real-life input
disagrees with verification analysis since a soft-lockup was observed.
The bug sits in the analysis of the ARSH. The definition is that we shift
the target register value right by K bits through shifting in copies of
its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the
register into 32 bit mode, same happens after simulating the operation.
However, for the case of simulating the actual ARSH, we don't take the
mode into account and act as if it's always 64 bit, but location of sign
bit is different:
dst_reg->smin_value >>= umin_val;
dst_reg->smax_value >>= umin_val;
dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val);
Consider an unknown R0 where bpf_get_socket_cookie() (or others) would
for example return 0xffff. With the above ARSH simulation, we'd see the
following results:
[...]
1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0
1: (85) call bpf_get_socket_cookie#46
2: R0_w=invP(id=0) R10=fp0
2: (57) r0 &= 808464432
-> R0_runtime = 0x3030
3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
3: (14) w0 -= 810299440
-> R0_runtime = 0xcfb40000
4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
(0xffffffff)
4: (c4) w0 s>>= 1
-> R0_runtime = 0xe7da0000
5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
(0x67c00000) (0x7ffbfff8)
[...]
In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011
0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that
is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly
retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into
0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000'
and means here that the simulation didn't retain the sign bit. With above
logic, the updates happen on the 64 bit min/max bounds and given we coerced
the register, the sign bits of the bounds are cleared as well, meaning, we
need to force the simulation into s32 space for 32 bit alu mode.
Verification after the fix below. We're first analyzing the fall-through branch
on 32 bit signed >= test eventually leading to rejection of the program in this
specific case:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (b7) r2 = 808464432
1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0
1: (85) call bpf_get_socket_cookie#46
2: R0_w=invP(id=0) R10=fp0
2: (bf) r6 = r0
3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0
3: (57) r6 &= 808464432
4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
4: (14) w6 -= 810299440
5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
5: (c4) w6 s>>= 1
6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
(0x67c00000) (0xfffbfff8)
6: (76) if w6 s>= 0x30303030 goto pc+216
7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
7: (30) r0 = *(u8 *)skb[808464432]
BPF_LD_[ABS|IND] uses reserved fields
processed 8 insns (limit 1000000) [...]
Fixes: 9cbe1f5a32dc ("bpf/verifier: improve register value range tracking with ARSH")
Reported-by: Anatoly Trosinenko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
kmemleak detects the following memory leak when hot removing
a network device:
unreferenced object 0xffff888083f63600 (size 256):
comm "kworker/0:1", pid 12, jiffies 4294831717 (age 1113.676s)
hex dump (first 32 bytes):
00 40 c7 33 80 88 ff ff 00 00 00 00 10 00 00 00 [email protected]............
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
backtrace:
[<00000000d4a8f5be>] rndis_filter_device_add+0x117/0x11c0 [hv_netvsc]
[<000000009c02d75b>] netvsc_probe+0x5e7/0xbf0 [hv_netvsc]
[<00000000ddafce23>] vmbus_probe+0x74/0x170 [hv_vmbus]
[<00000000046e64f1>] really_probe+0x22f/0xb50
[<000000005cc35eb7>] driver_probe_device+0x25e/0x370
[<0000000043c642b2>] bus_for_each_drv+0x11f/0x1b0
[<000000005e3d09f0>] __device_attach+0x1c6/0x2f0
[<00000000a72c362f>] bus_probe_device+0x1a6/0x260
[<0000000008478399>] device_add+0x10a3/0x18e0
[<00000000cf07b48c>] vmbus_device_register+0xe7/0x1e0 [hv_vmbus]
[<00000000d46cf032>] vmbus_add_channel_work+0x8ab/0x1770 [hv_vmbus]
[<000000002c94bb64>] process_one_work+0x919/0x17d0
[<0000000096de6781>] worker_thread+0x87/0xb40
[<00000000fbe7397e>] kthread+0x333/0x3f0
[<000000004f844269>] ret_from_fork+0x3a/0x50
rndis_filter_device_add() allocates an instance of struct rndis_device
which never gets deallocated as rndis_filter_device_remove() sets
net_device->extension which points to the rndis_device struct to NULL,
leaving the rndis_device dangling.
Since net_device->extension is eventually freed in free_netvsc_device(),
we refrain from setting it to NULL inside rndis_filter_device_remove()
Signed-off-by: Mohammed Gamal <[email protected]>
Reviewed-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When the packet pointed to by retransmit_skb_hint is unlinked by ACK,
retransmit_skb_hint will be set to NULL in tcp_clean_rtx_queue().
If packet loss is detected at this time, retransmit_skb_hint will be set
to point to the current packet loss in tcp_verify_retransmit_hint(),
then the packets that were previously marked lost but not retransmitted
due to the restriction of cwnd will be skipped and cannot be
retransmitted.
To fix this, when retransmit_skb_hint is NULL, retransmit_skb_hint can
be reset only after all marked lost packets are retransmitted
(retrans_out >= lost_out), otherwise we need to traverse from
tcp_rtx_queue_head in tcp_xmit_retransmit_queue().
Packetdrill to demonstrate:
// Disable RACK and set max_reordering to keep things simple
0 `sysctl -q net.ipv4.tcp_recovery=0`
+0 `sysctl -q net.ipv4.tcp_max_reordering=3`
// Establish a connection
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <...>
+.01 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4
// Send 8 data segments
+0 write(4, ..., 8000) = 8000
+0 > P. 1:8001(8000) ack 1
// Enter recovery and 1:3001 is marked lost
+.01 < . 1:1(0) ack 1 win 257 <sack 3001:4001,nop,nop>
+0 < . 1:1(0) ack 1 win 257 <sack 5001:6001 3001:4001,nop,nop>
+0 < . 1:1(0) ack 1 win 257 <sack 5001:7001 3001:4001,nop,nop>
// Retransmit 1:1001, now retransmit_skb_hint points to 1001:2001
+0 > . 1:1001(1000) ack 1
// 1001:2001 was ACKed causing retransmit_skb_hint to be set to NULL
+.01 < . 1:1(0) ack 2001 win 257 <sack 5001:8001 3001:4001,nop,nop>
// Now retransmit_skb_hint points to 4001:5001 which is now marked lost
// BUG: 2001:3001 was not retransmitted
+0 > . 2001:3001(1000) ack 1
Signed-off-by: Pengcheng Yang <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Tested-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
snd_seq_info_timer_read() reads the information of the timer assigned
for each queue, but it's done in a racy way which may lead to UAF as
spotted by syzkaller.
This patch applies the missing q->timer_mutex lock while accessing the
timer object as well as a slight code change to adapt the standard
coding style.
Reported-by: [email protected]
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Intel Software Developer's Manual, volume 3, chapter 9.11.6 says:
"Note that the microcode update must be aligned on a 16-byte boundary
and the size of the microcode update must be 1-KByte granular"
When early-load Intel microcode is loaded from initramfs, userspace tool
'iucode_tool' has already 16-byte aligned those microcode bits in that
initramfs image. Image that was created something like this:
iucode_tool --write-earlyfw=FOO.cpio microcode-files...
However, when early-load Intel microcode is loaded from built-in
firmware BLOB using CONFIG_EXTRA_FIRMWARE= kernel config option, that
16-byte alignment is not guaranteed.
Fix this by forcing all built-in firmware BLOBs to 16-byte alignment.
[ If we end up having other firmware with much bigger alignment
requirements, we might need to introduce some method for the firmware
to specify it, this is the minimal "just increase the alignment a bit
to account for this one special case" patch - Linus ]
Signed-off-by: Jari Ruusu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull arch/nios2 fixlet from Ley Foon Tan:
"Update my nios2 maintainer email"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
MAINTAINERS: Update Ley Foon Tan's email address
|
|
When handling devm_gpiod_get_optional() errors, free the memory already
allocated. This fixes Smatch warnings:
drivers/i2c/busses/i2c-iop3xx.c:437 iop3xx_i2c_probe() warn: possible memory leak of 'new_adapter'
drivers/i2c/busses/i2c-iop3xx.c:442 iop3xx_i2c_probe() warn: possible memory leak of 'new_adapter'
Fixes: fdb7e884ad61 ("i2c: iop: Use GPIO descriptors")
Reported-by: kbuild test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fixes from Andy Shevchenko:
- Fix keyboard brightness control for ASUS laptops
- Better handling parameters of GPD pocket fan module to avoid
thermal shock
- Add IDs to PMC platform driver to support Intel Comet Lake
- Fix potential dead lock in Mellanox TM FIFO driver and ABI
documentation
* tag 'platform-drivers-x86-v5.5-3' of git://git.infradead.org/linux-platform-drivers-x86:
Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfaces
Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs interfaces
platform/x86: asus-wmi: Fix keyboard brightness cannot be set to 0
platform/x86: intel_pmc_core: update Comet Lake platform driver
platform/x86: GPD pocket fan: Allow somewhat lower/higher temperature limits
platform/x86: GPD pocket fan: Use default values when wrong modparams are given
platform/mellanox: fix potential deadlock in the tmfifo driver
platform/x86: intel-ips: Use the correct style for SPDX License Identifier
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a build problem for the hisilicon driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: hisilicon/sec2 - Use atomics instead of __sync
|
|
Pull vfs fixes from Al Viro:
"Fixes for mountpoint_last() bugs (by converting to use of
lookup_last()) and an autofs regression fix from this cycle (caused by
follow_managed() breakage introduced in barrier fixes series)"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix autofs regression caused by follow_managed() changes
reimplement path_mountpoint() with less magic
|
|
One of the recent Tegra I2C commits made a change that resumes runtime PM
during driver's probe, but it missed to put the RPM in a case of error.
Note that it's not correct to use pm_runtime_status_suspended because it
breaks RPM refcounting.
Fixes: 8ebf15e9c869 ("i2c: tegra: Move suspend handling to NOIRQ phase")
Cc: <[email protected]> # v5.4+
Tested-by: Thierry Reding <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
I noticed that sometime I2C clock is kept enabled during suspend-resume.
This happens because runtime PM defers dynamic suspension and thus it may
happen that runtime PM is in active state when system enters into suspend.
In particular I2C controller that is used for CPU's DVFS is often kept ON
during suspend because CPU's voltage scaling happens quite often.
Fixes: 8ebf15e9c869 ("i2c: tegra: Move suspend handling to NOIRQ phase")
Cc: <[email protected]> # v5.4+
Tested-by: Thierry Reding <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
null_zone_write() only allows writing empty and implicitly opened zones.
Writing to closed and explicitly opened zones must also be allowed and
the zone condition must be transitioned to implicit open if the zone
is not explicitly opened already.
Fixes: da644b2cc1a4 ("null_blk: add zone open, close, and finish support")
Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Nikolay Aleksandrov says:
====================
net: bridge: add vlan notifications and rtm support
This patch-set is a prerequisite for adding per-vlan options support
because we need to be able to send vlan-only notifications and do larger
vlan netlink dumps. Per-vlan options are needed as we move the control
more to vlans and would like to add per-vlan state (needed for per-vlan
STP and EVPN), per-vlan multicast options and control, and I'm sure
there would be many more per-vlan options coming.
Now we create/delete/dump vlans with the device AF_SPEC attribute which is
fine since we support vlan ranges or use a compact bridge_vlan_info
structure, but that cannot really be extended to support per-vlan options
well. The biggest issue is dumping them - we tried using the af_spec with
a new vlan option attribute but that led to insufficient message size
quickly, also another minor problem with that is we have to dump all vlans
always when notifying which, with options present, can be huge if they have
different options set, so we decided to add new rtm message types
specifically for vlans and register handlers for them and a new bridge vlan
notification nl group for vlan-only notifications.
The new RTM NEW/DEL/GETVLAN types introduced match the current af spec
bridge functionality and in fact use the same helpers.
The new nl format is:
[BRIDGE_VLANDB_ENTRY]
[BRIDGE_VLANDB_ENTRY_INFO] - bridge_vlan_info (either 1 vlan or
range start)
[BRIDGE_VLANDB_ENTRY_RANGE] - range end
This allows to encapsulate a range in a single attribute and also to
create vlans and immediately set options on all of them with a single
attribute. The GETVLAN dump can span multiple messages and dump all the
necessary information. The vlan-only notifications are sent on
NEW/DELVLAN events or when vlan options change (currently only flags),
we try hard to compress the vlans into ranges in the notifications as
well. When the per-vlan options are added we'll add helpers to check for
option equality between neighbor vlans and will keep compressing them
when possible.
Note patch 02 is not really required, it's just a nice addition to have
human-readable error messages from the different vlan checks.
iproute2 changes and selftests will be sent with the next set which adds
the first per-vlan option - per-vlan state similar to the port state.
v2: changed patch 03 and patch 04 to use nlmsg_parse() in order to
strictly validate the msg and make sure there are no remaining bytes
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Now that we can notify, send a notification on add/del or change of flags.
Notifications are also compressed when possible to reduce their number
and relieve user-space of extra processing, due to that we have to
manually notify after each add/del in order to avoid double
notifications. We try hard to notify only about the vlans which actually
changed, thus a single command can result in multiple notifications
about disjoint ranges if there were vlans which didn't change inside.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a new rtnetlink group for bridge vlan notifications - RTNLGRP_BRVLAN
and add support for sending vlan notifications (both single and ranges).
No functional changes intended, the notification support will be used by
later patches.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a new vlandb nl attribute - BRIDGE_VLANDB_ENTRY_RANGE which causes
RTM_NEWVLAN/DELVAN to act on a range. Dumps now automatically compress
similar vlans into ranges. This will be also used when per-vlan options
are introduced and vlans' options match, they will be put into a single
range which is encapsulated in one netlink attribute. We need to run
similar checks as br_process_vlan_info() does because these ranges will
be used for options setting and they'll be able to skip
br_process_vlan_info().
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Adding RTM_DELVLAN support similar to RTM_NEWVLAN is simple, just need to
map DELVLAN to DELLINK and register the handler.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add initial RTM_NEWVLAN support which can only create vlans, operating
similar to the current br_afspec(). We will use it later to also change
per-vlan options. Old-style (flag-based) vlan ranges are not allowed
when using RTM messages, we will introduce vlan ranges later via a new
nested attribute which would allow us to have all the information about a
range encapsulated into a single nl attribute.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds vlan rtm definitions:
- NEWVLAN: to be used for creating vlans, setting options and
notifications
- DELVLAN: to be used for deleting vlans
- GETVLAN: used for dumping vlan information
Dumping vlans which can span multiple messages is added now with basic
information (vid and flags). We use nlmsg_parse() to validate the header
length in order to be able to extend the message with filtering
attributes later.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add extack messages on vlan processing errors. We need to move the flags
missing check after the "last" check since we may have "last" set but
lack a range end flag in the next entry.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add helpers to check if a vlan id or range are valid. The range helper
must be called when range start or end are detected.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch fixes a regression on setting up asynchronous commands to use
external trigger sources when board-specific routing information is
missing.
`ni_find_device_routes()` (called via `ni_assign_device_routes()`) finds
the table of register values for the device family and the set of valid
routes for the specific board. If both are found,
`tables->route_values` is set to point to the table of register values
for the device family and `tables->valid_routes` is set to point to the
list of valid routes for the specific board. If either is not found,
both `tables->route_values` and `tables->valid_routes` are left set at
their initial null values (initialized by `ni_assign_device_routes()`)
and the function returns `-ENODATA`.
Returning an error results in some routing functionality being disabled.
Unfortunately, leaving `table->route_values` set to `NULL` also breaks
the setting up of asynchronous commands that are configured to use
external trigger sources. Calls to `ni_check_trigger_arg()` or
`ni_check_trigger_arg_roffs()` while checking the asynchronous command
set-up would result in a null pointer dereference if
`table->route_values` is `NULL`. The null pointer dereference is fixed
in another patch, but it now results in failure to set up the
asynchronous command. That is a regression from the behavior prior to
commit 347e244884c3 ("staging: comedi: tio: implement global tio/ctr
routing") and commit 56d0b826d39f ("staging: comedi: ni_mio_common:
implement new routing for TRIG_EXT").
Change `ni_find_device_routes()` to set `tables->route_values` and/or
`tables->valid_routes` to valid information even if the other one can
only be set to `NULL` due to missing information. The function will
still return an error in that case. This should result in
`tables->valid_routes` being valid for all currently supported device
families even if the board-specific routing information is missing.
That should be enough to fix the regression on setting up asynchronous
commands to use external triggers for boards with missing routing
information.
Fixes: 347e244884c3 ("staging: comedi: tio: implement global tio/ctr routing")
Fixes: 56d0b826d39f ("staging: comedi: ni_mio_common: implement new routing for TRIG_EXT").
Cc: <[email protected]> # 4.20+
Cc: Spencer E. Olson <[email protected]>
Signed-off-by: Ian Abbott <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
In `ni_find_route_source()`, `tables->route_values` gets dereferenced.
However it is possible that `tables->route_values` is `NULL`, leading to
a null pointer dereference. `tables->route_values` will be `NULL` if
the call to `ni_assign_device_routes()` during board initialization
returned an error due to missing device family routing information or
missing board-specific routing information. For example, there is
currently no board-specific routing information provided for the
PCIe-6251 board and several other boards, so those are affected by this
bug.
The bug is triggered when `ni_find_route_source()` is called via
`ni_check_trigger_arg()` or `ni_check_trigger_arg_roffs()` when checking
the arguments for setting up asynchronous commands. Fix it by returning
`-EINVAL` if `tables->route_values` is `NULL`.
Even with this fix, setting up asynchronous commands to use external
trigger sources for boards with missing routing information will still
fail gracefully. Since `ni_find_route_source()` only depends on the
device family routing information, it would be better if that was made
available even if the board-specific routing information is missing.
That will be addressed by another patch.
Fixes: 4bb90c87abbe ("staging: comedi: add interface to ni routing table information")
Cc: <[email protected]> # 4.20+
Cc: Spencer E. Olson <[email protected]>
Signed-off-by: Ian Abbott <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Ido Schimmel says:
====================
mlxsw: Various fixes
This patch set contains various fixes for mlxsw.
Patch #1 splits the init() callback between Spectrum-2 and Spectrum-3 in
order to avoid enforcing the same firmware version for both ASICs, as
this can't possibly work. Without this patch the driver cannot boot with
the Spectrum-3 ASIC.
Patches #2-#3 fix a long standing race condition that was recently
exposed while testing the driver on an emulator, which is very slow
compared to the actual hardware. The problem is explained in detail in
the commit messages.
Patch #4 fixes a selftest.
Patch #5 prevents offloaded qdiscs from presenting a non-zero backlog to
the user when the netdev is down. This is done by clearing the cached
backlog in the driver when the netdev goes down.
Patch #6 fixes qdisc statistics (backlog and tail drops) to also take
into account the multicast traffic classes.
v2:
* Patches #2-#3: use skb_cow_head() instead of skb_unshare() as
suggested by Jakub. Remove unnecessary check regarding headroom
* Patches #5-#6: new
====================
Signed-off-by: David S. Miller <[email protected]>
|