Age | Commit message (Collapse) | Author | Files | Lines |
|
As commit fc191af1bb0d ("net: stmmac: platform: Fix misleading
interrupt error msg") noted, not every stmmac based platform
makes use of the 'eth_wake_irq' or 'eth_lpi' interrupts.
So, update the 'interrupt-names' inside 'snps,dwmac' YAML
bindings to reflect the same.
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Bhupesh Sharma <[email protected]>
Signed-off-by: Andrew Halaney <[email protected]>
Tested-by: Brian Masney <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
I have observed an issue where the RX direction of the LS1028A ENETC pMAC
seems unresponsive. The minimal procedure to reproduce the issue is:
1. Connect ENETC port 0 with a loopback RJ45 cable to one of the Felix
switch ports (0).
2. Bring the ports up (MAC Merge layer is not enabled on either end).
3. Send a large quantity of unidirectional (express) traffic from Felix
to ENETC. I tried altering frame size and frame count, and it doesn't
appear to be specific to either of them, but rather, to the quantity
of octets received. Lowering the frame count, the minimum quantity of
packets to reproduce relatively consistently seems to be around 37000
frames at 1514 octets (w/o FCS) each.
4. Using ethtool --set-mm, enable the pMAC in the Felix and in the ENETC
ports, in both RX and TX directions, and with verification on both
ends.
5. Wait for verification to complete on both sides.
6. Configure a traffic class as preemptible on both ends.
7. Send some packets again.
The issue is at step 5, where the verification process of ENETC ends
(meaning that Felix responds with an SMD-R and ENETC sees the response),
but the verification process of Felix never ends (it remains VERIFYING).
If step 3 is skipped or if ENETC receives less traffic than
approximately that threshold, the test runs all the way through
(verification succeeds on both ends, preemptible traffic passes fine).
If, between step 4 and 5, the step below is also introduced:
4.1. Disable and re-enable PM0_COMMAND_CONFIG bit RX_EN
then again, the sequence of steps runs all the way through, and
verification succeeds, even if there was the previous RX traffic
injected into ENETC.
Traffic sent *by* the ENETC port prior to enabling the MAC Merge layer
does not seem to influence the verification result, only received
traffic does.
The LS1028A manual does not mention any relationship between
PM0_COMMAND_CONFIG and MMCSR, and the hardware people don't seem to
know for now either.
The bit that is toggled to work around the issue is also toggled
by enetc_mac_enable(), called from phylink's mac_link_down() and
mac_link_up() methods - which is how the workaround was found:
verification would work after a link down/up.
Fixes: c7b9e8086902 ("net: enetc: add support for MAC Merge layer")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Remove an inability of bnxt_en driver to set eswitch to switchdev
mode without existing VFs by:
1. Allow to set switchdev mode in bnxt_dl_eswitch_mode_set() so
representors are created only when num_vfs > 0 otherwise just
set bp->eswitch_mode
2. Do not automatically change bp->eswitch_mode during
bnxt_vf_reps_create() and bnxt_vf_reps_destroy() calls so
the eswitch mode is managed only by an user by devlink.
Just set temporarily bp->eswitch_mode to legacy to avoid
re-opening of representors during destroy.
3. Create representors in bnxt_sriov_enable() if current eswitch
mode is switchdev one
Tested by this sequence:
1. Set PF interface up
2. Set PF's eswitch mode to switchdev
3. Created N VFs
4. Checked that N representors were created
5. Set eswitch mode to legacy
6. Checked that representors were deleted
7. Set eswitch mode back to switchdev
8. Checked that representors exist again for VFs
9. Deleted all VFs
10. Checked that all representors were deleted as well
11. Checked that current eswitch mode is still switchdev
Signed-off-by: Ivan Vecera <[email protected]>
Acked-by: Venkat Duvvuru <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Currently, when traversing ifwdtsn skips with _sctp_walk_ifwdtsn, it only
checks the pos against the end of the chunk. However, the data left for
the last pos may be < sizeof(struct sctp_ifwdtsn_skip), and dereference
it as struct sctp_ifwdtsn_skip may cause coverflow.
This patch fixes it by checking the pos against "the end of the chunk -
sizeof(struct sctp_ifwdtsn_skip)" in sctp_ifwdtsn_skip, similar to
sctp_fwdtsn_skip.
Fixes: 0fc2ea922c8a ("sctp: implement validate_ftsn for sctp_stream_interleave")
Signed-off-by: Xin Long <[email protected]>
Link: https://lore.kernel.org/r/2a71bffcd80b4f2c61fac6d344bb2f11c8fd74f7.1681155810.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.3-2023-04-12:
amdgpu:
- SMU13 fixes
- DP MST fix
Signed-off-by: Daniel Vetter <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Syzbot reported a bug as following:
=====================================================
BUG: KMSAN: uninit-value in qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230
qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230
qrtr_endpoint_post+0xf85/0x11b0 net/qrtr/af_qrtr.c:519
qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108
call_write_iter include/linux/fs.h:2189 [inline]
aio_write+0x63a/0x950 fs/aio.c:1600
io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019
__do_sys_io_submit fs/aio.c:2078 [inline]
__se_sys_io_submit+0x293/0x770 fs/aio.c:2048
__x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was created at:
slab_post_alloc_hook mm/slab.h:766 [inline]
slab_alloc_node mm/slub.c:3452 [inline]
__kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491
__do_kmalloc_node mm/slab_common.c:967 [inline]
__kmalloc_node_track_caller+0x114/0x3b0 mm/slab_common.c:988
kmalloc_reserve net/core/skbuff.c:492 [inline]
__alloc_skb+0x3af/0x8f0 net/core/skbuff.c:565
__netdev_alloc_skb+0x120/0x7d0 net/core/skbuff.c:630
qrtr_endpoint_post+0xbd/0x11b0 net/qrtr/af_qrtr.c:446
qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108
call_write_iter include/linux/fs.h:2189 [inline]
aio_write+0x63a/0x950 fs/aio.c:1600
io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019
__do_sys_io_submit fs/aio.c:2078 [inline]
__se_sys_io_submit+0x293/0x770 fs/aio.c:2048
__x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
It is because that skb->len requires at least sizeof(struct qrtr_ctrl_pkt)
in qrtr_tx_resume(). And skb->len equals to size in qrtr_endpoint_post().
But size is less than sizeof(struct qrtr_ctrl_pkt) when qrtr_cb->type
equals to QRTR_TYPE_RESUME_TX in qrtr_endpoint_post() under the syzbot
scenario. This triggers the uninit variable access bug.
Add size check when qrtr_cb->type equals to QRTR_TYPE_RESUME_TX in
qrtr_endpoint_post() to fix the bug.
Fixes: 5fdeb0d372ab ("net: qrtr: Implement outgoing flow control")
Reported-by: [email protected]
Link: https://syzkaller.appspot.com/bug?id=c14607f0963d27d5a3d5f4c8639b500909e43540
Suggested-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Ziyang Xuan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
These Lenovo laptops use Realtek HDA codec combined with
2xCS35L41 Amplifiers using I2C with External Boost.
Signed-off-by: Stefan Binding <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Mika Westerberg says:
====================
net: thunderbolt: Fix for sparse warnings and typos
This series tries to fix the rest of the sparse warnings generated
against the driver. While there fix the two typos in comments as well.
The previous version of the series can be found here:
https://lore.kernel.org/netdev/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix two typos in comments:
blongs -> belongs
UPD -> UDP
No functional changes.
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fixes the following warning when the driver is built with sparse checks
enabled:
main.c:993:23: warning: incorrect type in initializer (different base types)
main.c:993:23: expected restricted __wsum [usertype] wsum
main.c:993:23: got restricted __be32 [usertype]
No functional changes intended.
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fixes the following warnings when the driver is built with sparse
checks enabled:
main.c:767:47: warning: restricted __le32 degrades to integer
main.c:775:47: warning: restricted __le16 degrades to integer
main.c:776:44: warning: restricted __le16 degrades to integer
main.c:876:40: warning: incorrect type in assignment (different base types)
main.c:876:40: expected restricted __le32 [usertype] frame_size
main.c:876:40: got unsigned int [assigned] [usertype] frame_size
main.c:877:41: warning: incorrect type in assignment (different base types)
main.c:877:41: expected restricted __le32 [usertype] frame_count
main.c:877:41: got unsigned int [usertype]
main.c:878:41: warning: incorrect type in assignment (different base types)
main.c:878:41: expected restricted __le16 [usertype] frame_index
main.c:878:41: got unsigned short [usertype]
main.c:879:38: warning: incorrect type in assignment (different base types)
main.c:879:38: expected restricted __le16 [usertype] frame_id
main.c:879:38: got unsigned short [usertype]
main.c:880:62: warning: restricted __le32 degrades to integer
main.c:880:35: warning: restricted __le16 degrades to integer
No functional changes intended.
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The commits referenced below allows userspace to use the NLM_F_ECHO flag
for RTM_NEW/DELLINK operations to receive unicast notifications for the
affected link. Prior to these changes, applications may have relied on
multicast notifications to learn the same information without specifying
the NLM_F_ECHO flag.
For such applications, the mentioned commits changed the behavior for
requests not using NLM_F_ECHO. Multicast notifications are still received,
but now use the portid of the requester and the sequence number of the
request instead of zero values used previously. For the application, this
message may be unexpected and likely handled as a response to the
NLM_F_ACKed request, especially if it uses the same socket to handle
requests and notifications.
To fix existing applications relying on the old notification behavior,
set the portid and sequence number in the notification only if the
request included the NLM_F_ECHO flag. This restores the old behavior
for applications not using it, but allows unicasted notifications for
others.
Fixes: f3a63cce1b4f ("rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link")
Fixes: d88e136cab37 ("rtnetlink: Honour NLM_F_ECHO flag in rtnl_newlink_create")
Signed-off-by: Martin Willi <[email protected]>
Acked-by: Guillaume Nault <[email protected]>
Acked-by: Hangbin Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
A number of MDIO drivers make use of devm_mdiobus_alloc_size(). This
is only available when CONFIG_MDIO_DEVRES is enabled. Add missing
depends or selects, depending on if there are circular dependencies or
not. This avoids linker errors, especially for randconfig builds.
Signed-off-by: Andrew Lunn <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There are several issues with copy_from_user_nofault():
- access_ok() is designed for user context only and for that reason
it has WARN_ON_IN_IRQ() which triggers when bpf, kprobe, eprobe
and perf on ppc are calling it from irq.
- it's missing nmi_uaccess_okay() which is a nop on all architectures
except x86 where it's required.
The comment in arch/x86/mm/tlb.c explains the details why it's necessary.
Calling copy_from_user_nofault() from bpf, [ke]probe without this check is not safe.
- __copy_from_user_inatomic() under CONFIG_HARDENED_USERCOPY is calling
check_object_size()->__check_object_size()->check_heap_object()->find_vmap_area()->spin_lock()
which is not safe to do from bpf, [ke]probe and perf due to potential deadlock.
Fix all three issues. At the end the copy_from_user_nofault() becomes
equivalent to copy_from_user_nmi() from safety point of view with
a difference in the return value.
Reported-by: Hsin-Wei Hung <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Florian Lehner <[email protected]>
Tested-by: Hsin-Wei Hung <[email protected]>
Tested-by: Florian Lehner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- kernel panic fix for intel-ish-hid driver (Tanu Malhotra)
- buffer overflow fix in hid-sensor-custom driver (Todd Brandt)
- two device specific quirks (Alessandro Manca, Philippe Troin)
* tag 'for-linus-2023041201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: intel-ish-hid: Fix kernel panic during warm reset
HID: hid-sensor-custom: Fix buffer overrun in device name
HID: topre: Add support for 87 keys Realforce R2
HID: add HP 13t-aw100 & 14t-ea100 digitizer battery quirks
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"A couple of fixes in apple driver, core and kernedoc fix for dmaengine
subsystem:
- apple admac driver fixes for current_tx, src_addr_widths and
global' interrupt flags handling
- xdma kerneldoc fix
- core fix for use of devm_add_action_or_reset"
* tag 'dmaengine-fix-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: apple-admac: Fix 'current_tx' not getting freed
dmaengine: apple-admac: Set src_addr_widths capability
dmaengine: apple-admac: Handle 'global' interrupt flags
dmaengine: xilinx: xdma: Fix some kernel-doc comments
dmaengine: Actually use devm_add_action_or_reset()
|
|
Christian Ehrig says:
====================
This patch set adds support for using FOU or GUE encapsulation with
an ipip device operating in collect-metadata mode and a set of kfuncs
for controlling encap parameters exposed to a BPF tc-hook.
BPF tc-hooks allow us to read tunnel metadata (like remote IP addresses)
in the ingress path of an externally controlled tunnel interface via
the bpf_skb_get_tunnel_{key,opt} bpf-helpers. Packets can then be
redirected to the same or a different externally controlled tunnel
interface by overwriting metadata via the bpf_skb_set_tunnel_{key,opt}
helpers and a call to bpf_redirect. This enables us to redirect packets
between tunnel interfaces - and potentially change the encapsulation
type - using only a single BPF program.
Today this approach works fine for a couple of tunnel combinations.
For example: redirecting packets between Geneve and GRE interfaces or
GRE and plain ipip interfaces. However, redirecting using FOU or GUE is
not supported today. The ip_tunnel module does not allow us to egress
packets using additional UDP encapsulation from an ipip device in
collect-metadata mode.
Patch 1 lifts this restriction by adding a struct ip_tunnel_encap to
the tunnel metadata. It can be filled by a new BPF kfunc introduced
in Patch 2 and evaluated by the ip_tunnel egress path. This will allow
us to use FOU and GUE encap with externally controlled ipip devices.
Patch 2 introduces two new BPF kfuncs: bpf_skb_{set,get}_fou_encap.
These helpers can be used to set and get UDP encap parameters from the
BPF tc-hook doing the packet redirect.
Patch 3 adds BPF tunnel selftests using the two kfuncs.
---
v3:
- Integrate selftest into test_progs (Alexei)
v2:
- Fixes for checkpatch.pl
- Fixes for kernel test robot
====================
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Add tests for FOU and GUE encapsulation via the bpf_skb_{set,get}_fou_encap
kfuncs, using ipip devices in collect-metadata mode.
These tests make sure that we can successfully set and obtain FOU and GUE
encap parameters using ingress / egress BPF tc-hooks.
Signed-off-by: Christian Ehrig <[email protected]>
Link: https://lore.kernel.org/r/040193566ddbdb0b53eb359f7ac7bbd316f338b5.1680874078.git.cehrig@cloudflare.com
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Add two new kfuncs that allow a BPF tc-hook, installed on an ipip
device in collect-metadata mode, to control FOU encap parameters on a
per-packet level. The set of kfuncs is registered with the fou module.
The bpf_skb_set_fou_encap kfunc is supposed to be used in tandem and after
a successful call to the bpf_skb_set_tunnel_key bpf-helper. UDP source and
destination ports can be controlled by passing a struct bpf_fou_encap. A
source port of zero will auto-assign a source port. enum bpf_fou_encap_type
is used to specify if the egress path should FOU or GUE encap the packet.
On the ingress path bpf_skb_get_fou_encap can be used to read UDP source
and destination ports from the receiver's point of view and allows for
packet multiplexing across different destination ports within a single
BPF program and ipip device.
Signed-off-by: Christian Ehrig <[email protected]>
Link: https://lore.kernel.org/r/e17c94a646b63e78ce0dbf3f04b2c33dc948a32d.1680874078.git.cehrig@cloudflare.com
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Today ipip devices in collect-metadata mode don't allow for sending FOU
or GUE encapsulated packets. This patch lifts the restriction by adding
a struct ip_tunnel_encap to the tunnel metadata.
On the egress path, the members of this struct can be set by the
bpf_skb_set_fou_encap kfunc via a BPF tc-hook. Instead of dropping packets
wishing to use additional UDP encapsulation, ip_md_tunnel_xmit now
evaluates the contents of this struct and adds the corresponding FOU or
GUE header. Furthermore, it is making sure that additional header bytes
are taken into account for PMTU discovery.
On the ingress path, an ipip device in collect-metadata mode will fill this
struct and a BPF tc-hook can obtain the information via a call to the
bpf_skb_get_fou_encap kfunc.
The minor change to ip_tunnel_encap, which now takes a pointer to
struct ip_tunnel_encap instead of struct ip_tunnel, allows us to control
FOU encap type and parameters on a per packet-level.
Signed-off-by: Christian Ehrig <[email protected]>
Link: https://lore.kernel.org/r/cfea47de655d0f870248abf725932f851b53960a.1680874078.git.cehrig@cloudflare.com
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
When huang uses sched_switch tracepoint, the tracepoint
does only one thing in the mounted ebpf program, which
deletes the fixed elements in sockhash ([0])
It seems that elements in sockhash are rarely actively
deleted by users or ebpf program. Therefore, we do not
pay much attention to their deletion. Compared with hash
maps, sockhash only provides spin_lock_bh protection.
This causes it to appear to have self-locking behavior
in the interrupt context.
[0]:https://lore.kernel.org/all/CABcoxUayum5oOqFMMqAeWuS8+EzojquSOSyDA3J_2omY=2EeAg@mail.gmail.com/
Reported-by: Hsin-Wei Hung <[email protected]>
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Xin Liu <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Update the driver implementations to fit those data exposed
by PMFW.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.1.x
|
|
Correct the max shader clock reporting on SMU
13.0.7.
Signed-off-by: Horatio Zhang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.1.x
|
|
Correct the pstate standard/peak profiling mode clock
settings for SMU13.0.7.
Signed-off-by: Horatio Zhang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.1.x
|
|
[Why & How]
drm_dp_remove_payload() interface was changed. Correct amdgpu dm code
to pass the right parameter to the drm helper function.
Reviewed-by: Jerry Zuo <[email protected]>
Acked-by: Qingqing Zhuo <[email protected]>
Signed-off-by: Wayne Lin <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The bpf_cgroup_kptr_get() kfunc has been removed, and
bpf_cgroup_acquire() / bpf_cgroup_release() now have the same semantics
as bpf_task_acquire() / bpf_task_release(). This patch updates the BPF
documentation to reflect this.
Signed-off-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Now that bpf_cgroup_acquire() is KF_RCU | KF_RET_NULL,
bpf_cgroup_kptr_get() is redundant. Let's remove it, and update
selftests to instead use bpf_cgroup_acquire() where appropriate. The
next patch will update the BPF documentation to not mention
bpf_cgroup_kptr_get().
Signed-off-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
struct cgroup is already an RCU-safe type in the verifier. We can
therefore update bpf_cgroup_acquire() to be KF_RCU | KF_RET_NULL, and
subsequently remove bpf_cgroup_kptr_get(). This patch does the first of
these by updating bpf_cgroup_acquire() to be KF_RCU | KF_RET_NULL, and
also updates selftests accordingly.
Signed-off-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
It is found that attaching a task to the top_cpuset does not currently
ignore CPUs allocated to subpartitions in cpuset_attach_task(). So the
code is changed to fix that.
Signed-off-by: Waiman Long <[email protected]>
Reviewed-by: Michal Koutný <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
In the case of CLONE_INTO_CGROUP, not all cpusets are ready to accept
new tasks. It is too late to check that in cpuset_fork(). So we need
to add the cpuset_can_fork() and cpuset_cancel_fork() methods to
pre-check it before we can allow attachment to a different cpuset.
We also need to set the attach_in_progress flag to alert other code
that a new task is going to be added to the cpuset.
Fixes: ef2c41cf38a7 ("clone3: allow spawning processes into cgroups")
Suggested-by: Michal Koutný <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Cc: [email protected] # v5.7+
Signed-off-by: Tejun Heo <[email protected]>
|
|
By default, the clone(2) syscall spawn a child process into the same
cgroup as its parent. With the use of the CLONE_INTO_CGROUP flag
introduced by commit ef2c41cf38a7 ("clone3: allow spawning processes
into cgroups"), the child will be spawned into a different cgroup which
is somewhat similar to writing the child's tid into "cgroup.threads".
The current cpuset_fork() method does not properly handle the
CLONE_INTO_CGROUP case where the cpuset of the child may be different
from that of its parent. Update the cpuset_fork() method to treat the
CLONE_INTO_CGROUP case similar to cpuset_attach().
Since the newly cloned task has not been running yet, its actual
memory usage isn't known. So it is not necessary to make change to mm
in cpuset_fork().
Fixes: ef2c41cf38a7 ("clone3: allow spawning processes into cgroups")
Reported-by: Giuseppe Scrivano <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Cc: [email protected] # v5.7+
Signed-off-by: Tejun Heo <[email protected]>
|
|
After a successful cpuset_can_attach() call which increments the
attach_in_progress flag, either cpuset_cancel_attach() or cpuset_attach()
will be called later. In cpuset_attach(), tasks in cpuset_attach_wq,
if present, will be woken up at the end. That is not the case in
cpuset_cancel_attach(). So missed wakeup is possible if the attach
operation is somehow cancelled. Fix that by doing the wakeup in
cpuset_cancel_attach() as well.
Fixes: e44193d39e8d ("cpuset: let hotplug propagation work wait for task attaching")
Signed-off-by: Waiman Long <[email protected]>
Reviewed-by: Michal Koutný <[email protected]>
Cc: [email protected] # v3.11+
Signed-off-by: Tejun Heo <[email protected]>
|
|
syzbot is reporting circular locking dependency between cpu_hotplug_lock
and freezer_mutex, for commit f5d39b020809 ("freezer,sched: Rewrite core
freezer logic") replaced atomic_inc() in freezer_apply_state() with
static_branch_inc() which holds cpu_hotplug_lock.
cpu_hotplug_lock => cgroup_threadgroup_rwsem => freezer_mutex
cgroup_file_write() {
cgroup_procs_write() {
__cgroup_procs_write() {
cgroup_procs_write_start() {
cgroup_attach_lock() {
cpus_read_lock() {
percpu_down_read(&cpu_hotplug_lock);
}
percpu_down_write(&cgroup_threadgroup_rwsem);
}
}
cgroup_attach_task() {
cgroup_migrate() {
cgroup_migrate_execute() {
freezer_attach() {
mutex_lock(&freezer_mutex);
(...snipped...)
}
}
}
}
(...snipped...)
}
}
}
freezer_mutex => cpu_hotplug_lock
cgroup_file_write() {
freezer_write() {
freezer_change_state() {
mutex_lock(&freezer_mutex);
freezer_apply_state() {
static_branch_inc(&freezer_active) {
static_key_slow_inc() {
cpus_read_lock();
static_key_slow_inc_cpuslocked();
cpus_read_unlock();
}
}
}
mutex_unlock(&freezer_mutex);
}
}
}
Swap locking order by moving cpus_read_lock() in freezer_apply_state()
to before mutex_lock(&freezer_mutex) in freezer_change_state().
Reported-by: syzbot <[email protected]>
Link: https://syzkaller.appspot.com/bug?extid=c39682e86c9d84152f93
Suggested-by: Hillf Danton <[email protected]>
Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
Signed-off-by: Tetsuo Handa <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
During OOM bpf_local_storage_alloc() may fail to allocate 'storage' and
call to bpf_local_storage_free() with NULL pointer will cause a crash like:
[ 271718.917646] BUG: kernel NULL pointer dereference, address: 00000000000000a0
[ 271719.019620] RIP: 0010:call_rcu+0x2d/0x240
[ 271719.216274] bpf_local_storage_alloc+0x19e/0x1e0
[ 271719.250121] bpf_local_storage_update+0x33b/0x740
Fixes: 7e30a8477b0b ("bpf: Add bpf_local_storage_free()")
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Fix netfs_extract_iter_to_sg() for ITER_UBUF and ITER_IOVEC to set the
size of the page to the part of the page extracted, not the remaining
amount of data in the extracted page array at that point.
This doesn't yet affect anything as cifs, the only current user, only
passes in non-user-backed iterators.
Fixes: 018584697533 ("netfs: Add a function to extract an iterator into a scatterlist")
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Cc: Steve French <[email protected]>
Cc: Shyam Prasad N <[email protected]>
Cc: Rohith Surabattula <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When local group is fully busy but its average load is above system load,
computing the imbalance will overflow and local group is not the best
target for pulling this load.
Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()")
Reported-by: Tingjia Cao <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Tingjia Cao <[email protected]>
Link: https://lore.kernel.org/lkml/CABcWv9_DAhVBOq2=W=2ypKE9dKM5s2DvoV8-U0+GDwwuKZ89jQ@mail.gmail.com/T/
|
|
We were stuck on rc2, should at least attempt to track drm-fixes
slightly.
Signed-off-by: Maarten Lankhorst <[email protected]>
|
|
TI CPSW uses of_platform_* functions which are declared in of_platform.h.
of_platform.h gets implicitly included by of_device.h, but that is going
to be removed soon. Nothing else depends on of_device.h so it can be
dropped. of_platform.h also implicitly includes platform_device.h, so
add an explicit include for it, too.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Smatch reports:
drivers/net/wwan/iosm/iosm_ipc_pcie.c:298 ipc_pcie_probe()
warn: missing unwind goto?
When dma_set_mask fails it directly returns without disabling pci
device and freeing ipc_pcie. Fix this my calling a correct goto label
As dma_set_mask returns either 0 or -EIO, we can use a goto label, as
it finally returns -EIO.
Add a set_mask_fail goto label which stands consistent with other goto
labels in this function..
Fixes: 035e3befc191 ("net: wwan: iosm: fix driver not working with INTEL_IOMMU disabled")
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Harshit Mogalapalli <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
With Eric's ref tracker, syzbot finally found a repro for
use-after-free in tcp_write_timer_handler() by kernel TCP
sockets. [0]
If SMC creates a kernel socket in __smc_create(), the kernel
socket is supposed to be freed in smc_clcsock_release() by
calling sock_release() when we close() the parent SMC socket.
However, at the end of smc_clcsock_release(), the kernel
socket's sk_state might not be TCP_CLOSE. This means that
we have not called inet_csk_destroy_sock() in __tcp_close()
and have not stopped the TCP timers.
The kernel socket's TCP timers can be fired later, so we
need to hold a refcnt for net as we do for MPTCP subflows
in mptcp_subflow_create_socket().
[0]:
leaked reference.
sk_alloc (./include/net/net_namespace.h:335 net/core/sock.c:2108)
inet_create (net/ipv4/af_inet.c:319 net/ipv4/af_inet.c:244)
__sock_create (net/socket.c:1546)
smc_create (net/smc/af_smc.c:3269 net/smc/af_smc.c:3284)
__sock_create (net/socket.c:1546)
__sys_socket (net/socket.c:1634 net/socket.c:1618 net/socket.c:1661)
__x64_sys_socket (net/socket.c:1672)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
==================================================================
BUG: KASAN: slab-use-after-free in tcp_write_timer_handler (net/ipv4/tcp_timer.c:378 net/ipv4/tcp_timer.c:624 net/ipv4/tcp_timer.c:594)
Read of size 1 at addr ffff888052b65e0d by task syzrepro/18091
CPU: 0 PID: 18091 Comm: syzrepro Tainted: G W 6.3.0-rc4-01174-gb5d54eb5899a #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.amzn2022.0.1 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl (lib/dump_stack.c:107)
print_report (mm/kasan/report.c:320 mm/kasan/report.c:430)
kasan_report (mm/kasan/report.c:538)
tcp_write_timer_handler (net/ipv4/tcp_timer.c:378 net/ipv4/tcp_timer.c:624 net/ipv4/tcp_timer.c:594)
tcp_write_timer (./include/linux/spinlock.h:390 net/ipv4/tcp_timer.c:643)
call_timer_fn (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127 kernel/time/timer.c:1701)
__run_timers.part.0 (kernel/time/timer.c:1752 kernel/time/timer.c:2022)
run_timer_softirq (kernel/time/timer.c:2037)
__do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572)
__irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650)
irq_exit_rcu (kernel/softirq.c:664)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1107 (discriminator 14))
</IRQ>
Fixes: ac7138746e14 ("smc: establish new socket family")
Reported-by: [email protected]
Link: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Tony Lu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Remove unused functions.
These functions may have some value in documenting the
hardware. But that information may be accessed via SCM history.
Flagged by clang-16 with W=1.
No functional change intended.
Compile tested only.
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The driver was incorrectly overwriting the cyclecounter bitmask,
which was truncating it and not aligning to the hardware mask value.
This isn't causing any issues, but it's wrong. Fix this by not
constraining the cyclecounter/hardware mask.
Luckily, this seems to cause no issues, which is why this change
doesn't have a fixes tag and isn't being sent to net. However, if
any transformations from time->cycles are needed in the future,
this change will be needed.
Suggested-by: Allen Hubbe <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Sebastian Reichel says:
====================
stmmac: Fix RK3588 error prints
This fixes a couple of false positive error messages printed
by stmmac on RK3588. I expect them to go via net-next since
the fixes are not critical.
Changes since PATCHv1:
* https://lore.kernel.org/all/[email protected]/
* Add Fixes tags
* Use loop to request clocks
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The usual devm_regulator_get() call already handles "optional"
regulators by returning a valid dummy and printing a warning
that the dummy regulator should be described properly. This
code open coded the same behaviour, but masked any errors that
are not -EPROBE_DEFER and is quite noisy.
This change effectively unmasks and propagates regulators errors
not involving -ENODEV, downgrades the error print to warning level
if no regulator is specified and captures the probe defer message
for /sys/kernel/debug/devices_deferred.
Fixes: 2e12f536635f ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The clock requesting code is quite repetitive. Fix this by requesting
the clocks via devm_clk_bulk_get_optional. The optional variant has been
used, since this is effectively what the old code did. The exact clocks
required depend on the platform and configuration. As a side effect
this change adds correct -EPROBE_DEFER handling.
Suggested-by: Jakub Kicinski <[email protected]>
Suggested-by: Andrew Lunn <[email protected]>
Fixes: 7ad269ea1a2b ("GMAC: add driver for Rockchip RK3288 SoCs integrated GMAC")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Vladimir Oltean says:
====================
DSA trace events
This series introduces the "dsa" trace event class, with the following
events:
$ trace-cmd list | grep dsa
dsa
dsa:dsa_fdb_add_hw
dsa:dsa_mdb_add_hw
dsa:dsa_fdb_del_hw
dsa:dsa_mdb_del_hw
dsa:dsa_fdb_add_bump
dsa:dsa_mdb_add_bump
dsa:dsa_fdb_del_drop
dsa:dsa_mdb_del_drop
dsa:dsa_fdb_del_not_found
dsa:dsa_mdb_del_not_found
dsa:dsa_lag_fdb_add_hw
dsa:dsa_lag_fdb_add_bump
dsa:dsa_lag_fdb_del_hw
dsa:dsa_lag_fdb_del_drop
dsa:dsa_lag_fdb_del_not_found
dsa:dsa_vlan_add_hw
dsa:dsa_vlan_del_hw
dsa:dsa_vlan_add_bump
dsa:dsa_vlan_del_drop
dsa:dsa_vlan_del_not_found
These are useful to debug refcounting issues on CPU and DSA ports, where
entries may remain lingering, or may be removed too soon, depending on
bugs in higher layers of the network stack.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
These are not as critical as the FDB/MDB trace points (I'm not aware of
outstanding VLAN related bugs), but maybe they are useful to somebody,
either debugging something or simply trying to learn more.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
DSA performs non-trivial housekeeping of unicast and multicast addresses
on shared (CPU and DSA) ports, and puts a bit of pressure on higher
layers, requiring them to behave correctly (remove these addresses
exactly as many times as they were added). Otherwise, either addresses
linger around forever, or DSA returns -ENOENT complaining that entries
that were already deleted must be deleted again.
To aid debugging, introduce some trace points specifically for FDB and
MDB - that's where some of the bugs still are right now.
Some bugs I have seen were also due to race conditions, see:
- 630fd4822af2 ("net: dsa: flush switchdev workqueue on bridge join error path")
- a2614140dc0f ("net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN")
so it would be good to not disturb the timing too much, hence the choice
to use trace points vs regular dev_dbg().
I've had these for some time on my computer in a less polished form, and
they've proven useful. What I found most useful was to enable
CONFIG_BOOTTIME_TRACING, add "trace_event=dsa" to the kernel cmdline,
and run "cat /sys/kernel/debug/tracing/trace". This is to debug more
complex environments with network managers started by the init system,
things like that.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Static code analyzer complains to unchecked return value.
The result of pci_reset_function() is unchecked.
Despite, the issue is on the FLR supported code path and in that
case reset can be done with pcie_flr(), the patch uses less invasive
approach by adding the result check of pci_reset_function().
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 7e2cf4feba05 ("qlcnic: change driver hardware interface mechanism")
Signed-off-by: Denis Plotnikov <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Bjorn Helgaas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
iavf: fix racing in VLANs
Ahmed Zaki says:
This patchset mainly fixes a racing issue in the iavf where the number of
VLANs in the vlan_filter_list might be more than the PF limit. To fix that,
we get rid of the cvlans and svlans bitmaps and keep all the required info
in the list.
The second patch adds two new states that are needed so that we keep the
VLAN info while the interface goes DOWN:
-- DISABLE (notify PF, but keep the filter in the list)
-- INACTIVE (dev is DOWN, filter is removed from PF)
Finally, the current code keeps each state in a separate bit field, which
is error prone. The first patch refactors that by replacing all bits with
a single enum. The changes are minimal where each bit change is replaced
with the new state value.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: remove active_cvlans and active_svlans bitmaps
iavf: refactor VLAN filter states
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|