aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2021-06-10irqdomain: Cache irq_data instead of a virq number in the revmapMarc Zyngier1-5/+11
Caching a virq number in the revmap is pretty inefficient, as it means we will need to convert it back to either an irq_data or irq_desc to do anything with it. It is also a bit odd, as the radix tree does cache irq_data pointers. Change the revmap type to be an irq_data pointer instead of an unsigned int, and preserve the current API for now. Signed-off-by: Marc Zyngier <[email protected]>
2021-06-10irqdomain: Use struct_size() helper when allocating irqdomainMarc Zyngier1-1/+1
Instead of open-coding the size computation of struct irqdomain, use the struct_size() helper instead. This is going to be handy as we change the type of the revmap array. Signed-off-by: Marc Zyngier <[email protected]>
2021-06-10irqdomain: Make normal and nomap irqdomains exclusiveMarc Zyngier1-10/+35
Direct mappings are completely exclusive of normal mappings, meaning that we can refactor the code slightly so that we can get rid of the revmap_direct_max_irq field and use the revmap_size field instead, reducing the size of the irqdomain structure. Signed-off-by: Marc Zyngier <[email protected]>
2021-06-10powerpc: Move the use of irq_domain_add_nomap() behind a config optionMarc Zyngier2-0/+7
Only a handful of old PPC systems are still using the old 'nomap' variant of the irqdomain library. Move the associated definitions behind a configuration option, which will allow us to make some more radical changes. Signed-off-by: Marc Zyngier <[email protected]>
2021-06-10irqdomain: Reimplement irq_linear_revmap() with irq_find_mapping()Marc Zyngier1-3/+3
irq_linear_revmap() is supposed to be a fast path for domain lookups, but it only exposes low-level details of the irqdomain implementation, details which are better kept private. The *overhead* between the two is only a function call and a couple of tests, so it is likely that noone can show any meaningful difference compared to the cost of taking an interrupt. Reimplement irq_linear_revmap() with irq_find_mapping() in order to preserve source code compatibility, and rename the internal field for a measure. Signed-off-by: Marc Zyngier <[email protected]>
2021-06-10irq_work: Make irq_work_queue() NMI-safe againPeter Zijlstra1-3/+0
Someone carelessly put NMI unsafe code in irq_work_queue(), breaking just about every single user. Also, someone has a terrible comment style. Fixes: e2b5bcf9f5ba ("irq_work: record irq_work_queue() call stack") Reported-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-08audit: Use list_move instead of list_del/list_addBaokun Li1-8/+4
Using list_move() instead of list_del() + list_add(). Reported-by: Hulk Robot <[email protected]> Signed-off-by: Baokun Li <[email protected]> Acked-by: Richard Guy Briggs <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2021-06-08audit: Rename enum audit_state constants to avoid AUDIT_DISABLED redefinitionSergey Nazarov2-21/+21
AUDIT_DISABLED defined in kernel/audit.h as element of enum audit_state and redefined in kernel/audit.c. This produces a warning when kernel builds with syscalls audit disabled and brokes kernel build if -Werror used. enum audit_state used in syscall audit code only. This patch changes enum audit_state constants prefix AUDIT to AUDIT_STATE to avoid AUDIT_DISABLED redefinition. Signed-off-by: Sergey Nazarov <[email protected]> Acked-by: Richard Guy Briggs <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2021-06-08rq-qos: fix missed wake-ups in rq_qos_throttle try twoJan Kara1-2/+7
Commit 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle") tried to fix a problem that a process could be sleeping in rq_qos_wait() without anyone to wake it up. However the fix is not complete and the following can still happen: CPU1 (waiter1) CPU2 (waiter2) CPU3 (waker) rq_qos_wait() rq_qos_wait() acquire_inflight_cb() -> fails acquire_inflight_cb() -> fails completes IOs, inflight decreased prepare_to_wait_exclusive() prepare_to_wait_exclusive() has_sleeper = !wq_has_single_sleeper() -> true as there are two sleepers has_sleeper = !wq_has_single_sleeper() -> true io_schedule() io_schedule() Deadlock as now there's nobody to wakeup the two waiters. The logic automatically blocking when there are already sleepers is really subtle and the only way to make it work reliably is that we check whether there are some waiters in the queue when adding ourselves there. That way, we are guaranteed that at least the first process to enter the wait queue will recheck the waiting condition before going to sleep and thus guarantee forward progress. Fixes: 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle") CC: [email protected] Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-06-08tracing: Correct the length check which causes memory corruptionLiangyan1-1/+1
We've suffered from severe kernel crashes due to memory corruption on our production environment, like, Call Trace: [1640542.554277] general protection fault: 0000 [#1] SMP PTI [1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G [1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190 [1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286 [1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX: 0000000006e931bf [1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI: ffff9a45ff004300 [1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09: 0000000000000000 [1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9a20608d [1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15: 696c662f65636976 [1640542.563128] FS: 00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000) knlGS:0000000000000000 [1640542.563937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4: 00000000003606e0 [1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [1640542.566742] Call Trace: [1640542.567009] anon_vma_clone+0x5d/0x170 [1640542.567417] __split_vma+0x91/0x1a0 [1640542.567777] do_munmap+0x2c6/0x320 [1640542.568128] vm_munmap+0x54/0x70 [1640542.569990] __x64_sys_munmap+0x22/0x30 [1640542.572005] do_syscall_64+0x5b/0x1b0 [1640542.573724] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1640542.575642] RIP: 0033:0x7f45d6e61e27 James Wang has reproduced it stably on the latest 4.19 LTS. After some debugging, we finally proved that it's due to ftrace buffer out-of-bound access using a debug tool as follows: [ 86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000 [ 86.780806] no_context+0xdf/0x3c0 [ 86.784327] __do_page_fault+0x252/0x470 [ 86.788367] do_page_fault+0x32/0x140 [ 86.792145] page_fault+0x1e/0x30 [ 86.795576] strncpy_from_unsafe+0x66/0xb0 [ 86.799789] fetch_memory_string+0x25/0x40 [ 86.804002] fetch_deref_string+0x51/0x60 [ 86.808134] kprobe_trace_func+0x32d/0x3a0 [ 86.812347] kprobe_dispatcher+0x45/0x50 [ 86.816385] kprobe_ftrace_handler+0x90/0xf0 [ 86.820779] ftrace_ops_assist_func+0xa1/0x140 [ 86.825340] 0xffffffffc00750bf [ 86.828603] do_sys_open+0x5/0x1f0 [ 86.832124] do_syscall_64+0x5b/0x1b0 [ 86.835900] entry_SYSCALL_64_after_hwframe+0x44/0xa9 commit b220c049d519 ("tracing: Check length before giving out the filter buffer") adds length check to protect trace data overflow introduced in 0fc1b09ff1ff, seems that this fix can't prevent overflow entirely, the length check should also take the sizeof entry->array[0] into account, since this array[0] is filled the length of trace data and occupy addtional space and risk overflow. Link: https://lkml.kernel.org/r/[email protected] Cc: [email protected] Cc: Ingo Molnar <[email protected]> Cc: Xunlei Pang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Fixes: b220c049d519 ("tracing: Check length before giving out the filter buffer") Reviewed-by: Xunlei Pang <[email protected]> Reviewed-by: yinbinbin <[email protected]> Reviewed-by: Wetp Zhang <[email protected]> Tested-by: James Wang <[email protected]> Signed-off-by: Liangyan <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-06-08ftrace: Do not blindly read the ip address in ftrace_bug()Steven Rostedt (VMware)1-1/+7
It was reported that a bug on arm64 caused a bad ip address to be used for updating into a nop in ftrace_init(), but the error path (rightfully) returned -EINVAL and not -EFAULT, as the bug caused more than one error to occur. But because -EINVAL was returned, the ftrace_bug() tried to report what was at the location of the ip address, and read it directly. This caused the machine to panic, as the ip was not pointing to a valid memory address. Instead, read the ip address with copy_from_kernel_nofault() to safely access the memory, and if it faults, report that the address faulted, otherwise report what was in that location. Link: https://lore.kernel.org/lkml/[email protected]/ Cc: [email protected] Fixes: 05736a427f7e1 ("ftrace: warn on failure to disable mcount callers") Reported-by: Mark-PK Tsai <[email protected]> Tested-by: Mark-PK Tsai <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-06-08cgroup: make per-cgroup pressure stall tracking configurableSuren Baghdasaryan2-13/+65
PSI accounts stalls for each cgroup separately and aggregates it at each level of the hierarchy. This causes additional overhead with psi_avgs_work being called for each cgroup in the hierarchy. psi_avgs_work has been highly optimized, however on systems with large number of cgroups the overhead becomes noticeable. Systems which use PSI only at the system level could avoid this overhead if PSI can be configured to skip per-cgroup stall accounting. Add "cgroup_disable=pressure" kernel command-line option to allow requesting system-wide only pressure stall accounting. When set, it keeps system-wide accounting under /proc/pressure/ but skips accounting for individual cgroups and does not expose PSI nodes in cgroup hierarchy. Signed-off-by: Suren Baghdasaryan <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Johannes Weiner <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2021-06-07Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2-22/+17
Bug fixes overlapping feature additions and refactoring, mostly. Signed-off-by: David S. Miller <[email protected]>
2021-06-07quota: Change quotactl_path() systcall to an fd-based oneJan Kara1-1/+1
Some users have pointed out that path-based syscalls are problematic in some environments and at least directory fd argument and possibly also resolve flags are desirable for such syscalls. Rather than reimplementing all details of pathname lookup and following where it may eventually evolve, let's go for full file descriptor based syscall similar to how ioctl(2) works since the beginning. Managing of quotas isn't performance sensitive so the extra overhead of open does not matter and we are able to consume O_PATH descriptors as well which makes open cheap anyway. Also for frequent operations (such as retrieving usage information for all users) we can reuse single fd and in fact get even better performance as well as avoiding races with possible remounts etc. Tested-by: Sascha Hauer <[email protected]> Acked-by: Christian Brauner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2021-06-04Merge tag 'net-5.13-rc5' of ↵Linus Torvalds2-22/+17
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from bpf, wireless, netfilter and wireguard trees. The bpf vs lockdown+audit fix is the most notable. Things haven't slowed down just yet, both in terms of regressions in current release and largish fixes for older code, but we usually see a slowdown only after -rc5. Current release - regressions: - virtio-net: fix page faults and crashes when XDP is enabled - mlx5e: fix HW timestamping with CQE compression, and make sure they are only allowed to coexist with capable devices - stmmac: - fix kernel panic due to NULL pointer dereference of mdio_bus_data - fix double clk unprepare when no PHY device is connected Current release - new code bugs: - mt76: a few fixes for the recent MT7921 devices and runtime power management Previous releases - regressions: - ice: - track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx - fix allowing VF to request more/less queues via virtchnl - correct supported and advertised autoneg by using PHY capabilities - allow all LLDP packets from PF to Tx - kbuild: quote OBJCOPY var to avoid a pahole call break the build Previous releases - always broken: - bpf, lockdown, audit: fix buggy SELinux lockdown permission checks - mt76: address the recent FragAttack vulnerabilities not covered by generic fixes - ipv6: fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions - Bluetooth: - fix the erroneous flush_work() order, to avoid double free - use correct lock to prevent UAF of hdev object - nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect - ieee802154: multiple fixes to error checking and return values - igb: fix XDP with PTP enabled - intel: add correct exception tracing for XDP - tls: fix use-after-free when TLS offload device goes down and back up - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service - netfilter: nft_ct: skip expectations for confirmed conntrack - mptcp: fix falling back to TCP in presence of out of order packets early in connection lifetime - wireguard: switch from O(n) to a O(1) algorithm for maintaining peers, fixing stalls and a large memory leak in the process Misc: - devlink: correct VIRTUAL port to not have phys_port attributes - Bluetooth: fix VIRTIO_ID_BT assigned number - net: return the correct errno code ENOBUF -> ENOMEM - wireguard: - peer: allocate in kmem_cache saving 25% on peer memory - do not use -O3" * tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) cxgb4: avoid link re-train during TC-MQPRIO configuration sch_htb: fix refcount leak in htb_parent_to_leaf_offload wireguard: allowedips: free empty intermediate nodes when removing single node wireguard: allowedips: allocate nodes in kmem_cache wireguard: allowedips: remove nodes in O(1) wireguard: allowedips: initialize list head in selftest wireguard: peer: allocate in kmem_cache wireguard: use synchronize_net rather than synchronize_rcu wireguard: do not use -O3 wireguard: selftests: make sure rp_filter is disabled on vethc wireguard: selftests: remove old conntrack kconfig value virtchnl: Add missing padding to virtchnl_proto_hdrs ice: Allow all LLDP packets from PF to Tx ice: report supported and advertised autoneg using PHY capabilities ice: handle the VF VSI rebuild failure ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared ice: Fix allowing VF to request more/less queues via virtchnl virtio-net: fix for skb_over_panic inside big mode ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions fib: Return the correct errno code ...
2021-06-04cgroup: Fix kernel-docYang Li2-2/+2
Fix function name in cgroup.c and rstat.c kernel-doc comment to remove these warnings found by clang_w1. kernel/cgroup/cgroup.c:2401: warning: expecting prototype for cgroup_taskset_migrate(). Prototype was for cgroup_migrate_execute() instead. kernel/cgroup/rstat.c:233: warning: expecting prototype for cgroup_rstat_flush_begin(). Prototype was for cgroup_rstat_flush_hold() instead. Reported-by: Abaci Robot <[email protected]> Fixes: 'commit e595cd706982 ("cgroup: track migration context in cgroup_mgctx")' Signed-off-by: Yang Li <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2021-06-04sched/debug: Remove obsolete init_schedstats()Eric Dumazet1-17/+2
Revert commit 4698f88c06b8 ("sched/debug: Fix 'schedstats=enable' cmdline option"). After commit 6041186a3258 ("init: initialize jump labels before command line option parsing") we can rely on jump label infra being ready for use when setup_schedstats() is called. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-03Merge branch 'sched/urgent' into sched/core, to pick up fixesIngo Molnar32-190/+407
Signed-off-by: Ingo Molnar <[email protected]>
2021-06-03sched/fair: Fix util_est UTIL_AVG_UNCHANGED handlingDietmar Eggemann3-13/+6
The util_est internal UTIL_AVG_UNCHANGED flag which is used to prevent unnecessary util_est updates uses the LSB of util_est.enqueued. It is exposed via _task_util_est() (and task_util_est()). Commit 92a801e5d5b7 ("sched/fair: Mask UTIL_AVG_UNCHANGED usages") mentions that the LSB is lost for util_est resolution but find_energy_efficient_cpu() checks if task_util_est() returns 0 to return prev_cpu early. _task_util_est() returns the max value of util_est.ewma and util_est.enqueued or'ed w/ UTIL_AVG_UNCHANGED. So task_util_est() returning the max of task_util() and _task_util_est() will never return 0 under the default SCHED_FEAT(UTIL_EST, true). To fix this use the MSB of util_est.enqueued instead and keep the flag util_est internal, i.e. don't export it via _task_util_est(). The maximal possible util_avg value for a task is 1024 so the MSB of 'unsigned int util_est.enqueued' isn't used to store a util value. As a caveat the code behind the util_est_se trace point has to filter UTIL_AVG_UNCHANGED to see the real util_est.enqueued value which should be easy to do. This also fixes an issue report by Xuewen Yan that util_est_update() only used UTIL_AVG_UNCHANGED for the subtrahend of the equation: last_enqueued_diff = ue.enqueued - (task_util() | UTIL_AVG_UNCHANGED) Fixes: b89997aa88f0b sched/pelt: Fix task util_est update filtering Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Xuewen Yan <[email protected]> Reviewed-by: Vincent Donnefort <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-06-03sched/pelt: Ensure that *_sum is always synced with *_avgVincent Guittot1-2/+8
Rounding in PELT calculation happening when entities are attached/detached of a cfs_rq can result into situations where util/runnable_avg is not null but util/runnable_sum is. This is normally not possible so we need to ensure that util/runnable_sum stays synced with util/runnable_avg. detach_entity_load_avg() is the last place where we don't sync util/runnable_sum with util/runnbale_avg when moving some sched_entities Signed-off-by: Vincent Guittot <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-02bpf, lockdown, audit: Fix buggy SELinux lockdown permission checksDaniel Borkmann2-22/+17
Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") added an implementation of the locked_down LSM hook to SELinux, with the aim to restrict which domains are allowed to perform operations that would breach lockdown. This is indirectly also getting audit subsystem involved to report events. The latter is problematic, as reported by Ondrej and Serhei, since it can bring down the whole system via audit: 1) The audit events that are triggered due to calls to security_locked_down() can OOM kill a machine, see below details [0]. 2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit() when trying to wake up kauditd, for example, when using trace_sched_switch() tracepoint, see details in [1]. Triggering this was not via some hypothetical corner case, but with existing tools like runqlat & runqslower from bcc, for example, which make use of this tracepoint. Rough call sequence goes like: rq_lock(rq) -> -------------------------+ trace_sched_switch() -> | bpf_prog_xyz() -> +-> deadlock selinux_lockdown() -> | audit_log_end() -> | wake_up_interruptible() -> | try_to_wake_up() -> | rq_lock(rq) --------------+ What's worse is that the intention of 59438b46471a to further restrict lockdown settings for specific applications in respect to the global lockdown policy is completely broken for BPF. The SELinux policy rule for the current lockdown check looks something like this: allow <who> <who> : lockdown { <reason> }; However, this doesn't match with the 'current' task where the security_locked_down() is executed, example: httpd does a syscall. There is a tracing program attached to the syscall which triggers a BPF program to run, which ends up doing a bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does the permission check against 'current', that is, httpd in this example. httpd has literally zero relation to this tracing program, and it would be nonsensical having to write an SELinux policy rule against httpd to let the tracing helper pass. The policy in this case needs to be against the entity that is installing the BPF program. For example, if bpftrace would generate a histogram of syscall counts by user space application: bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }' bpftrace would then go and generate a BPF program from this internally. One way of doing it [for the sake of the example] could be to call bpf_get_current_task() helper and then access current->comm via one of bpf_probe_read_kernel{,_str}() helpers. So the program itself has nothing to do with httpd or any other random app doing a syscall here. The BPF program _explicitly initiated_ the lockdown check. The allow/deny policy belongs in the context of bpftrace: meaning, you want to grant bpftrace access to use these helpers, but other tracers on the system like my_random_tracer _not_. Therefore fix all three issues at the same time by taking a completely different approach for the security_locked_down() hook, that is, move the check into the program verification phase where we actually retrieve the BPF func proto. This also reliably gets the task (current) that is trying to install the BPF tracing program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since we're moving this out of the BPF helper's fast-path which can be called several millions of times per second. The check is then also in line with other security_locked_down() hooks in the system where the enforcement is performed at open/load time, for example, open_kcore() for /proc/kcore access or module_sig_check() for module signatures just to pick few random ones. What's out of scope in the fix as well as in other security_locked_down() hook locations /outside/ of BPF subsystem is that if the lockdown policy changes on the fly there is no retrospective action. This requires a different discussion, potentially complex infrastructure, and it's also not clear whether this can be solved generically. Either way, it is out of scope for a suitable stable fix which this one is targeting. Note that the breakage is specifically on 59438b46471a where it started to rely on 'current' as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode"). [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says: I starting seeing this with F-34. When I run a container that is traced with BPF to record the syscalls it is doing, auditd is flooded with messages like: type=AVC msg=audit(1619784520.593:282387): avc: denied { confidentiality } for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM" scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=lockdown permissive=0 This seems to be leading to auditd running out of space in the backlog buffer and eventually OOMs the machine. [...] auditd running at 99% CPU presumably processing all the messages, eventually I get: Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64 Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000 Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1 Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 [...] [1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/, Serhei Makarov says: Upstream kernel 5.11.0-rc7 and later was found to deadlock during a bpf_probe_read_compat() call within a sched_switch tracepoint. The problem is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on ppc64le. Example stack trace: [...] [ 730.868702] stack backtrace: [ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1 [ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 [ 730.873278] Call Trace: [ 730.873770] dump_stack+0x7f/0xa1 [ 730.874433] check_noncircular+0xdf/0x100 [ 730.875232] __lock_acquire+0x1202/0x1e10 [ 730.876031] ? __lock_acquire+0xfc0/0x1e10 [ 730.876844] lock_acquire+0xc2/0x3a0 [ 730.877551] ? __wake_up_common_lock+0x52/0x90 [ 730.878434] ? lock_acquire+0xc2/0x3a0 [ 730.879186] ? lock_is_held_type+0xa7/0x120 [ 730.880044] ? skb_queue_tail+0x1b/0x50 [ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90 [ 730.881656] ? __wake_up_common_lock+0x52/0x90 [ 730.882532] __wake_up_common_lock+0x52/0x90 [ 730.883375] audit_log_end+0x5b/0x100 [ 730.884104] slow_avc_audit+0x69/0x90 [ 730.884836] avc_has_perm+0x8b/0xb0 [ 730.885532] selinux_lockdown+0xa5/0xd0 [ 730.886297] security_locked_down+0x20/0x40 [ 730.887133] bpf_probe_read_compat+0x66/0xd0 [ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820 [ 730.888917] trace_call_bpf+0xe9/0x240 [ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0 [ 730.890579] perf_trace_sched_switch+0x142/0x180 [ 730.891485] ? __schedule+0x6d8/0xb20 [ 730.892209] __schedule+0x6d8/0xb20 [ 730.892899] schedule+0x5b/0xc0 [ 730.893522] exit_to_user_mode_prepare+0x11d/0x240 [ 730.894457] syscall_exit_to_user_mode+0x27/0x70 [ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae [...] Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") Reported-by: Ondrej Mosnacek <[email protected]> Reported-by: Jakub Hrozek <[email protected]> Reported-by: Serhei Makarov <[email protected]> Reported-by: Jiri Olsa <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Paul Moore <[email protected]> Cc: James Morris <[email protected]> Cc: Jerome Marchand <[email protected]> Cc: Frank Eigler <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-06-01sched: Don't defer CPU pick to migration_cpu_stop()Valentin Schneider1-8/+12
Will reported that the 'XXX __migrate_task() can fail' in migration_cpu_stop() can happen, and it *is* sort of a big deal. Looking at it some more, one will note there is a glaring hole in the deferred CPU selection: (w/ CONFIG_CPUSET=n, so that the affinity mask passed via taskset doesn't get AND'd with cpu_online_mask) $ taskset -pc 0-2 $PID # offline CPUs 3-4 $ taskset -pc 3-5 $PID `\ $PID may stay on 0-2 due to the cpumask_any_distribute() picking an offline CPU and __migrate_task() refusing to do anything due to cpu_is_allowed(). set_cpus_allowed_ptr() goes to some length to pick a dest_cpu that matches the right constraints vs affinity and the online/active state of the CPUs. Reuse that instead of discarding it in the affine_move_task() case. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Reported-by: Will Deacon <[email protected]> Signed-off-by: Valentin Schneider <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-01sched/fair: Fix ascii art by relpacing tabsOdin Ugedal1-4/+4
When using something other than 8 spaces per tab, this ascii art makes not sense, and the reader might end up wondering what this advanced equation "is". Signed-off-by: Odin Ugedal <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-01sched,init: Fix DEBUG_PREEMPT vs early bootPeter Zijlstra1-0/+1
Extend 8fb12156b8db ("init: Pin init task to the boot CPU, initially") to cover the new PF_NO_SETAFFINITY requirement. While there, move wait_for_completion(&kthreadd_done) into kernel_init() to make it absolutely clear it is the very first thing done by the init thread. Fixes: 570a752b7a9b ("lib/smp_processor_id: Use is_percpu_thread() instead of nr_cpus_allowed") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Tested-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-01sched: Add CONFIG_SCHED_CORE help textPeter Zijlstra1-0/+14
Hugh noted that the SCHED_CORE Kconfig option could do with a help text. Requested-by: Hugh Dickins <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Acked-by: Hugh Dickins <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-01kprobes: Remove kprobe::fault_handlerPeter Zijlstra1-19/+0
The reason for kprobe::fault_handler(), as given by their comment: * We come here because instructions in the pre/post * handler caused the page_fault, this could happen * if handler tries to access user space by * copy_from_user(), get_user() etc. Let the * user-specified handler try to fix it first. Is just plain bad. Those other handlers are ran from non-preemptible context and had better use _nofault() functions. Also, there is no upstream usage of this. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-06-01uprobes: Update uprobe_write_opcode() kernel-doc commentQiujun Huang1-0/+1
commit 6d43743e9079 ("Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()") added the parameter @auprobe. Signed-off-by: Qiujun Huang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-01bpf, tnums: Provably sound, faster, and more precise algorithm for tnum_mulHarishankar Vishwanathan1-19/+22
This patch introduces a new algorithm for multiplication of tristate numbers (tnums) that is provably sound. It is faster and more precise when compared to the existing method. Like the existing method, this new algorithm follows the long multiplication algorithm. The idea is to generate partial products by multiplying each bit in the multiplier (tnum a) with the multiplicand (tnum b), and adding the partial products after appropriately bit-shifting them. The new algorithm, however, uses just a single loop over the bits of the multiplier (tnum a) and accumulates only the uncertain components of the multiplicand (tnum b) into a mask-only tnum. The following paper explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. A natural way to construct the tnum product is by performing a tnum addition on all the partial products. This algorithm presents another method of doing this: decompose each partial product into two tnums, consisting of the values and the masks separately. The mask-sum is accumulated within the loop in acc_m. The value-sum tnum is generated using a.value * b.value. The tnum constructed by tnum addition of the value-sum and the mask-sum contains all possible summations of concrete values drawn from the partial product tnums pairwise. We prove this result in the paper. Our evaluations show that the new algorithm is overall more precise (producing tnums with less uncertain components) than the existing method. As an illustrative example, consider the input tnums A and B. The numbers in the parenthesis correspond to (value;mask). A = 000000x1 (1;2) B = 0010011x (38;1) A * B (existing) = xxxxxxxx (0;255) A * B (new) = 0x1xxxxx (32;95) Importantly, we present a proof of soundness of the new algorithm in the aforementioned paper. Additionally, we show that this new algorithm is empirically faster than the existing method. Co-developed-by: Matan Shachnai <[email protected]> Co-developed-by: Srinivas Narayana <[email protected]> Co-developed-by: Santosh Nagarakatte <[email protected]> Signed-off-by: Matan Shachnai <[email protected]> Signed-off-by: Srinivas Narayana <[email protected]> Signed-off-by: Santosh Nagarakatte <[email protected]> Signed-off-by: Harishankar Vishwanathan <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Edward Cree <[email protected]> Link: https://arxiv.org/abs/2105.05398 Link: https://lore.kernel.org/bpf/[email protected]
2021-06-01kgdb: Fix fall-through warning for ClangGustavo A. R. Silva1-0/+1
In preparation to enable -Wimplicit-fallthrough for Clang, fix a fall-through warning by explicitly adding a goto statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <[email protected]> Link: https://lore.kernel.org/r/20210528200222.GA39201@embeddedor Signed-off-by: Daniel Thompson <[email protected]>
2021-06-01kgdb: Fix spelling mistakesZhen Lei3-6/+6
Fix some spelling mistakes in comments: initalization ==> initialization detatch ==> detach represntation ==> representation hexidecimal ==> hexadecimal delimeter ==> delimiter architecure ==> architecture Signed-off-by: Zhen Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Daniel Thompson <[email protected]>
2021-05-31timer_list: Print name of per-cpu wakeup deviceWill Deacon3-1/+17
With the introduction of per-cpu wakeup devices that can be used in preference to the broadcast timer, print the name of such devices when they are available. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-31tick/broadcast: Program wakeup timer when entering idle if requiredWill Deacon1-1/+43
When configuring the broadcast timer on entry to and exit from deep idle states, prefer a per-CPU wakeup timer if one exists. On entry to idle, stop the tick device and transfer the next event into the oneshot wakeup device, which will serve as the wakeup from idle. To avoid the overhead of additional hardware accesses on exit from idle, leave the timer armed and treat the inevitable interrupt as a (possibly spurious) tick event. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-31tick/broadcast: Prefer per-cpu oneshot wakeup timers to broadcastWill Deacon3-4/+61
Some SoCs have two per-cpu timer implementations where the timer with the higher rating stops in deep idle (i.e. suffers from CLOCK_EVT_FEAT_C3STOP) but is otherwise preferable to the timer with the lower rating. In such a design, selecting the higher rated devices relies on a global broadcast timer and IPIs to wake up from deep idle states. To avoid the reliance on a global broadcast timer and also to reduce the overhead associated with the IPI wakeups, extend tick_install_broadcast_device() to manage per-cpu wakeup timers separately from the broadcast device. For now, these timers remain unused. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-31tick/broadcast: Split __tick_broadcast_oneshot_control() into a helperWill Deacon1-13/+20
In preparation for adding support for per-cpu wakeup timers, split _tick_broadcast_oneshot_control() into a helper function which deals only with the broadcast timer management across idle transitions. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-31tick/broadcast: Drop unneeded CONFIG_GENERIC_CLOCKEVENTS_BROADCAST guardWill Deacon1-2/+0
tick-broadcast.o is only built if CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y so remove the redundant #ifdef guards around the definition of tick_receive_broadcast(). Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-31clockevents: Use DEVICE_ATTR_[RO|WO] macrosYueHaibing1-8/+8
Use the DEVICE_ATTR_[RO|WO] helpers instead of plain DEVICE_ATTR, which makes the code a bit shorter and easier to read. Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-31perf: Fix data race between pin_count increment/decrementMarco Elver1-0/+2
KCSAN reports a data race between increment and decrement of pin_count: write to 0xffff888237c2d4e0 of 4 bytes by task 15740 on cpu 1: find_get_context kernel/events/core.c:4617 __do_sys_perf_event_open kernel/events/core.c:12097 [inline] __se_sys_perf_event_open kernel/events/core.c:11933 ... read to 0xffff888237c2d4e0 of 4 bytes by task 15743 on cpu 0: perf_unpin_context kernel/events/core.c:1525 [inline] __do_sys_perf_event_open kernel/events/core.c:12328 [inline] __se_sys_perf_event_open kernel/events/core.c:11933 ... Because neither read-modify-write here is atomic, this can lead to one of the operations being lost, resulting in an inconsistent pin_count. Fix it by adding the missing locking in the CPU-event case. Fixes: fe4b04fa31a6 ("perf: Cure task_oncpu_function_call() races") Reported-by: [email protected] Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-05-31tick/nohz: Only check for RCU deferred wakeup on user/guest entry when neededFrederic Weisbecker2-2/+4
Checking for and processing RCU-nocb deferred wakeup upon user/guest entry is only relevant when nohz_full runs on the local CPU, otherwise the periodic tick should take care of it. Make sure we don't needlessly pollute these fast-paths as a -3% performance regression on a will-it-scale.per_process_ops has been reported so far. Fixes: 47b8ff194c1f (entry: Explicitly flush pending rcuog wakeup before last rescheduling point) Fixes: 4ae7dc97f726 (entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point) Reported-by: kernel test robot <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-05-31sched/fair: Make sure to update tg contrib for blocked loadVincent Guittot1-1/+1
During the update of fair blocked load (__update_blocked_fair()), we update the contribution of the cfs in tg->load_avg if cfs_rq's pelt has decayed. Nevertheless, the pelt values of a cfs_rq could have been recently updated while propagating the change of a child. In this case, cfs_rq's pelt will not decayed because it has already been updated and we don't update tg->load_avg. __update_blocked_fair ... for_each_leaf_cfs_rq_safe: child cfs_rq update cfs_rq_load_avg() for child cfs_rq ... update_load_avg(cfs_rq_of(se), se, 0) ... update cfs_rq_load_avg() for parent cfs_rq -propagation of child's load makes parent cfs_rq->load_sum becoming null -UPDATE_TG is not set so it doesn't update parent cfs_rq->tg_load_avg_contrib .. for_each_leaf_cfs_rq_safe: parent cfs_rq update cfs_rq_load_avg() for parent cfs_rq - nothing to do because parent cfs_rq has already been updated recently so cfs_rq->tg_load_avg_contrib is not updated ... parent cfs_rq is decayed list_del_leaf_cfs_rq parent cfs_rq - but it still contibutes to tg->load_avg we must set UPDATE_TG flags when propagting pending load to the parent Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path") Reported-by: Odin Ugedal <[email protected]> Signed-off-by: Vincent Guittot <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Odin Ugedal <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-05-31sched/fair: Keep load_avg and load_sum syncedVincent Guittot1-6/+5
when removing a cfs_rq from the list we only check _sum value so we must ensure that _avg and _sum stay synced so load_sum can't be null whereas load_avg is not after propagating load in the cgroup hierarchy. Use load_avg to compute load_sum similarly to what is done for util_sum and runnable_sum. Fixes: 0e2d2aaaae52 ("sched/fair: Rewrite PELT migration propagation") Reported-by: Odin Ugedal <[email protected]> Signed-off-by: Vincent Guittot <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Odin Ugedal <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-05-29Merge tag 'seccomp-fixes-v5.13-rc4' of ↵Linus Torvalds1-14/+16
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fixes from Kees Cook: "This fixes a hard-to-hit race condition in the addfd user_notif feature of seccomp, visible since v5.9. And a small documentation fix" * tag 'seccomp-fixes-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Refactor notification handler to prepare for new semantics Documentation: seccomp: Fix user notification documentation
2021-05-29seccomp: Refactor notification handler to prepare for new semanticsSargun Dhillon1-14/+16
This refactors the user notification code to have a do / while loop around the completion condition. This has a small change in semantic, in that previously we ignored addfd calls upon wakeup if the notification had been responded to, but instead with the new change we check for an outstanding addfd calls prior to returning to userspace. Rodrigo Campos also identified a bug that can result in addfd causing an early return, when the supervisor didn't actually handle the syscall [1]. [1]: https://lore.kernel.org/lkml/[email protected]/ Fixes: 7cf97b125455 ("seccomp: Introduce addfd ioctl to seccomp user notifier") Signed-off-by: Sargun Dhillon <[email protected]> Acked-by: Tycho Andersen <[email protected]> Acked-by: Christian Brauner <[email protected]> Signed-off-by: Kees Cook <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
2021-05-28bpf, devmap: Remove drops variable from bq_xmit_all()Hangbin Liu1-5/+2
As Colin pointed out, the first drops assignment after declaration will be overwritten by the second drops assignment before using, which makes it useless. Since the drops variable will be used only once. Just remove it and use "cnt - sent" in trace_xdp_devmap_xmit(). Fixes: cb261b594b41 ("bpf: Run devmap xdp_prog on flush instead of bulk enqueue") Reported-by: Colin Ian King <[email protected]> Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-28cred: add missing return error code when set_cred_ucounts() failedYang Yingliang1-1/+2
If set_cred_ucounts() failed, we need return the error code. Fixes: 905ae01c4ae2 ("Add a reference to ucounts for each cred") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Yang Yingliang <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: Alexey Gladkov <[email protected]> Signed-off-by: Eric W. Biederman <[email protected]>
2021-05-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski35-239/+467
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build Signed-off-by: Jakub Kicinski <[email protected]>
2021-05-27perf/hw_breakpoint: Fix DocBook warnings in perf hw_breakpointHaocheng Xie1-0/+2
Fix the following W=1 kernel build warning(s): kernel/events/hw_breakpoint.c:461: warning: Function parameter or member 'context' not described in 'register_user_hw_breakpoint' kernel/events/hw_breakpoint.c:560: warning: Function parameter or member 'context' not described in 'register_wide_hw_breakpoint' Signed-off-by: Haocheng Xie <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-27perf/core: Fix DocBook warningsHaocheng Xie1-0/+4
Fix the following W=1 kernel build warning(s): kernel/events/core.c:143: warning: Function parameter or member 'cpu' not described in 'cpu_function_call' kernel/events/core.c:11924: warning: Function parameter or member 'flags' not described in 'sys_perf_event_open' kernel/events/core.c:12382: warning: Function parameter or member 'overflow_handler' not described in 'perf_event_create_kernel_counter' kernel/events/core.c:12382: warning: Function parameter or member 'context' not described in 'perf_event_create_kernel_counter' Signed-off-by: Haocheng Xie <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-27perf/core: Make local function perf_pmu_snapshot_aux() staticHaocheng Xie1-4/+4
Fixes the following W=1 kernel build warning: kernel/events/core.c:6670:6: warning: no previous prototype for 'perf_pmu_snapshot_aux' [-Wmissing-prototypes] Signed-off-by: Haocheng Xie <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-26Merge tag 'net-5.13-rc4' of ↵Linus Torvalds8-72/+216
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.13-rc4, including fixes from bpf, netfilter, can and wireless trees. Notably including fixes for the recently announced "FragAttacks" WiFi vulnerabilities. Rather large batch, touching some core parts of the stack, too, but nothing hair-raising. Current release - regressions: - tipc: make node link identity publish thread safe - dsa: felix: re-enable TAS guard band mode - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid() - stmmac: fix system hang if change mac address after interface ifdown Current release - new code bugs: - mptcp: avoid OOB access in setsockopt() - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers - ethtool: stats: fix a copy-paste error - init correct array size Previous releases - regressions: - sched: fix packet stuck problem for lockless qdisc - net: really orphan skbs tied to closing sk - mlx4: fix EEPROM dump support - bpf: fix alu32 const subreg bound tracking on bitwise operations - bpf: fix mask direction swap upon off reg sign change - bpf, offload: reorder offload callback 'prepare' in verifier - stmmac: Fix MAC WoL not working if PHY does not support WoL - packetmmap: fix only tx timestamp on request - tipc: skb_linearize the head skb when reassembling msgs Previous releases - always broken: - mac80211: address recent "FragAttacks" vulnerabilities - mac80211: do not accept/forward invalid EAPOL frames - mptcp: avoid potential error message floods - bpf, ringbuf: deny reserve of buffers larger than ringbuf to prevent out of buffer writes - bpf: forbid trampoline attach for functions with variable arguments - bpf: add deny list of functions to prevent inf recursion of tracing programs - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT - can: isotp: prevent race between isotp_bind() and isotp_setsockopt() - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version Misc: - bpf: add kconfig knob for disabling unpriv bpf by default" * tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits) net: phy: Document phydev::dev_flags bits allocation mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer mptcp: avoid error message on infinite mapping mptcp: drop unconditional pr_warn on bad opt mptcp: avoid OOB access in setsockopt() nfp: update maintainer and mailing list addresses net: mvpp2: add buffer header handling in RX bnx2x: Fix missing error code in bnx2x_iov_init_one() net: zero-initialize tc skb extension on allocation net: hns: Fix kernel-doc sctp: fix the proc_handler for sysctl encap_port sctp: add the missing setting for asoc encap_port bpf, selftests: Adjust few selftest result_unpriv outcomes bpf: No need to simulate speculative domain for immediates bpf: Fix mask direction swap upon off reg sign change bpf: Wrap aux data inside bpf_sanitize_info container bpf: Fix BPF_LSM kconfig symbol dependency selftests/bpf: Add test for l3 use of bpf_redirect_peer bpftool: Add sock_release help info for cgroup attach/prog load command net: dsa: microchip: enable phy errata workaround on 9567 ...
2021-05-27kbuild: clean up ${quiet} checks in shell scriptsMasahiro Yamada1-3/+1
There were efforts to make 'make -s' really silent when it is a warning-free build. The conventional way was to let a shell script check ${quiet}, and if it is 'silent_', suppress the stdout by itself. With the previous commit, the 'cmd' takes care of it now. The 'cmd' is also invoked from if_changed, if_changed_dep, and if_changed_rule. You can omit ${quiet} checks in shell scripts when they are invoked from the 'cmd' macro. Signed-off-by: Masahiro Yamada <[email protected]>