aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2021-01-24Merge tag 'x86_urgent_for_v5.11_rc5' of ↵Linus Torvalds1-2/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new Intel model number for Alder Lake - Differentiate which aspects of the FPU state get saved/restored when the FPU is used in-kernel and fix a boot crash on K7 due to early MXCSR access before CR4.OSFXSR is even set. - A couple of noinstr annotation fixes - Correct die ID setting on AMD for users of topology information which need the correct die ID - A SEV-ES fix to handle string port IO to/from kernel memory properly * tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add another Alder Lake CPU to the Intel family x86/mmx: Use KFPU_387 for MMX string operations x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state x86/topology: Make __max_die_per_package available unconditionally x86: __always_inline __{rd,wr}msr() x86/mce: Remove explicit/superfluous tracing locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP locking/lockdep: Cure noinstr fail x86/sev: Fix nonistr violation x86/entry: Fix noinstr fail x86/cpu/amd: Set __max_die_per_package on AMD x86/sev-es: Handle string port IO to kernel memory properly
2021-01-24Merge tag 'for-linus-2021-01-24' of ↵Linus Torvalds3-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull misc fixes from Christian Brauner: - Jann reported sparse complaints because of a missing __user annotation in a helper we added way back when we added pidfd_send_signal() to avoid compat syscall handling. Fix it. - Yanfei replaces a reference in a comment to the _do_fork() helper I removed a while ago with a reference to the new kernel_clone() replacement - Alexander Guril added a simple coding style fix * tag 'for-linus-2021-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: kthread: remove comments about old _do_fork() helper Kernel: fork.c: Fix coding style: Do not use {} around single-line statements signal: Add missing __user annotation to copy_siginfo_from_user_any
2021-01-22bpf, inode_storage: Put file handler if no storage was foundPan Bian1-1/+5
Put file f if inode_storage_ptr() returns NULL. Fixes: 8ea636848aca ("bpf: Implement bpf_local_storage for inodes") Signed-off-by: Pan Bian <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: KP Singh <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-22bpf, cgroup: Fix problematic bounds checkLoris Reiff1-1/+1
Since ctx.optlen is signed, a larger value than max_value could be passed, as it is later on used as unsigned, which causes a WARN_ON_ONCE in the copy_to_user. Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Signed-off-by: Loris Reiff <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-22bpf, cgroup: Fix optlen WARN_ON_ONCE toctouLoris Reiff1-0/+5
A toctou issue in `__cgroup_bpf_run_filter_getsockopt` can trigger a WARN_ON_ONCE in a check of `copy_from_user`. `*optlen` is checked to be non-negative in the individual getsockopt functions beforehand. Changing `*optlen` in a race to a negative value will result in a `copy_from_user(ctx.optval, optval, ctx.optlen)` with `ctx.optlen` being a negative integer. Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Signed-off-by: Loris Reiff <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-22sched: Relax the set_cpus_allowed_ptr() semanticsPeter Zijlstra1-11/+10
Now that we have KTHREAD_IS_PER_CPU to denote the critical per-cpu tasks to retain during CPU offline, we can relax the warning in set_cpus_allowed_ptr(). Any spurious kthread that wants to get on at the last minute will get pushed off before it can run. While during CPU online there is no harm, and actual benefit, to allowing kthreads back on early, it simplifies hotplug code and fixes a number of outstanding races. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Lai jiangshan <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22sched: Fix CPU hotplug / tighten is_per_cpu_kthread()Peter Zijlstra1-4/+35
Prior to commit 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug") we'd leave any task on the dying CPU and break affinity and force them off at the very end. This scheme had to change in order to enable migrate_disable(). One cannot wait for migrate_disable() to complete while stuck in stop_machine(). Furthermore, since we need at the very least: idle, hotplug and stop threads at any point before stop_machine, we can't break affinity and/or push those away. Under the assumption that all per-cpu kthreads are sanely handled by CPU hotplug, the new code no long breaks affinity or migrates any of them (which then includes the critical ones above). However, there's an important difference between per-cpu kthreads and kthreads that happen to have a single CPU affinity which is lost. The latter class very much relies on the forced affinity breaking and migration semantics previously provided. Use the new kthread_is_per_cpu() infrastructure to tighten is_per_cpu_kthread() and fix the hot-unplug problems stemming from the change. Fixes: 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22sched: Prepare to use balance_push in ttwu()Peter Zijlstra2-5/+7
In preparation of using the balance_push state in ttwu() we need it to provide a reliable and consistent state. The immediate problem is that rq->balance_callback gets cleared every schedule() and then re-set in the balance_push_callback() itself. This is not a reliable signal, so add a variable that stays set during the entire time. Also move setting it before the synchronize_rcu() in sched_cpu_deactivate(), such that we get guaranteed visibility to ttwu(), which is a preempt-disable region. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22workqueue: Restrict affinity change to rescuerPeter Zijlstra1-6/+3
create_worker() will already set the right affinity using kthread_bind_mask(), this means only the rescuer will need to change it's affinity. Howveer, while in cpu-hot-unplug a regular task is not allowed to run on online&&!active as it would be pushed away quite agressively. We need KTHREAD_IS_PER_CPU to survive in that environment. Therefore set the affinity after getting that magic flag. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22workqueue: Tag bound workers with KTHREAD_IS_PER_CPUPeter Zijlstra1-2/+9
Mark the per-cpu workqueue workers as KTHREAD_IS_PER_CPU. Workqueues have unfortunate semantics in that per-cpu workers are not default flushed and parked during hotplug, however a subset does manual flush on hotplug and hard relies on them for correctness. Therefore play silly games.. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22kthread: Extract KTHREAD_IS_PER_CPUPeter Zijlstra2-1/+27
There is a need to distinguish geniune per-cpu kthreads from kthreads that happen to have a single CPU affinity. Geniune per-cpu kthreads are kthreads that are CPU affine for correctness, these will obviously have PF_KTHREAD set, but must also have PF_NO_SETAFFINITY set, lest userspace modify their affinity and ruins things. However, these two things are not sufficient, PF_NO_SETAFFINITY is also set on other tasks that have their affinities controlled through other means, like for instance workqueues. Therefore another bit is needed; it turns out kthread_create_per_cpu() already has such a bit: KTHREAD_IS_PER_CPU, which is used to make kthread_park()/kthread_unpark() work correctly. Expose this flag and remove the implicit setting of it from kthread_create_on_cpu(); the io_uring usage of it seems dubious at best. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22sched: Don't run cpu-online with balance_push() enabledPeter Zijlstra1-2/+14
We don't need to push away tasks when we come online, mark the push complete right before the CPU dies. XXX hotplug state machine has trouble with rollback here. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinityLai Jiangshan1-1/+1
The scheduler won't break affinity for us any more, and we should "emulate" the same behavior when the scheduler breaks affinity for us. The behavior is "changing the cpumask to cpu_possible_mask". And there might be some other CPUs online later while the worker is still running with the pending work items. The worker should be allowed to use the later online CPUs as before and process the work items ASAP. If we use cpu_active_mask here, we can't achieve this goal but using cpu_possible_mask can. Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug") Signed-off-by: Lai Jiangshan <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Acked-by: Tejun Heo <[email protected]> Tested-by: Paul E. McKenney <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-22sched/core: Print out straggler tasks in sched_cpu_dying()Valentin Schneider1-1/+23
Since commit 1cf12e08bc4d ("sched/hotplug: Consolidate task migration on CPU unplug") tasks are expected to move themselves out of a out-going CPU. For most tasks this will be done automagically via BALANCE_PUSH, but percpu kthreads will have to cooperate and move themselves away one way or another. Currently, some percpu kthreads (workqueues being a notable exemple) do not cooperate nicely and can end up on an out-going CPU at the time sched_cpu_dying() is invoked. Print the dying rq's tasks to shed some light on the stragglers. Signed-off-by: Valentin Schneider <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Tested-by: Valentin Schneider <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-21Merge tag 'printk-for-5.11-printk-rework-fixup' of ↵Linus Torvalds2-12/+30
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fixes from Petr Mladek: - Fix line counting and buffer size calculation. Both regressions caused that a reader buffer might not get filled as much as possible. - Restore non-documented behavior of printk() reader API and make it official. It did not fill the last byte of the provided buffer before 5.10. Two architectures, powerpc and um, used it to add the trailing '\0'. There might theoretically be more callers depending on this behavior in userspace. * tag 'printk-for-5.11-printk-rework-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: fix buffer overflow potential for print_text() printk: fix kmsg_dump_get_buffer length calulations printk: ringbuffer: fix line counting
2021-01-21Merge branch 'printk-rework' into for-linusPetr Mladek2-12/+30
2021-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski8-13/+27
Conflicts: drivers/net/can/dev.c commit 03f16c5075b2 ("can: dev: can_restart: fix use after free bug") commit 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir") Code move. drivers/net/dsa/b53/b53_common.c commit 8e4052c32d6b ("net: dsa: b53: fix an off by one in checking "vlan->vid"") commit b7a9e0da2d1c ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects") Field rename. Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-20Merge tag 'net-5.11-rc5' of ↵Linus Torvalds7-13/+24
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.11-rc5, including fixes from bpf, wireless, and can trees. Current release - regressions: - nfc: nci: fix the wrong NCI_CORE_INIT parameters Current release - new code bugs: - bpf: allow empty module BTFs Previous releases - regressions: - bpf: fix signed_{sub,add32}_overflows type handling - tcp: do not mess with cloned skbs in tcp_add_backlog() - bpf: prevent double bpf_prog_put call from bpf_tracing_prog_attach - bpf: don't leak memory in bpf getsockopt when optlen == 0 - tcp: fix potential use-after-free due to double kfree() - mac80211: fix encryption issues with WEP - devlink: use right genl user_ptr when handling port param get/set - ipv6: set multicast flag on the multicast route - tcp: fix TCP_USER_TIMEOUT with zero window Previous releases - always broken: - bpf: local storage helpers should check nullness of owner ptr passed - mac80211: fix incorrect strlen of .write in debugfs - cls_flower: call nla_ok() before nla_next() - skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too" * tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) net: systemport: free dev before on error path net: usb: cdc_ncm: don't spew notifications net: mscc: ocelot: Fix multicast to the CPU port tcp: Fix potential use-after-free due to double kfree() bpf: Fix signed_{sub,add32}_overflows type handling can: peak_usb: fix use after free bugs can: vxcan: vxcan_xmit: fix use after free bug can: dev: can_restart: fix use after free bug tcp: fix TCP socket rehash stats mis-accounting net: dsa: b53: fix an off by one in checking "vlan->vid" tcp: do not mess with cloned skbs in tcp_add_backlog() selftests: net: fib_tests: remove duplicate log test net: nfc: nci: fix the wrong NCI_CORE_INIT parameters sh_eth: Fix power down vs. is_opened flag ordering net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled netfilter: rpfilter: mask ecn bits before fib lookup udp: mask TOS bits in udp_v4_early_demux() xsk: Clear pool even for inactive queues bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback sh_eth: Make PHY access aware of Runtime PM to fix reboot crash ...
2021-01-20bpf: Fix signed_{sub,add32}_overflows type handlingDaniel Borkmann1-3/+3
Fix incorrect signed_{sub,add32}_overflows() input types (and a related buggy comment). It looks like this might have slipped in via copy/paste issue, also given prior to 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") the signature of signed_sub_overflows() had s64 a and s64 b as its input args whereas now they are truncated to s32. Thus restore proper types. Also, the case of signed_add32_overflows() is not consistent to signed_sub32_overflows(). Both have s32 as inputs, therefore align the former. Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: De4dCr0w <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Acked-by: Alexei Starovoitov <[email protected]>
2021-01-19Merge tag 'task_work-2021-01-19' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+3
Pull task_work fix from Jens Axboe: "The TIF_NOTIFY_SIGNAL change inadvertently removed the unconditional task_work run we had in get_signal(). This caused a regression for some setups, since we're relying on eg ____fput() being run to close and release, for example, a pipe and wake the other end. For 5.11, I prefer the simple solution of just reinstating the unconditional run, even if it conceptually doesn't make much sense - if you need that kind of guarantee, you should be using TWA_SIGNAL instead of TWA_NOTIFY. But it's the trivial fix for 5.11, and would ensure that other potential gotchas/assumptions for task_work don't regress for 5.11. We're looking into further simplifying the task_work notifications for 5.12 which would resolve that too" * tag 'task_work-2021-01-19' of git://git.kernel.dk/linux-block: task_work: unconditionally run task_work from get_signal()
2021-01-19bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callbackMircea Cirjaliu1-1/+1
I assume this was obtained by copy/paste. Point it to bpf_map_peek_elem() instead of bpf_map_pop_elem(). In practice it may have been less likely hit when under JIT given shielded via 84430d4232c3 ("bpf, verifier: avoid retpoline for map push/pop/peek operation"). Fixes: f1a2e44a3aec ("bpf: add queue and stack maps") Signed-off-by: Mircea Cirjaliu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Cc: Mauricio Vasquez <[email protected]> Link: https://lore.kernel.org/bpf/AM7PR02MB6082663DFDCCE8DA7A6DD6B1BBA30@AM7PR02MB6082.eurprd02.prod.outlook.com
2021-01-19printk: fix buffer overflow potential for print_text()John Ogness1-9/+27
Before the commit 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer"), msg_print_text() would only write up to size-1 bytes into the provided buffer. Some callers expect this behavior and append a terminator to returned string. In particular: arch/powerpc/xmon/xmon.c:dump_log_buf() arch/um/kernel/kmsg_dump.c:kmsg_dumper_stdout() msg_print_text() has been replaced by record_print_text(), which currently fills the full size of the buffer. This causes a buffer overflow for the above callers. Change record_print_text() so that it will only use size-1 bytes for text data. Also, for paranoia sakes, add a terminator after the text data. And finally, document this behavior so that it is clear that only size-1 bytes are used and a terminator is added. Fixes: 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer") Cc: [email protected] # 5.10+ Signed-off-by: John Ogness <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Acked-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-15Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski7-206/+366
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-01-16 1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support, that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman. 2) Add support for using kernel module global variables (__ksym externs in BPF programs) retrieved via module's BTF, from Andrii Nakryiko. 3) Generalize BPF stackmap's buildid retrieval and add support to have buildid stored in mmap2 event for perf, from Jiri Olsa. 4) Various fixes for cross-building BPF sefltests out-of-tree which then will unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker. 5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann. 6) Clean up driver's XDP buffer init and split into two helpers to init per- descriptor and non-changing fields during processing, from Lorenzo Bianconi. 7) Minor misc improvements to libbpf & bpftool, from Ian Rogers. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits) perf: Add build id data in mmap2 event bpf: Add size arg to build_id_parse function bpf: Move stack_map_get_build_id into lib bpf: Document new atomic instructions bpf: Add tests for new BPF atomic operations bpf: Add bitwise atomic instructions bpf: Pull out a macro for interpreting atomic ALU operations bpf: Add instructions for atomic_[cmp]xchg bpf: Add BPF_FETCH field / create atomic_fetch_add instruction bpf: Move BPF_STX reserved field check into BPF_STX verifier code bpf: Rename BPF_XADD and prepare to encode other atomics in .imm bpf: x86: Factor out a lookup table for some ALU opcodes bpf: x86: Factor out emission of REX byte bpf: x86: Factor out emission of ModR/M for *(reg + off) tools/bpftool: Add -Wall when building BPF programs bpf, libbpf: Avoid unused function warning on bpf_tail_call_static selftests/bpf: Install btf_dump test cases selftests/bpf: Fix installation of urandom_read selftests/bpf: Move generated test files to $(TEST_GEN_FILES) selftests/bpf: Fix out-of-tree build ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-15Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski6-9/+20
Daniel Borkmann says: ==================== pull-request: bpf 2021-01-16 1) Fix a double bpf_prog_put() for BPF_PROG_{TYPE_EXT,TYPE_TRACING} types in link creation's error path causing a refcount underflow, from Jiri Olsa. 2) Fix BTF validation errors for the case where kernel modules don't declare any new types and end up with an empty BTF, from Andrii Nakryiko. 3) Fix BPF local storage helpers to first check their {task,inode} owners for being NULL before access, from KP Singh. 4) Fix a memory leak in BPF setsockopt handling for the case where optlen is zero and thus temporary optval buffer should be freed, from Stanislav Fomichev. 5) Fix a syzbot memory allocation splat in BPF_PROG_TEST_RUN infra for raw_tracepoint caused by too big ctx_size_in, from Song Liu. 6) Fix LLVM code generation issues with verifier where PTR_TO_MEM{,_OR_NULL} registers were spilled to stack but not recognized, from Gilad Reti. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: MAINTAINERS: Update my email address selftests/bpf: Add verifier test for PTR_TO_MEM spill bpf: Support PTR_TO_MEM{,_OR_NULL} register spilling bpf: Reject too big ctx_size_in for raw_tp test run libbpf: Allow loading empty BTFs bpf: Allow empty module BTFs bpf: Don't leak memory in bpf getsockopt when optlen == 0 bpf: Update local storage test to check handling of null ptrs bpf: Fix typo in bpf_inode_storage.c bpf: Local storage helpers should check nullness of owner ptr passed bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attach ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-15printk: fix kmsg_dump_get_buffer length calulationsJohn Ogness1-2/+2
kmsg_dump_get_buffer() uses @syslog to determine if the syslog prefix should be written to the buffer. However, when calculating the maximum number of records that can fit into the buffer, it always counts the bytes from the syslog prefix. Use @syslog when calculating the maximum number of records that can fit into the buffer. Fixes: e2ae715d66bf ("kmsg - kmsg_dump() use iterator to receive log buffer content") Signed-off-by: John Ogness <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Acked-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-15printk: ringbuffer: fix line countingJohn Ogness1-1/+1
Counting text lines in a record simply involves counting the number of newline characters (+1). However, it is searching the full data block for newline characters, even though the text data can be (and often is) a subset of that area. Since the extra area in the data block was never initialized, the result is that extra newlines may be seen and counted. Restrict newline searching to the text data length. Fixes: b6cf8b3f3312 ("printk: add lockless ringbuffer") Signed-off-by: John Ogness <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Acked-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-14perf: Add build id data in mmap2 eventJiri Olsa1-4/+28
Adding support to carry build id data in mmap2 event. The build id data replaces maj/min/ino/ino_generation fields, which are also used to identify map's binary, so it's ok to replace them with build id data: union { struct { u32 maj; u32 min; u64 ino; u64 ino_generation; }; struct { u8 build_id_size; u8 __reserved_1; u16 __reserved_2; u8 build_id[20]; }; }; Replaced maj/min/ino/ino_generation fields give us size of 24 bytes. We use 20 bytes for build id data, 1 byte for size and rest is unused. There's new misc bit for mmap2 to signal there's build id data in it: #define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14) Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Add size arg to build_id_parse functionJiri Olsa1-1/+1
It's possible to have other build id types (other than default SHA1). Currently there's also ld support for MD5 build id. Adding size argument to build_id_parse function, that returns (if defined) size of the parsed build id, so we can recognize the build id type. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Move stack_map_get_build_id into libJiri Olsa1-139/+4
Moving stack_map_get_build_id into lib with declaration in linux/buildid.h header: int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id); This function returns build id for given struct vm_area_struct. There is no functional change to stack_map_get_build_id function. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Add bitwise atomic instructionsBrendan Jackman3-4/+26
This adds instructions for atomic[64]_[fetch_]and atomic[64]_[fetch_]or atomic[64]_[fetch_]xor All these operations are isomorphic enough to implement with the same verifier, interpreter, and x86 JIT code, hence being a single commit. The main interesting thing here is that x86 doesn't directly support the fetch_ version these operations, so we need to generate a CMPXCHG loop in the JIT. This requires the use of two temporary registers, IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for this purpose. Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Pull out a macro for interpreting atomic ALU operationsBrendan Jackman1-41/+39
Since the atomic operations that are added in subsequent commits are all isomorphic with BPF_ADD, pull out a macro to avoid the interpreter becoming dominated by lines of atomic-related code. Note that this sacrificies interpreter performance (combining STX_ATOMIC_W and STX_ATOMIC_DW into single switch case means that we need an extra conditional branch to differentiate them) in favour of compact and (relatively!) simple C code. Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Add instructions for atomic_[cmp]xchgBrendan Jackman3-2/+52
This adds two atomic opcodes, both of which include the BPF_FETCH flag. XCHG without the BPF_FETCH flag would naturally encode atomic_set. This is not supported because it would be of limited value to userspace (it doesn't imply any barriers). CMPXCHG without BPF_FETCH woulud be an atomic compare-and-write. We don't have such an operation in the kernel so it isn't provided to BPF either. There are two significant design decisions made for the CMPXCHG instruction: - To solve the issue that this operation fundamentally has 3 operands, but we only have two register fields. Therefore the operand we compare against (the kernel's API calls it 'old') is hard-coded to be R0. x86 has similar design (and A64 doesn't have this problem). A potential alternative might be to encode the other operand's register number in the immediate field. - The kernel's atomic_cmpxchg returns the old value, while the C11 userspace APIs return a boolean indicating the comparison result. Which should BPF do? A64 returns the old value. x86 returns the old value in the hard-coded register (and also sets a flag). That means return-old-value is easier to JIT, so that's what we use. Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Add BPF_FETCH field / create atomic_fetch_add instructionBrendan Jackman3-9/+44
The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC instructions, in order to have the previous value of the atomically-modified memory location loaded into the src register after an atomic op is carried out. Suggested-by: Yonghong Song <[email protected]> Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Move BPF_STX reserved field check into BPF_STX verifier codeBrendan Jackman1-7/+6
I can't find a reason why this code is in resolve_pseudo_ldimm64; since I'll be modifying it in a subsequent commit, tidy it up. Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-14bpf: Rename BPF_XADD and prepare to encode other atomics in .immBrendan Jackman3-21/+40
A subsequent patch will add additional atomic operations. These new operations will use the same opcode field as the existing XADD, with the immediate discriminating different operations. In preparation, rename the instruction mode BPF_ATOMIC and start calling the zero immediate BPF_ADD. This is possible (doesn't break existing valid BPF progs) because the immediate field is currently reserved MBZ and BPF_ADD is zero. All uses are removed from the tree but the BPF_XADD definition is kept around to avoid breaking builds for people including kernel headers. Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-13bpf: Support PTR_TO_MEM{,_OR_NULL} register spillingGilad Reti1-0/+2
Add support for pointer to mem register spilling, to allow the verifier to track pointers to valid memory addresses. Such pointers are returned for example by a successful call of the bpf_ringbuf_reserve helper. The patch was partially contributed by CyberArk Software, Inc. Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Suggested-by: Yonghong Song <[email protected]> Signed-off-by: Gilad Reti <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: KP Singh <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-13genirq: Export irq_check_status_bit()Thomas Gleixner1-0/+1
One of the users can be built modular: ERROR: modpost: "irq_check_status_bit" [drivers/perf/arm_spe_pmu.ko] undefined! Fixes: fdd029630434 ("genirq: Move status flag checks to core") Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-12bpf: Support BPF ksym variables in kernel modulesAndrii Nakryiko3-30/+178
Add support for directly accessing kernel module variables from BPF programs using special ldimm64 instructions. This functionality builds upon vmlinux ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow specifying kernel module BTF's FD in insn[1].imm field. During BPF program load time, verifier will resolve FD to BTF object and will take reference on BTF object itself and, for module BTFs, corresponding module as well, to make sure it won't be unloaded from under running BPF program. The mechanism used is similar to how bpf_prog keeps track of used bpf_maps. One interesting change is also in how per-CPU variable is determined. The logic is to find .data..percpu data section in provided BTF, but both vmlinux and module each have their own .data..percpu entries in BTF. So for module's case, the search for DATASEC record needs to look at only module's added BTF types. This is implemented with custom search function. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Hao Luo <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-12bpf: Fix a verifier message for alloc size helper argBrendan Jackman1-1/+1
The error message here is misleading, the argument will be rejected unless it is a known constant. Signed-off-by: Brendan Jackman <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-12Merge tag 'irqchip-fixes-5.11-1' of ↵Thomas Gleixner1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix the MIPS CPU interrupt controller hierarchy - Simplify the PRUSS Kconfig entry - Eliminate trivial build warnings on the MIPS Loongson liointc - Fix error path in devm_platform_get_irqs_affinity() - Turn the BCM2836 IPI irq_eoi callback into irq_ack - Fix initialisation of on-stack msi_alloc_info - Cleanup spurious comma in irq-sl28cpld Link: https://lore.kernel.org/r/[email protected]
2021-01-12ntp: Fix RTC synchronization on 32-bit platformsGeert Uytterhoeven1-2/+2
Due to an integer overflow, RTC synchronization now happens every 2s instead of the intended 11 minutes. Fix this by forcing 64-bit arithmetic for the sync period calculation. Annotate the other place which multiplies seconds for consistency as well. Fixes: c9e6189fb03123a7 ("ntp: Make the RTC synchronization more reliable") Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-12timekeeping: Remove unused get_seconds()Chunguang Xu1-2/+1
The get_seconds() cleanup seems to have been completed, now it is time to delete the legacy interface to avoid misuse later. Signed-off-by: Chunguang Xu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-12bpf: Allow empty module BTFsAndrii Nakryiko1-1/+1
Some modules don't declare any new types and end up with an empty BTF, containing only valid BTF header and no types or strings sections. This currently causes BTF validation error. There is nothing wrong with such BTF, so fix the issue by allowing module BTFs with no types or strings. Fixes: 36e68442d1af ("bpf: Load and verify kernel module BTFs") Reported-by: Christopher William Snowhill <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-12locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEPPeter Zijlstra1-1/+6
vmlinux.o: warning: objtool: lock_is_held_type()+0x60: call to check_flags.part.0() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-12locking/lockdep: Cure noinstr failPeter Zijlstra1-1/+1
When the compiler doesn't feel like inlining, it causes a noinstr fail: vmlinux.o: warning: objtool: lock_is_held_type()+0xb: call to lockdep_enabled() leaves .noinstr.text section Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion") Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-01-12bpf: Don't leak memory in bpf getsockopt when optlen == 0Stanislav Fomichev1-2/+3
optlen == 0 indicates that the kernel should ignore BPF buffer and use the original one from the user. We, however, forget to free the temporary buffer that we've allocated for BPF. Fixes: d8fe449a9c51 ("bpf: Don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE") Reported-by: Martin KaFai Lau <[email protected]> Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-12bpf: Fix typo in bpf_inode_storage.cKP Singh1-2/+2
Fix "gurranteed" -> "guaranteed" in bpf_inode_storage.c Suggested-by: Andrii Nakryiko <[email protected]> Signed-off-by: KP Singh <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-12bpf: Local storage helpers should check nullness of owner ptr passedKP Singh2-2/+8
The verifier allows ARG_PTR_TO_BTF_ID helper arguments to be NULL, so helper implementations need to check this before dereferencing them. This was already fixed for the socket storage helpers but not for task and inode. The issue can be reproduced by attaching an LSM program to inode_rename hook (called when moving files) which tries to get the inode of the new file without checking for its nullness and then trying to move an existing file to a new path: mv existing_file new_file_does_not_exist The report including the sample program and the steps for reproducing the bug: https://lore.kernel.org/bpf/CANaYP3HWkH91SN=wTNO9FL_2ztHfqcXKX38SSE-JJ2voh+vssw@mail.gmail.com Fixes: 4cf1bc1f1045 ("bpf: Implement task local storage") Fixes: 8ea636848aca ("bpf: Implement bpf_local_storage for inodes") Reported-by: Gilad Reti <[email protected]> Signed-off-by: KP Singh <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-12bpf: Prevent double bpf_prog_put call from bpf_tracing_prog_attachJiri Olsa1-2/+4
The bpf_tracing_prog_attach error path calls bpf_prog_put on prog, which causes refcount underflow when it's called from link_create function. link_create prog = bpf_prog_get <-- get ... tracing_bpf_link_attach(prog.. bpf_tracing_prog_attach(prog.. out_put_prog: bpf_prog_put(prog); <-- put if (ret < 0) bpf_prog_put(prog); <-- put Removing bpf_prog_put call from bpf_tracing_prog_attach and making sure its callers call it instead. Fixes: 4a1e7c0c63e0 ("bpf: Support attaching freplace programs to multiple attach points") Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-01-11Merge tag 'trace-v5.11-rc2' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Blacklist properly on all archs. The code to blacklist notrace functions for kprobes was not using the right kconfig option, which caused some archs (powerpc) to possibly not blacklist them" * tag 'trace-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Do the notrace functions check without kprobes on ftrace