Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull procfs fixes from Christian Brauner:
"Mode changes to files under /proc/<pid>/ haven't been supported since
commit 6d76fa58b050 ("Don't allow chmod() on the /proc/<pid>/ files").
Due to an oversight in commit 1b3044e39a89 ("procfs: fix pthread
cross-thread naming if !PR_DUMPABLE") in switching from REG to NOD,
mode changes on /proc/thread-self/comm were accidentally allowed.
Similarly, mode changes for all files beneath /proc/<pid>/net/ are
blocked but mode changes on /proc/<pid>/net itself were accidentally
allowed.
Both issues come down to not using the generic proc_setattr() helper
which blocks all mode changes. This is rectified with this pull
request.
This also removes a strange nolibc test that abused /proc/<pid>/net
for testing mode changes. Using procfs for this test never made a lot
of sense given procfs has special semantics for almost everything
anyway.
Both changes are minor user-visible changes. It is however very
unlikely that mode changes on /proc/<pid>/net and
/proc/thread-self/comm are something that userspace relies on"
* tag 'v6.6-fs.proc.uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
procfs: block chmod on /proc/thread-self/comm
proc: use generic setattr() for /proc/$PID/net
selftests/nolibc: drop test chmod_net
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull fchmodat2 system call from Christian Brauner:
"This adds the fchmodat2() system call. It is a revised version of the
fchmodat() system call, adding a missing flag argument. Support for
both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is included.
Adding this system call revision has been a longstanding request but
so far has always fallen through the cracks. While the kernel
implementation of fchmodat() does not have a flag argument the libc
provided POSIX-compliant fchmodat(3) version does. Both glibc and musl
have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW
(see [1] and [2]).
The workaround is brittle because it relies not just on O_PATH and
O_NOFOLLOW semantics and procfs magic links but also on our rather
inconsistent symlink semantics.
This gives userspace a proper fchmodat2() system call that libcs can
use to properly implement fchmodat(3) and allows them to get rid of
their hacks. In this case it will immediately benefit them as the
current workaround is already defunct because of aforementioned
inconsistencies.
In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use
AT_EMPTY_PATH with fchmodat2(). This is already possible with
fchownat() so there's no reason to not also support it for
fchmodat2().
The implementation is simple and comes with selftests. Implementation
of the system call and wiring up the system call are done as separate
patches even though they could arguably be one patch. But in case
there are merge conflicts from other system call additions it can be
beneficial to have separate patches"
Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1]
Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2]
* tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: fchmodat2: remove duplicate unneeded defines
fchmodat2: add support for AT_EMPTY_PATH
selftests: Add fchmodat2 selftest
arch: Register fchmodat2, usually as syscall 452
fs: Add fchmodat2()
Non-functional cleanup of a "__user * filename"
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v6.6
The rest of the updates for v6.6, some of the highlights include:
- A big API cleanup from Morimoto-san, rationalising the places we put
functions.
- Lots of work on the SOF framework, AMD and Intel drivers, including a
lot of cleanup and new device support.
- Standardisation of the presentation of jacks from drivers.
- Provision of some generic sound card DT properties.
- Conversion of more drivers to the maple tree register cache.
- New drivers for AMD Van Gogh, AWInic AW88261, Cirrus Logic cs42l43,
various Intel platforms, Mediatek MT7986, RealTek RT1017 and StarFive
JH7110.
|
|
Test different variations of single-stepping into interrupts:
- SVC and PGM interrupts;
- Interrupts generated by ISKE;
- Interrupts generated by instructions emulated by KVM;
- Interrupts generated by instructions emulated by userspace.
Reviewed-by: Claudio Imbrenda <[email protected]>
Signed-off-by: Ilya Leoshkevich <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Claudio Imbrenda <[email protected]>
[[email protected]: s/ASSERT_EQ/TEST_ASSERT_EQ/ because function was
renamed in the selftest printf series]
Signed-off-by: Janosch Frank <[email protected]>
|
|
If setting link1_1 into the client netns fails, we should delete link1_1 in
the cleanup path. But if link1_1 was moved into the client netns
successfully, deleting link1_1 reports a warning. So it is safer to create
the devices directly in the target namespaces.
Reported-by: Hangbin Liu <[email protected]>
Closes: https://lore.kernel.org/all/ZNyJx1HtXaUzOkNA@Laptop-X1/
Signed-off-by: Zhengchao Shao <[email protected]>
Acked-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This test will need to be updated if new ciphers are added.
Signed-off-by: Sabrina Dubroca <[email protected]>
Link: https://lore.kernel.org/r/bfcfa9cffda56d2064296ab7c99a05775dd4c28e.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The kernel accepts fetching either just the version and cipher type,
or exactly the per-cipher struct. Also check that getsockopt returns
what we just passed to the kernel.
Signed-off-by: Sabrina Dubroca <[email protected]>
Link: https://lore.kernel.org/r/81a007ca13de9a74f4af45635d06682cdb385a54.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Only supported for TLS1.2.
Signed-off-by: Sabrina Dubroca <[email protected]>
Link: https://lore.kernel.org/r/ccf4a4d3f3820f8ff30431b7629f5210cb33fa89.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add support for using NLM_F_REPLACE, _EXCL, _CREATE and _APPEND flags
in requests.
Signed-off-by: Donald Hunter <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
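As a rough illustration of the flags named in the commit above, the sketch below combines them the way netlink "new"-style requests conventionally do. The numeric values are those from include/uapi/linux/netlink.h; the helper request_flags() and its operation labels are hypothetical and are not part of ynl.

```python
# Netlink message flag values, as defined in include/uapi/linux/netlink.h.
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NLM_F_APPEND = 0x800


def request_flags(operation):
    """Hypothetical helper: build nlmsg_flags for a 'new'-style request."""
    base = NLM_F_REQUEST | NLM_F_ACK
    modifiers = {
        # Create the object, fail if it already exists.
        "create": NLM_F_CREATE | NLM_F_EXCL,
        # Create the object or replace an existing one.
        "replace": NLM_F_CREATE | NLM_F_REPLACE,
        # Create the object and add it at the end of a list.
        "append": NLM_F_CREATE | NLM_F_APPEND,
    }
    return base | modifiers[operation]


if __name__ == "__main__":
    for op in ("create", "replace", "append"):
        print(op, hex(request_flags(op)))
```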
|
|
Add support for the 'array-nest' attribute type that is used by several
netlink-raw families.
Signed-off-by: Donald Hunter <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
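A minimal sketch of how an 'array-nest' attribute might be decoded, assuming the usual layout: an outer attribute whose payload is a sequence of nested attributes, one per array element, each holding that element's own attributes. The struct nlattr layout (16-bit length, 16-bit type, payload padded to 4 bytes) follows the netlink uAPI; the function names are hypothetical and this is not the ynl implementation.

```python
import struct

NLA_HDRLEN = 4          # struct nlattr: u16 nla_len, u16 nla_type
NLA_TYPE_MASK = 0x3FFF  # strips NLA_F_NESTED / NLA_F_NET_BYTEORDER bits


def pack_attr(attr_type, payload):
    """Encode one attribute, padding the payload to a 4-byte boundary."""
    length = NLA_HDRLEN + len(payload)
    pad = (4 - length % 4) % 4
    return struct.pack("=HH", length, attr_type) + payload + b"\0" * pad


def parse_attrs(data):
    """Split a byte string into a list of (type, payload) attributes."""
    attrs = []
    offset = 0
    while offset + NLA_HDRLEN <= len(data):
        length, attr_type = struct.unpack_from("=HH", data, offset)
        if length < NLA_HDRLEN:
            break  # malformed attribute, stop instead of looping forever
        payload = data[offset + NLA_HDRLEN:offset + length]
        attrs.append((attr_type & NLA_TYPE_MASK, payload))
        offset += (length + 3) & ~3  # advance to the next aligned attribute
    return attrs


def decode_array_nest(payload):
    """Each outer entry is one array element; decode its inner attributes."""
    return [dict(parse_attrs(entry)) for _index, entry in parse_attrs(payload)]
```

The outer attribute types act only as element indices here, so the sketch discards them and keeps the elements in wire order.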
|
|
Refactor the ynl code to encapsulate protocol specifics into
NetlinkProtocol and GenlProtocol.
Signed-off-by: Donald Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
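One way to picture the split described above is a base class that emits the plain netlink header and a subclass that adds the generic netlink header. Only the class names NetlinkProtocol and GenlProtocol come from the commit; the methods and their signatures below are assumptions made for illustration, not the actual ynl code. The header layouts follow struct nlmsghdr and struct genlmsghdr from the uAPI headers.

```python
import struct

NLMSG_HDRLEN = 16  # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid


class NetlinkProtocol:
    """Raw netlink: nlmsghdr is followed directly by the payload."""

    def protocol_header(self, cmd, version):
        # Hypothetical hook: raw netlink has no extra protocol header.
        return b""

    def message(self, nl_type, flags, seq, cmd=0, version=0, payload=b""):
        body = self.protocol_header(cmd, version) + payload
        header = struct.pack("=IHHII", NLMSG_HDRLEN + len(body),
                             nl_type, flags, seq, 0)
        return header + body


class GenlProtocol(NetlinkProtocol):
    """Generic netlink: a genlmsghdr sits between nlmsghdr and the payload."""

    def protocol_header(self, cmd, version):
        # struct genlmsghdr: u8 cmd, u8 version, u16 reserved
        return struct.pack("=BBH", cmd, version, 0)
```

Keeping the difference in a single overridable hook means the message-building code stays identical for both protocols.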
|
|
Move decode_fixed_header into YnlFamily and add a _fixed_header_size
method to allow extack decoding to skip the fixed header.
Signed-off-by: Donald Hunter <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add a SpecMcastGroup class to the nlspec lib.
Signed-off-by: Donald Hunter <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We use a tempfile for code generation, to avoid wiping the target
file out if the code generator crashes. File contents are copied
from tempfile to actual destination at the end of main().
uAPI generation is relatively simple so when generating the uAPI
header we return from main() early, and never reach the "copy code
over" stage. Since the commit under Fixes, uAPI headers are no longer
updated by ynl-gen.
Move the copy/commit of the code into CodeWriter, to make it
easier to call at any point in time. Hook it into the destructor
to make sure we don't miss calling it.
Fixes: f65f305ae008 ("tools: ynl-gen: use temporary file for rendering")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
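The pattern described above can be sketched as follows. This is a simplified illustration, not the ynl-gen code: the class name CodeWriter comes from the commit, while the method names, attributes, and the generate_uapi() caller are assumptions.

```python
import os
import shutil
import tempfile


class CodeWriter:
    """Sketch: render into a temp file, copy to the target when committed."""

    def __init__(self, out_path):
        self._out_path = out_path
        self._committed = False
        self._tmp = None
        self._tmp = tempfile.NamedTemporaryFile("w+", delete=False)

    def write(self, text):
        self._tmp.write(text)

    def close_out_file(self):
        """Copy the rendered output over the target; safe to call twice."""
        if self._committed or self._tmp is None:
            return
        self._committed = True
        self._tmp.flush()
        self._tmp.close()
        shutil.copyfile(self._tmp.name, self._out_path)
        os.unlink(self._tmp.name)

    def __del__(self):
        # Safety net so an early return in the caller still commits output.
        self.close_out_file()


def generate_uapi(out_path):
    """Hypothetical caller that returns early without an explicit commit."""
    writer = CodeWriter(out_path)
    writer.write("/* generated */\n")
    return  # the destructor is expected to perform the copy
```

Relying on the destructor assumes the object is collected promptly, which holds for CPython's reference counting; calling close_out_file() explicitly remains the more predictable option.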
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-08-25
We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).
The main changes are:
1) Add multi uprobe BPF links for attaching multiple uprobes
and usdt probes, which is significantly faster and saves extra fds,
from Jiri Olsa.
2) Add support for BPF cpu v4 instructions for arm64 JIT compiler,
from Xu Kuohai.
3) Add support for BPF cpu v4 instructions for riscv64 JIT compiler,
from Pu Lehui.
4) Fix LWT BPF xmit hooks wrt their return values where propagating
the result from skb_do_redirect() would trigger a use-after-free,
from Yan Zhai.
5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
where the map's value kptr type and locally allocated obj type
mismatch, from Yonghong Song.
6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
root/node which bypassed reg->off == 0 enforcement,
from Kumar Kartikeya Dwivedi.
7) Lift BPF verifier restriction in networking BPF programs to treat
comparison of packet pointers not as a pointer leak,
from Yafang Shao.
8) Remove unmaintained XDP BPF samples as they are maintained
in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.
9) Batch of fixes for the tracing programs from BPF samples in order
to make them more libbpf-aware, from Daniel T. Lee.
10) Fix a libbpf signedness determination bug in the CO-RE relocation
handling logic, from Andrii Nakryiko.
11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
fixes for bpf_refcount shared ownership implementation,
both from Dave Marchevsky.
12) Add a new bpf_object__unpin() API function to libbpf,
from Daniel Xu.
13) Fix a memory leak in libbpf to also free btf_vmlinux
when the bpf_object gets closed, from Hao Luo.
14) Small error output improvements to test_bpf module, from Helge Deller.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
selftests/bpf: Add tests for rbtree API interaction in sleepable progs
bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
bpf: Consider non-owning refs to refcounted nodes RCU protected
bpf: Reenable bpf_refcount_acquire
bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
bpf: Consider non-owning refs trusted
bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
selftests/bpf: Enable cpu v4 tests for RV64
riscv, bpf: Support unconditional bswap insn
riscv, bpf: Support signed div/mod insns
riscv, bpf: Support 32-bit offset jmp insn
riscv, bpf: Support sign-extension mov insns
riscv, bpf: Support sign-extension load insns
riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
samples/bpf: Cleanup .gitignore
samples/bpf: Remove the xdp_sample_pkts utility
samples/bpf: Remove the xdp1 and xdp2 utilities
samples/bpf: Remove the xdp_rxq_info utility
samples/bpf: Remove the xdp_redirect* utilities
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Merge devfreq changes and power management tools changes for 6.6-rc1:
- Fix memory leak in devfreq_dev_release() (Boris Brezillon).
- Rewrite devfreq_monitor_start() kerneldoc comment (Manivannan
Sadhasivam).
- Explicitly include correct DT includes in devfreq (Rob Herring).
- Add turbo-boost support to cpupower (Wyes Karny).
- Add support for amd_pstate mode change to cpupower (Wyes Karny).
- Fix 'cpupower idle_set' command to accept only numeric values of
arguments (Likhitha Korrapati).
* pm-devfreq:
PM / devfreq: Fix leak in devfreq_dev_release()
PM / devfreq: Reword the kernel-doc comment for devfreq_monitor_start() API
PM / devfreq: Explicitly include correct DT includes
* pm-tools:
cpupower: Fix cpuidle_set to accept only numeric values for idle-set operation.
cpupower: Add turbo-boost support in cpupower
cpupower: Add support for amd_pstate mode change
cpupower: Add EPP value change support
cpupower: Add is_valid_path API
cpupower: Recognise amd-pstate active mode driver
cpupower: Bump soname version
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
issues or aren't considered suitable for a -stable backport"
* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
shmem: fix smaps BUG sleeping while atomic
selftests: cachestat: catch failing fsync test on tmpfs
selftests: cachestat: test for cachestat availability
maple_tree: disable mas_wr_append() when other readers are possible
madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
mm: multi-gen LRU: don't spin during memcg release
mm: memory-failure: fix unexpected return value in soft_offline_page()
radix tree: remove unused variable
mm: add a call to flush_cache_vmap() in vmap_pfn()
selftests/mm: FOLL_LONGTERM need to be updated to 0x100
nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
selftests: cgroup: fix test_kmem_basic less than error
mm: enable page walking API to lock vmas during the walk
smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
|
|
Confirm that the following sleepable prog states fail verification:
* bpf_rcu_read_unlock before bpf_spin_unlock
* RCU CS will last at least as long as spin_lock CS
Also confirm that correct usage passes verification, specifically:
* Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
* Implied RCU CS due to spin_lock CS
None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.
Signed-off-by: Dave Marchevsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Now that all reported issues are fixed, bpf_refcount_acquire can be
turned back on. Also reenable all bpf_refcount-related tests which were
disabled.
This is a revert of:
* commit f3514a5d6740 ("selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled")
* commit 7deca5eae833 ("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed")
Signed-off-by: Dave Marchevsky <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Explicitly set the exception vector to #UD when potentially injecting an
exception in sync_regs_test's subtests that try to detect TOCTOU bugs
in KVM's handling of exceptions injected by userspace. A side effect of
the original KVM bug was that KVM would clear the vector, but relying on
KVM to clear the vector (i.e. make it #DE) makes it less likely that the
test would ever find *new* KVM bugs, e.g. because only the first iteration
would run with a legal vector to start.
Explicitly inject #UD for race_events_inj_pen() as well, e.g. so that it
doesn't inherit the illegal 255 vector from race_events_exc(), which
currently runs first.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Reload known good vCPU state if the vCPU triple faults in any of the
race_sync_regs() subtests, e.g. if KVM successfully injects an exception
(the vCPU isn't configured to handle exceptions). On Intel, the VMCS
is preserved even after shutdown, but AMD's APM states that the VMCB is
undefined after a shutdown and so KVM synthesizes an INIT to sanitize
vCPU/VMCB state, e.g. to guard against running with a garbage VMCB.
The synthetic INIT results in the vCPU never exiting to userspace, as it
gets put into Real Mode at the reset vector, which is full of zeros (as is
GPA 0 and beyond), and so executes ADD for a very, very long time.
Fixes: 60c4063b4752 ("KVM: selftests: Extend x86's sync_regs_test to check for event vector races")
Cc: Michal Luczaj <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
* for-next/selftests: (22 commits)
kselftest/arm64: Fix hwcaps selftest build
kselftest/arm64: add jscvt feature to hwcap test
kselftest/arm64: add pmull feature to hwcap test
kselftest/arm64: add AES feature check to hwcap test
kselftest/arm64: add SHA1 and related features to hwcap test
kselftest/arm64: build BTI tests in output directory
kselftest/arm64: fix a memleak in zt_regs_run()
kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
kselftest/arm64: add lse and lse2 features to hwcap test
kselftest/arm64: add test item that support to capturing the SIGBUS signal
kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers
kselftest/arm64: add crc32 feature to hwcap test
kselftest/arm64: add float-point feature to hwcap test
kselftest/arm64: Use the tools/include compiler.h rather than our own
kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
kselftest/arm64: Make the tools/include headers available
tools include: Add some common function attributes
tools compiler.h: Add OPTIMIZER_HIDE_VAR()
kselftest/arm64: Exit streaming mode after collecting signal context
kselftest/arm64: add RCpc load-acquire to hwcap test
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix ring buffer being permanently disabled due to missed
record_disabled()
Changing the trace cpu mask will disable the ring buffers for the
CPUs no longer in the mask. But it fails to update the snapshot
buffer. If a snapshot takes place, the accounting for the ring buffer
being disabled is corrupted and this can lead to the ring buffer
being permanently disabled.
- Add test case for snapshot and cpu mask working together
- Fix memleak by the function graph tracer not getting closed properly.
The iterator is used to read the ring buffer. When it opens, it calls
the open function of a tracer, and when it is closed, it calls the
close iteration. While a trace is being read, it is still possible to
change the tracer.
If this happens between the function graph tracer and the wakeup
tracer (which uses function graph tracing), the tracers are not
closed properly when the iterator sees the switch, and the
wakeup function did not initialize its private pointer to NULL, which
is used to know if the function graph tracer was the last tracer. It
could be fooled into thinking it is, but then on exit it does not call
the close function of the function graph tracer to clean up its data.
- Fix synthetic events on big endian machines, by introducing a union
that does the conversions properly.
- Fix synthetic events from printing out the number of elements in the
stacktrace when it shouldn't.
- Fix synthetic events stacktrace to not print a bogus value at the
end.
- Introduce a pipe_cpumask that prevents the trace_pipe files from
being opened by more than one task (file descriptor).
There was a race found where if splice is called, the iter->ent could
become stale and events could be missed. There's no point reading a
producer/consumer file by more than one task as they will corrupt
each other anyway. Add a cpumask that keeps track of the per_cpu
trace_pipe files as well as the global trace_pipe file that prevents
more than one open of a trace_pipe file that represents the same ring
buffer. This prevents the race from happening.
- Fix ftrace samples for arm64 to work with older compilers.
* tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
samples: ftrace: Replace bti assembly with hint for older compiler
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
tracing: Fix memleak due to race between current_tracer and trace
tracing/synthetic: Allocate one additional element for size
tracing/synthetic: Skip first entry for stack traces
tracing/synthetic: Use union instead of casts
selftests/ftrace: Add a basic testcase for snapshot
tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
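The pipe_cpumask idea from the pull message above can be modelled roughly as below. This is a user-space toy model of the bookkeeping only, under the assumption that the global trace_pipe claims every CPU while a per-CPU trace_pipe claims one bit; the class and method names are hypothetical and the kernel implementation is C.

```python
class PipeCpumask:
    """Toy model: track which ring buffers already have a trace_pipe reader."""

    def __init__(self, nr_cpus):
        self.nr_cpus = nr_cpus
        self.mask = 0  # bit N set means CPU N's pipe is already open

    def open_pipe(self, cpu=None):
        """Return True if the open is allowed, False if it would be busy."""
        if cpu is None:
            # Global trace_pipe: only allowed when no pipe at all is open.
            if self.mask != 0:
                return False
            self.mask = (1 << self.nr_cpus) - 1
            return True
        bit = 1 << cpu
        if self.mask & bit:
            return False
        self.mask |= bit
        return True

    def close_pipe(self, cpu=None):
        """Release the claim taken by the matching open_pipe() call."""
        if cpu is None:
            self.mask = 0
        else:
            self.mask &= ~(1 << cpu)
```

In this model a second reader of the same buffer is refused up front, which is the property the pull message relies on to avoid the stale iter->ent race.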
|
|
Differentiate between empty list and None for member lists.
New families may want to create request responses with no attribute.
If we treat those the same as None we end up rendering
a full parsing policy in user space, instead of an empty one.
Reviewed-by: Donald Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
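The distinction made in the commit above comes down to testing for None explicitly rather than for truthiness. A minimal sketch, with a hypothetical function name and argument names that do not come from ynl-gen:

```python
def policy_members(member_list, all_attrs):
    """Pick the attributes a parsing policy should accept.

    None means the spec gave no member list, so fall back to every
    attribute. An empty list means the spec explicitly listed no
    attributes, so the policy must stay empty.
    """
    if member_list is None:
        return list(all_attrs)
    return list(member_list)


def policy_members_truthiness(member_list, all_attrs):
    """The conflating variant: [] and None both yield the full policy."""
    if not member_list:
        return list(all_attrs)
    return list(member_list)
```

With all_attrs = ["id", "name"], the first function returns an empty list for member_list=[] while the truthiness variant returns both attributes.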
|
|
We look for attributes inside do.request, but there's another
layer of nesting in the spec, look inside do.request.attributes.
This bug had no effect as all global policies we generate (fou)
seem to be full, anyway, and we treat full and empty the same.
Next patch will change the treatment of empty policies.
Reviewed-by: Donald Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Remember to set the length field in the request setters.
Reviewed-by: Donald Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Recent changes made us assume that input for binary data is in hex.
When using YNL as a Python library it's possible to pass in raw bytes.
Bring the ability to do that back.
Reviewed-by: Donald Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
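A small sketch of the behaviour described above: accept raw bytes as-is and treat strings as hex. The helper name binary_payload() is hypothetical, and the real YNL code may differ in how it handles other input types.

```python
def binary_payload(value):
    """Return the bytes to place in a binary attribute."""
    if isinstance(value, bytes):
        # Library callers can pass raw bytes directly.
        return value
    if isinstance(value, str):
        # Command-line style input is interpreted as a hex string.
        return bytes.fromhex(value)
    raise TypeError(f"unsupported binary input type: {type(value).__name__}")
```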
|
|
Remove comparing pointer to 0 to avoid this warning from coccinelle:
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0, suggest !E
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Anh Tuan Phan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Currently, not all kernel memory usage is being accounted for. This
commit switches to using the kernel entry within memory.stat which
already includes kernel_stack, pagetables, and slab. The kernel entry
also includes vmalloc and other additional kernel memory use cases which
were missing.
Link: https://lkml.kernel.org/r/bvrhe2tpsts2azaroq4ubp2slawmop6orndsswrewuscw3ugvk@kmemmrttsnc7
Signed-off-by: Lucas Karpinski <[email protected]>
Acked-by: Shakeel Butt <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Zefan Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
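To illustrate why the single kernel entry is the more complete figure, the sketch below parses memory.stat-style text and compares the kernel entry with the sum of the individually listed items. The sample numbers are invented for illustration and are not measurements; the function name is hypothetical.

```python
def parse_memory_stat(text):
    """Parse 'key value' lines as found in a cgroup v2 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


# Made-up sample values, for illustration only.
SAMPLE = """\
kernel 1200
kernel_stack 100
pagetables 200
slab 600
vmalloc 250
"""

stats = parse_memory_stat(SAMPLE)
itemized = stats["kernel_stack"] + stats["pagetables"] + stats["slab"]
# 'kernel' also covers vmalloc and other kernel memory uses.
unaccounted = stats["kernel"] - itemized
```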
|
|
Patch series "New page table range API", v6.
This patchset changes the API used by the MM to set up page table entries.
The four APIs are:
set_ptes(mm, addr, ptep, pte, nr)
update_mmu_cache_range(vma, addr, ptep, nr)
flush_dcache_folio(folio)
flush_icache_pages(vma, page, nr)
flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them. The old APIs remain around
but are mostly implemented by calling the new interfaces.
The new APIs are based around setting up N page table entries at once.
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you.
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.
One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking. This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.
The point of all this is better performance, and Fengwei Yin has measured
improvement on x86. I suspect you'll see improvement on your architecture
too. Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/[email protected]/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.
This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.
This patch (of 38):
Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type. Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
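The subtraction-plus-comparison check described above can be modelled as follows. Python integers do not wrap, so this sketch masks to 32 bits to imitate unsigned C arithmetic; the function names are hypothetical and this is an illustration of the technique, not the kernel's macro.

```python
MASK32 = 0xFFFFFFFF  # imitate u32 wraparound


def in_range_two_compares(val, start, length):
    """Two comparisons and an AND, with no wraparound of start + length."""
    return start <= val and val < start + length


def in_range_u32(val, start, length):
    """One subtraction and one comparison on 32-bit unsigned values."""
    return ((val - start) & MASK32) < length


if __name__ == "__main__":
    # Ordinary range [3, 7): both forms agree.
    print(in_range_u32(5, 3, 4), in_range_two_compares(5, 3, 4))
    # Range that runs past the maximum u32 value and wraps to small numbers.
    print(in_range_u32(5, 0xFFFFFFF0, 0x20))
```

When val is below start, the masked subtraction yields a very large number, so the single comparison rejects it without a separate lower-bound test.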
|
|
The cachestat kselftest runs a test on a normal file, which is created
temporarily in the current directory. Among the tests it runs there is a
call to fsync(), which is expected to clean all dirty pages used by the
file.
However the tmpfs filesystem implements fsync() as noop_fsync(), so the
call will not even attempt to clean anything when this test file happens
to live on a tmpfs instance. This happens in an initramfs, or when the
current directory is in /dev/shm or sometimes /tmp.
To avoid this test failing wrongly, use statfs() to check which filesystem
the test file lives on. If that is "tmpfs", we skip the fsync() test.
Since the fsync test is only one part of the "normal file" test, we now
execute this twice, skipping the fsync part on the first call. This way
only the second test, including the fsync part, would be skipped.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andre Przywara <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Nhat Pham <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "selftests: cachestat: fix run on older kernels", v2.
I ran all kernel selftests on some test machine, and stumbled upon
cachestat failing (among others). These patches fix the run on older
kernels and when the current directory is on a tmpfs instance.
This patch (of 2):
As cachestat is a new syscall, it won't be available on older kernels, for
instance those running on a development machine. At the moment the test
reports all tests as "not ok" in this case.
Test for the cachestat syscall availability first, before doing further
tests, and bail out early with a TAP SKIP comment.
This also uses the opportunity to add the proper TAP headers, and add one
check for proper error handling (illegal file descriptor).
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andre Przywara <[email protected]>
Acked-by: Nhat Pham <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
include/net/inet_sock.h
f866fbc842de ("ipv4: fix data-races around inet->inet_id")
c274af224269 ("inet: introduce inet->inet_flags")
https://lore.kernel.org/all/[email protected]/
Adjacent changes:
drivers/net/bonding/bond_alb.c
e74216b8def3 ("bonding: fix macvlan over alb bond support")
f11e5bd159b0 ("bonding: support balance-alb with openvswitch")
drivers/net/ethernet/broadcom/bgmac.c
d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()")
23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()")
drivers/net/ethernet/broadcom/genet/bcmmii.c
32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()")
acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()")
net/sctp/socket.c
f866fbc842de ("ipv4: fix data-races around inet->inet_id")
b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Enable cpu v4 tests for RV64, and the relevant tests have passed.
Signed-off-by: Pu Lehui <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from wifi, can and netfilter.
Fixes to fixes:
- nf_tables:
- GC transaction race with abort path
- defer gc run if previous batch is still pending
Previous releases - regressions:
- ipv4: fix data-races around inet->inet_id
- phy: fix deadlocking in phy_error() invocation
- mdio: fix C45 read/write protocol
- ipvlan: fix a reference count leak warning in ipvlan_ns_exit()
- ice: fix NULL pointer deref during VF reset
- i40e: fix potential NULL pointer dereferencing of pf->vf in
i40e_sync_vsi_filters()
- tg3: use slab_build_skb() when needed
- mtk_eth_soc: fix NULL pointer on hw reset
Previous releases - always broken:
- core: validate veth and vxcan peer ifindexes
- sched: fix a qdisc modification with ambiguous command request
- devlink: add missing unregister linecard notification
- wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning
- batman:
- do not get eth header before batadv_check_management_packet
- fix batadv_v_ogm_aggr_send memory leak
- bonding: fix macvlan over alb bond support
- mlxsw: set time stamp fields also when its type is MIRROR_UTC"
* tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
selftests: bonding: add macvlan over bond testing
selftest: bond: add new topo bond_topo_2d1c.sh
bonding: fix macvlan over alb bond support
rtnetlink: Reject negative ifindexes in RTM_NEWLINK
netfilter: nf_tables: defer gc run if previous batch is still pending
netfilter: nf_tables: fix out of memory error handling
netfilter: nf_tables: use correct lock to protect gc_list
netfilter: nf_tables: GC transaction race with abort path
netfilter: nf_tables: flush pending destroy work before netlink notifier
netfilter: nf_tables: validate all pending tables
ibmveth: Use dcbf rather than dcbfl
i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
net/sched: fix a qdisc modification with ambiguous command request
igc: Fix the typo in the PTM Control macro
batman-adv: Hold rtnl lock during MTU update via netlink
igb: Avoid starting unnecessary workqueues
can: raw: add missing refcount for memory leak fix
can: isotp: fix support for transmission of SF without flow control
bnx2x: new flag for track HW resource allocation
sfc: allocate a big enough SKB for loopback selftest packet
...
|
|
Add a local kptr test with no special fields in the struct. Without the
previous patch, the following warning is triggered:
[ 44.683877] WARNING: CPU: 3 PID: 485 at kernel/bpf/syscall.c:660 bpf_obj_free_fields+0x220/0x240
[ 44.684640] Modules linked in: bpf_testmod(OE)
[ 44.685044] CPU: 3 PID: 485 Comm: kworker/u8:5 Tainted: G OE 6.5.0-rc5-01703-g260d855e9b90 #248
[ 44.685827] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 44.686693] Workqueue: events_unbound bpf_map_free_deferred
[ 44.687297] RIP: 0010:bpf_obj_free_fields+0x220/0x240
[ 44.687775] Code: e8 55 17 1f 00 49 8b 74 24 08 4c 89 ef e8 e8 14 05 00 e8 a3 da e2 ff e9 55 fe ff ff 0f 0b e9 4e fe ff ff 0f 0b e9 47 fe ff ff <0f> 0b e8 d9 d9 e2 ff 31 f6 eb d5 48 83 c4 10 5b 41 5c e
[ 44.689353] RSP: 0018:ffff888106467cb8 EFLAGS: 00010246
[ 44.689806] RAX: 0000000000000000 RBX: ffff888112b3a200 RCX: 0000000000000001
[ 44.690433] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8881128ad988
[ 44.691094] RBP: 0000000000000002 R08: ffffffff81370bd0 R09: 1ffff110216231a5
[ 44.691643] R10: dffffc0000000000 R11: ffffed10216231a6 R12: ffff88810d68a488
[ 44.692245] R13: ffff88810767c288 R14: ffff88810d68a400 R15: ffff88810d68a418
[ 44.692829] FS: 0000000000000000(0000) GS:ffff8881f7580000(0000) knlGS:0000000000000000
[ 44.693484] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 44.693964] CR2: 000055c7f2afce28 CR3: 000000010fee4002 CR4: 0000000000370ee0
[ 44.694513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 44.695102] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 44.695747] Call Trace:
[ 44.696001] <TASK>
[ 44.696183] ? __warn+0xfe/0x270
[ 44.696447] ? bpf_obj_free_fields+0x220/0x240
[ 44.696817] ? report_bug+0x220/0x2d0
[ 44.697180] ? handle_bug+0x3d/0x70
[ 44.697507] ? exc_invalid_op+0x1a/0x50
[ 44.697887] ? asm_exc_invalid_op+0x1a/0x20
[ 44.698282] ? btf_find_struct_meta+0xd0/0xd0
[ 44.698634] ? bpf_obj_free_fields+0x220/0x240
[ 44.699027] ? bpf_obj_free_fields+0x1e2/0x240
[ 44.699414] array_map_free+0x1a3/0x260
[ 44.699763] bpf_map_free_deferred+0x7b/0xe0
[ 44.700154] process_one_work+0x46d/0x750
[ 44.700523] worker_thread+0x49e/0x900
[ 44.700892] ? pr_cont_work+0x270/0x270
[ 44.701224] kthread+0x1ae/0x1d0
[ 44.701516] ? kthread_blkcg+0x50/0x50
[ 44.701860] ret_from_fork+0x34/0x50
[ 44.702178] ? kthread_blkcg+0x50/0x50
[ 44.702508] ret_from_fork_asm+0x11/0x20
[ 44.702880] </TASK>
With the previous patch, there are no warnings.
Signed-off-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Test basic locking on two fds: multiple read locks, conflicts between
read and write locks, and use of len==0 for queries.
Also test the F_UNLCK extension of F_OFD_GETLK.
[ jlayton: fix unlink() pathname in selftest ]
Cc: Jeff Layton <[email protected]>
Cc: Chuck Lever <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stas Sergeev <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
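The first two cases named above (coexisting read locks, and a write
lock refused because of a read lock held through another descriptor)
can be sketched as follows. This is a minimal illustration, not the
selftest itself: ofd_lock() and ofd_lock_demo() are hypothetical names,
the temporary path is an assumption, and the F_UNLCK query extension is
not exercised because it needs a kernel that carries it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Fallbacks for libc headers that hide these; values from the Linux UAPI. */
#ifndef F_OFD_GETLK
#define F_OFD_GETLK 36
#define F_OFD_SETLK 37
#endif

/* Apply a whole-file OFD lock of the given type through fd. */
static int ofd_lock(int fd, short type)
{
    struct flock fl = {
        .l_type = type,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0, /* 0 means "up to end of file" */
        .l_pid = 0, /* must be 0 for OFD commands */
    };

    return fcntl(fd, F_OFD_SETLK, &fl);
}

/*
 * Hypothetical helper: returns 0 when two read locks coexist and a
 * write lock through the second descriptor is refused, -1 otherwise.
 */
int ofd_lock_demo(void)
{
    char path[] = "/tmp/ofd_demo_XXXXXX";
    struct flock query = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int fd1, fd2, ret = -1;

    fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;
    /* A second open() yields a separate open file description. */
    fd2 = open(path, O_RDWR);
    if (fd2 < 0)
        goto out_fd1;

    if (ofd_lock(fd1, F_RDLCK) || ofd_lock(fd2, F_RDLCK))
        goto out; /* read locks should not conflict */
    if (ofd_lock(fd2, F_WRLCK) == 0 || (errno != EAGAIN && errno != EACCES))
        goto out; /* the write lock must be refused */
    /* Ask which lock would block a write lock taken through fd2. */
    if (fcntl(fd2, F_OFD_GETLK, &query) || query.l_type != F_RDLCK)
        goto out;
    ret = 0;
out:
    close(fd2);
out_fd1:
    close(fd1);
    unlink(path);
    return ret;
}
```

Calling ofd_lock_demo() from a test harness and checking for 0 is
enough to confirm the behaviour on a filesystem that supports OFD locks.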
|
|
These don't have any particularly good reason to belong in lppaca.h,
so move them into their own header.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Add a macvlan over bonding test with mode active-backup, balance-tlb
and balance-alb.
]# ./bond_macvlan.sh
TEST: active-backup: IPv4: client->server [ OK ]
TEST: active-backup: IPv6: client->server [ OK ]
TEST: active-backup: IPv4: client->macvlan_1 [ OK ]
TEST: active-backup: IPv6: client->macvlan_1 [ OK ]
TEST: active-backup: IPv4: client->macvlan_2 [ OK ]
TEST: active-backup: IPv6: client->macvlan_2 [ OK ]
TEST: active-backup: IPv4: macvlan_1->macvlan_2 [ OK ]
TEST: active-backup: IPv6: macvlan_1->macvlan_2 [ OK ]
TEST: active-backup: IPv4: server->client [ OK ]
TEST: active-backup: IPv6: server->client [ OK ]
TEST: active-backup: IPv4: macvlan_1->client [ OK ]
TEST: active-backup: IPv6: macvlan_1->client [ OK ]
TEST: active-backup: IPv4: macvlan_2->client [ OK ]
TEST: active-backup: IPv6: macvlan_2->client [ OK ]
TEST: active-backup: IPv4: macvlan_2->macvlan_2 [ OK ]
TEST: active-backup: IPv6: macvlan_2->macvlan_2 [ OK ]
[...]
TEST: balance-alb: IPv4: client->server [ OK ]
TEST: balance-alb: IPv6: client->server [ OK ]
TEST: balance-alb: IPv4: client->macvlan_1 [ OK ]
TEST: balance-alb: IPv6: client->macvlan_1 [ OK ]
TEST: balance-alb: IPv4: client->macvlan_2 [ OK ]
TEST: balance-alb: IPv6: client->macvlan_2 [ OK ]
TEST: balance-alb: IPv4: macvlan_1->macvlan_2 [ OK ]
TEST: balance-alb: IPv6: macvlan_1->macvlan_2 [ OK ]
TEST: balance-alb: IPv4: server->client [ OK ]
TEST: balance-alb: IPv6: server->client [ OK ]
TEST: balance-alb: IPv4: macvlan_1->client [ OK ]
TEST: balance-alb: IPv6: macvlan_1->client [ OK ]
TEST: balance-alb: IPv4: macvlan_2->client [ OK ]
TEST: balance-alb: IPv6: macvlan_2->client [ OK ]
TEST: balance-alb: IPv4: macvlan_2->macvlan_2 [ OK ]
TEST: balance-alb: IPv6: macvlan_2->macvlan_2 [ OK ]
Signed-off-by: Hangbin Liu <[email protected]>
Acked-by: Jay Vosburgh <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Add a new test topology, bond_topo_2d1c.sh, which is the more commonly
used layout. Make bond_topo_3d1c.sh just source bond_topo_2d1c.sh and
add the extra link.
Signed-off-by: Hangbin Liu <[email protected]>
Acked-by: Jay Vosburgh <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Back-merge the 6.5-devel branch so that patches for 6.6 apply cleanly
and to resolve merge conflicts.
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we
need to check that first before inferring signedness.
Closes: https://github.com/libbpf/libbpf/issues/704
Reported-by: Lorenz Bauer <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin KaFai Lau <[email protected]>
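The ordering of the two checks can be sketched as below. This is not
the libbpf code: the sk_/SK_ names are local, hypothetical mirrors of
the BTF UAPI definitions, kept separate so the example builds without
libbpf headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirrors of the BTF UAPI constants, renamed to mark them as a sketch. */
#define SK_BTF_KIND_INT         1
#define SK_BTF_KIND_ENUM        6
#define SK_BTF_INT_SIGNED       (1U << 0)
#define SK_BTF_INFO_KIND(info)  (((info) >> 24) & 0x1f)
#define SK_BTF_INT_ENCODING(v)  (((v) & 0x0f000000) >> 24)

struct sk_btf_type {
    uint32_t name_off;
    uint32_t info;
    uint32_t size;
};

/* An INT type is followed by one 32-bit word that carries its encoding. */
struct sk_btf_int {
    struct sk_btf_type hdr;
    uint32_t int_data;
};

/* Report signedness only after confirming the type really is an INT. */
bool sk_int_is_signed(const struct sk_btf_type *t)
{
    const struct sk_btf_int *it;

    if (SK_BTF_INFO_KIND(t->info) != SK_BTF_KIND_INT)
        return false; /* trailing data means something else here */

    it = (const struct sk_btf_int *)t;
    return SK_BTF_INT_ENCODING(it->int_data) & SK_BTF_INT_SIGNED;
}
```

Without the kind check, whatever word happens to follow a non-INT type
would be decoded as if it were an integer encoding.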
|
|
It seems the uprobe_multi binary was never added to .gitignore.
Fix this trivial omission.
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin KaFai Lau <[email protected]>
|
|
For bpf_object__pin_programs() there is bpf_object__unpin_programs().
Likewise bpf_object__unpin_maps() for bpf_object__pin_maps().
But there is no bpf_object__unpin() for bpf_object__pin(). Adding it
makes the API symmetric.
It's also convenient for cleanup in application code. It's an API I
would've used, had it been available, for a repro I was writing earlier.
Signed-off-by: Daniel Xu <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Reviewed-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz
|
|
Add tests that enforce mmap hint address behavior. mmap should default
to sv48. mmap returns an address from the largest address space that
fits within the hint address; if the hint address is nonzero and below
the sv39 limit, it returns a sv39 address.
These tests are split into two files: mmap_default.c and mmap_bottomup.c
because a new process must be exec'd in order to change the mmap layout.
The run_mmap.sh script sets the stack to be unlimited for the
mmap_bottomup.c test which triggers a bottomup layout.
Signed-off-by: Charlie Jenkins <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
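A hint-based mapping of the kind these tests exercise can be sketched
as follows. map_with_hint() is a hypothetical helper, not taken from
mmap_default.c or mmap_bottomup.c. Where the mapping actually lands is
decided by the kernel and the architecture, so the sketch makes no
claim about the returned address beyond success or failure.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map a small anonymous region, passing hint as the address hint.
 * Without MAP_FIXED the kernel is free to place the mapping elsewhere.
 */
void *map_with_hint(void *hint, size_t len)
{
    return mmap(hint, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

A test in this style would call the helper with hints below and above
the boundary of interest and compare the returned address against it.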
|
|
- Without prev commit
$ tools/testing/selftests/bpf/test_progs --name=tc_bpf
#232/1 tc_bpf/tc_bpf_root:OK
test_tc_bpf_non_root:PASS:set_cap_bpf_cap_net_admin 0 nsec
test_tc_bpf_non_root:PASS:disable_cap_sys_admin 0 nsec
0: R1=ctx(off=0,imm=0) R10=fp0
; if ((long)(iph + 1) > (long)skb->data_end)
0: (61) r2 = *(u32 *)(r1 +80) ; R1=ctx(off=0,imm=0) R2_w=pkt_end(off=0,imm=0)
; struct iphdr *iph = (void *)(long)skb->data + sizeof(struct ethhdr);
1: (61) r1 = *(u32 *)(r1 +76) ; R1_w=pkt(off=0,r=0,imm=0)
; if ((long)(iph + 1) > (long)skb->data_end)
2: (07) r1 += 34 ; R1_w=pkt(off=34,r=0,imm=0)
3: (b4) w0 = 1 ; R0_w=1
4: (2d) if r1 > r2 goto pc+1
R2 pointer comparison prohibited
processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
test_tc_bpf_non_root:FAIL:test_tc_bpf__open_and_load unexpected error: -13
#233/2 tc_bpf_non_root:FAIL
- With prev commit
$ tools/testing/selftests/bpf/test_progs --name=tc_bpf
#232/1 tc_bpf/tc_bpf_root:OK
#232/2 tc_bpf/tc_bpf_non_root:OK
#232 tc_bpf:OK
Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Yafang Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Having __sysret() as an inline function has the unfortunate effect of
adding casts and large constant comparisons after the syscall returns,
which significantly inflates light code that is otherwise
syscall-heavy. Even nolibc-test grew by ~1%.
Let's switch back to a macro for this, and use it only with signed
arguments. Note that it is also possible to design a slightly more
complex macro covering unsigned values and pointers, but we only have 3
such syscalls so it is pointless, and these were just changed not to
use this macro anymore. Now for the argument (the local variable
containing the syscall return value), any negative value is an error:
-1 is returned and errno is set to the negated value.
This may be revisited again in the future if really needed but for now
let's get back to something sane.
Fixes: 428905da6ec4 ("tools/nolibc: sys.h: add a syscall return helper")
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Cc: Zhangjin Wu <[email protected]>
Cc: David Laight <[email protected]>
Cc: Thomas Weißschuh <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
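The rule described above (negative means error, return -1, store the
negated value in errno) can be sketched with a GNU C statement
expression. This is an illustration only: sketch_sysret() and
wrap_raw_return() are hypothetical names, the sketch uses the libc
errno rather than nolibc's own helper, and it assumes GCC or Clang.

```c
#include <errno.h>

/*
 * Sketch of a syscall-return macro for signed values only.
 * Any negative input is treated as -errno.
 */
#define sketch_sysret(arg)                                          \
({                                                                  \
    __typeof__(arg) ret_ = (arg);                                   \
    if (ret_ < 0) {                                                 \
        errno = -ret_; /* errno receives the positive code */       \
        ret_ = -1;     /* callers only ever see -1 on failure */    \
    }                                                               \
    ret_;                                                           \
})

/* Example wrapper: raw stands in for a raw syscall return value. */
long wrap_raw_return(long raw)
{
    return sketch_sysret(raw);
}
```

Because the comparison is a plain signed `< 0`, no cast or large
constant is involved, which is the size concern the message raises.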
|
|
The __sysret() function causes some undesirable casts so we'll revert
it. In order to keep it simple it will now only support integer return
values like in the past, so we must basically revert the changes that
were made to these 3 syscalls which return a pointer so that they
simply rely on their own test and the SET_ERRNO() macro.
Fixes: 4201cfce15fe ("tools/nolibc: clean up sbrk() routine")
Fixes: 924e9539aeaa ("tools/nolibc: clean up mmap() routine")
Fixes: d27447bc2e0a ("tools/nolibc: sys.h: apply __sysret() helper")
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Cc: Zhangjin Wu <[email protected]>
Cc: David Laight <[email protected]>
Cc: Thomas Weißschuh <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
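For pointer-returning syscalls a signed test does not work, because a
valid address may have its top bit set. A dedicated check might look
like the sketch below. decode_ptr_return() and SKETCH_FAILED are
hypothetical names, and the 4095 bound is an assumption that follows
the usual Linux convention for the largest errno value.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical failure marker, playing the role of MAP_FAILED. */
#define SKETCH_FAILED ((void *)-1)

/*
 * Decode a raw pointer-returning syscall result. Only the top 4095
 * values of the address range are treated as negated error codes;
 * everything else is returned unchanged as a pointer.
 */
void *decode_ptr_return(uintptr_t raw)
{
    if (raw >= (uintptr_t)-4095) {
        errno = (int)-(intptr_t)raw;
        return SKETCH_FAILED;
    }
    return (void *)raw;
}
```

A value such as (uintptr_t)-12 is reported as a failure with errno 12,
while a high but valid address passes through untouched.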
|
|
Silence the following warnings, reported by the new -Wall -Wextra
options, by using pure assembly code.
In file included from sysroot/powerpc/include/stdio.h:13,
from nolibc-test.c:13:
sysroot/powerpc/include/arch.h: In function '_start':
sysroot/powerpc/include/arch.h:192:32: warning: unused variable 'r2' [-Wunused-variable]
192 | register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
| ^~
sysroot/powerpc/include/arch.h:187:97: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
187 | void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
| ^~~~~~
Only the elfv2 ABI requires saving the TOC/GOT pointer in the r2
register. With the elfv1 ABI the compiler simply ignored the old C
code, but it cannot ignore inline assembly code, which would cause
build failures or segfaults at run time. So add the new assembly code
only for the elfv2 ABI, by checking _CALL_ELF == 2.
Link: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
Link: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Signed-off-by: Zhangjin Wu <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
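The compile-time guard mentioned above can be sketched like this. It
shows only the preprocessor check, not the startup assembly, and
ppc_elf_abi_version() is a hypothetical helper. Treating a 64-bit
PowerPC build without _CALL_ELF == 2 as elfv1 is an assumption made for
the illustration.

```c
/*
 * Report which PowerPC ELF ABI the compiler targets:
 * 2 for elfv2, 1 for an assumed elfv1, 0 on any other target.
 */
int ppc_elf_abi_version(void)
{
#if defined(_CALL_ELF) && _CALL_ELF == 2
    return 2; /* elfv2: code that sets up r2 would be guarded here */
#elif defined(__powerpc64__)
    return 1; /* 64-bit PowerPC without elfv2, assumed elfv1 */
#else
    return 0; /* not a 64-bit PowerPC build */
#endif
}
```

Checking defined(_CALL_ELF) first keeps the test well-formed on
compilers that do not define the macro at all.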
|