| Age | Commit message (Collapse) | Author | Files | Lines |
|
Now that cpumask types are split out to a separate smaller header, many
frequently included core headers may switch to using it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Anna-Maria Behnsen <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Yury Norov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Many core headers include cpumask.h for nothing. Drop it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Anna-Maria Behnsen <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Yury Norov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
sched.h needs cpumask.h mostly for types declaration. Now that we have
cpumask_types.h, which is a significantly smaller header, we can rely on
it.
The only exception is UP stub for set_cpus_allowed_ptr(). The function
needs to test bit #0 in a @new_mask, which can be trivially opencoded.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Anna-Maria Behnsen <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Yury Norov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Many core headers, like sched.h, include cpumask.h mostly for struct
cpumask and cpumask_var_t. Those are frequently used headers and
shouldn't pull more than the bare minimum.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Anna-Maria Behnsen <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Viresh Kumar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
<linux/sched.h> indirectly via cpumask.h path includes the ilog2.h header
to calculate ilog2(TASK_REPORT_MAX). The following patches drops sched.h
dependency on cpumask.h, and to have a successful build, the header has to
be included explicitly.
sched.h is a frequently included header, and it's better to keep the
dependency list as small as possible. So, instead of including ilog2.h
for a single BUILD_BUG_ON() check, the same check may be implemented by
taking exponent of the other part of equation.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Anna-Maria Behnsen <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Yury Norov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Update min_heap_push() to use min_heap_sift_up() rather than its origin
inline version.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
After adding min_heap_sift_up(), the naming convention has been adjusted
to maintain consistency with the min_heap_sift_up(). Consequently,
min_heapify() has been renamed to min_heap_sift_down().
Link: https://lkml.kernel.org/CAP-5=fVcBAxt8Mw72=NCJPRJfjDaJcqk4rjbadgouAEAHz_q1A@mail.gmail.com
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Modify the min_heap_push() and min_heap_pop() to return a boolean value.
They now return false when the operation fails and true when it succeeds.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add min_heap_del() to delete the element at index 'idx' in the heap.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add min_heap_sift_up() to sift up the element at index 'idx' in the
heap.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add a third parameter 'args' for the 'less' and 'swp' functions in the
'struct min_heap_callbacks'. This additional parameter allows these
comparison and swap functions to handle extra arguments when necessary.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add min_heap_full() which returns a boolean value indicating whether the
heap has reached its maximum capacity.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add min_heap_peek() to retrieve a pointer to the smallest element. The
pointer is cast to the appropriate type of heap elements.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add min_heap_init() for initializing heap with data, nr, and size.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Implement a type-safe interface for min_heap using strong type pointers
instead of void * in the data field. This change includes adding small
macro wrappers around functions, enabling the use of __minheap_cast and
__minheap_obj_size macros for type casting and obtaining element size.
This implementation removes the necessity of passing element size in
min_heap_callbacks. Additionally, introduce the MIN_HEAP_PREALLOCATED
macro for preallocating some elements.
Link: https://lkml.kernel.org/ioyfizrzq7w7mjrqcadtzsfgpuntowtjdw5pgn4qhvsdp4mqqg@nrlek5vmisbu
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Wei Chiu <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Ching-Chun (Jim) Huang <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Sakai <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Drop one '-' to adhere to coding style.
Replace 'arbitray' with 'arbitrary'.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei-Hsin Yeh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
THP-sized allocations") no longer differentiates the migration type of
pages in THP-sized PCP list, it's possible that non-movable allocation
requests may get a CMA page from the list, in some cases, it's not
acceptable.
If a large number of CMA memory are configured in system (for example, the
CMA memory accounts for 50% of the system memory), starting a virtual
machine with device passthrough will get stuck. During starting the
virtual machine, it will call pin_user_pages_remote(..., FOLL_LONGTERM,
...) to pin memory. Normally if a page is present and in CMA area,
pin_user_pages_remote() will migrate the page from CMA area to non-CMA
area because of FOLL_LONGTERM flag. But if non-movable allocation
requests return CMA memory, migrate_longterm_unpinnable_pages() will
migrate a CMA page to another CMA page, which will fail to pass the check
in check_and_migrate_movable_pages() and cause migration endless.
Call trace:
pin_user_pages_remote
--__gup_longterm_locked // endless loops in this function
----_get_user_pages_locked
----check_and_migrate_movable_pages
------migrate_longterm_unpinnable_pages
--------alloc_migration_target
This problem will also have a negative impact on CMA itself. For example,
when CMA is borrowed by THP, and we need to reclaim it through cma_alloc()
or dma_alloc_coherent(), we must move those pages out to ensure CMA's
users can retrieve that contigous memory. Currently, CMA's memory is
occupied by non-movable pages, meaning we can't relocate them. As a
result, cma_alloc() is more likely to fail.
To fix the problem above, we add one PCP list for THP, which will not
introduce a new cacheline for struct per_cpu_pages. THP will have 2 PCP
lists, one PCP list is used by MOVABLE allocation, and the other PCP list
is used by UNMOVABLE allocation. MOVABLE allocation contains GPF_MOVABLE,
and UNMOVABLE allocation contains GFP_UNMOVABLE and GFP_RECLAIMABLE.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
Signed-off-by: yangge <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Changing PG_slab from a page flag to a page type in commit 46df8e73a4a3
("mm: free up PG_slab") in has the unintended consequence of removing the
PG_slab constant from kernel debuginfo. The commit does add the value to
the vmcoreinfo note, which allows debuggers to find the value without
hardcoding it. However it's most flexible to continue representing the
constant with an enum. To that end, convert the page type fields into an
enum. Debuggers will now be able to detect that PG_slab's type has
changed from enum pageflags to enum pagetype.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 46df8e73a4a3 ("mm: free up PG_slab")
Signed-off-by: Stephen Brennan <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hao Ge <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Omar Sandoval <[email protected]>
Cc: Vishal Moola (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add sl in /proc/pid/smaps to indicate vma is sealed
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 8be7258aad44 ("mseal: add mseal syscall")
Signed-off-by: Jeff Xu <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Adhemerval Zanella <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Jorge Lucangeli Obes <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stephen Röttger <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When the intermediate CQE aux cache got removed, any usage of the this
member went away. As it isn't used anymore, kill it.
Fixes: 902ce82c2aa1 ("io_uring: get rid of intermediate aux cqe caches")
Signed-off-by: Jens Axboe <[email protected]>
|
|
The per-CPU flush lists, which are accessed from within the NAPI callback
(xdp_do_flush() for instance), are per-CPU. There are subject to the
same problem as struct bpf_redirect_info.
Add the per-CPU lists cpu_map_flush_list, dev_map_flush_list and
xskmap_map_flush_list to struct bpf_net_context. Add wrappers for the
access. The lists initialized on first usage (similar to
bpf_net_ctx_get_ri()).
Cc: "Björn Töpel" <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: Jonathan Lemon <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Maciej Fijalkowski <[email protected]>
Cc: Magnus Karlsson <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Cc: Yonghong Song <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The XDP redirect process is two staged:
- bpf_prog_run_xdp() is invoked to run a eBPF program which inspects the
packet and makes decisions. While doing that, the per-CPU variable
bpf_redirect_info is used.
- Afterwards xdp_do_redirect() is invoked and accesses bpf_redirect_info
and it may also access other per-CPU variables like xskmap_flush_list.
At the very end of the NAPI callback, xdp_do_flush() is invoked which
does not access bpf_redirect_info but will touch the individual per-CPU
lists.
The per-CPU variables are only used in the NAPI callback hence disabling
bottom halves is the only protection mechanism. Users from preemptible
context (like cpu_map_kthread_run()) explicitly disable bottom halves
for protections reasons.
Without locking in local_bh_disable() on PREEMPT_RT this data structure
requires explicit locking.
PREEMPT_RT has forced-threaded interrupts enabled and every
NAPI-callback runs in a thread. If each thread has its own data
structure then locking can be avoided.
Create a struct bpf_net_context which contains struct bpf_redirect_info.
Define the variable on stack, use bpf_net_ctx_set() to save a pointer to
it, bpf_net_ctx_clear() removes it again.
The bpf_net_ctx_set() may nest. For instance a function can be used from
within NET_RX_SOFTIRQ/ net_rx_action which uses bpf_net_ctx_set() and
NET_TX_SOFTIRQ which does not. Therefore only the first invocations
updates the pointer.
Use bpf_net_ctx_get_ri() as a wrapper to retrieve the current struct
bpf_redirect_info. The returned data structure is zero initialized to
ensure nothing is leaked from stack. This is done on first usage of the
struct. bpf_net_ctx_set() sets bpf_redirect_info::kern_flags to 0 to
note that initialisation is required. First invocation of
bpf_net_ctx_get_ri() will memset() the data structure and update
bpf_redirect_info::kern_flags.
bpf_redirect_info::nh is excluded from memset because it is only used
once BPF_F_NEIGH is set which also sets the nh member. The kern_flags is
moved past nh to exclude it from memset.
The pointer to bpf_net_context is saved task's task_struct. Using
always the bpf_net_context approach has the advantage that there is
almost zero differences between PREEMPT_RT and non-PREEMPT_RT builds.
Cc: Andrii Nakryiko <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Cc: Yonghong Song <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
softnet_data::process_queue is a per-CPU variable and relies on disabled
BH for its locking. Without per-CPU locking in local_bh_disable() on
PREEMPT_RT this data structure requires explicit locking.
softnet_data::input_queue_head can be updated lockless. This is fine
because this value is only update CPU local by the local backlog_napi
thread.
Add a local_lock_t to softnet_data and use local_lock_nested_bh() for locking
of process_queue. This change adds only lockdep coverage and does not
alter the functional behaviour for !PREEMPT_RT.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Softirq is preemptible on PREEMPT_RT. Without a per-CPU lock in
local_bh_disable() there is no guarantee that only one device is
transmitting at a time.
With preemption and multiple senders it is possible that the per-CPU
`recursion' counter gets incremented by different threads and exceeds
XMIT_RECURSION_LIMIT leading to a false positive recursion alert.
The `more' member is subject to similar problems if set by one thread
for one driver and wrongly used by another driver within another thread.
Instead of adding a lock to protect the per-CPU variable it is simpler
to make xmit per-task. Sending and receiving skbs happens always
in thread context anyway.
Having a lock to protected the per-CPU counter would block/ serialize two
sending threads needlessly. It would also require a recursive lock to
ensure that the owner can increment the counter further.
Make the softnet_data.xmit a task_struct member on PREEMPT_RT. Add
needed wrapper.
Cc: Ben Segall <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add local_lock_nested_bh() locking. It is based on local_lock_t and the
naming follows the preempt_disable_nested() example.
For !PREEMPT_RT + !LOCKDEP it is a per-CPU annotation for locking
assumptions based on local_bh_disable(). The macro is optimized away
during compilation.
For !PREEMPT_RT + LOCKDEP the local_lock_nested_bh() is reduced to
the usual lock-acquire plus lockdep_assert_in_softirq() - ensuring that
BH is disabled.
For PREEMPT_RT local_lock_nested_bh() acquires the specified per-CPU
lock. It does not disable CPU migration because it relies on
local_bh_disable() disabling CPU migration.
With LOCKDEP it performans the usual lockdep checks as with !PREEMPT_RT.
Due to include hell the softirq check has been moved spinlock.c.
The intention is to use this locking in places where locking of a per-CPU
variable relies on BH being disabled. Instead of treating disabled
bottom halves as a big per-CPU lock, PREEMPT_RT can use this to reduce
the locking scope to what actually needs protecting.
A side effect is that it also documents the protection scope of the
per-CPU variables.
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Introduce lock guard definition for local_lock_t. There are no users
yet.
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Implement callback to display the host transport address by
adding a callback 'host_traddr' for nvmet_fc_target_template.
Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: James Smart <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
|
|
CDR/MORE/DNR fields are not belonging to SC in the NVMe spec, rename
them to NVME_STATUS_* to avoid confusion.
Signed-off-by: Weiwen Hu <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
|
|
Replaced some magic numbers about SC and SCT with enum and macro.
Signed-off-by: Weiwen Hu <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
|
|
These seem useless since we use the SLUB_RED_INACTIVE and SLUB_RED_ACTIVE,
so just delete them, no functional change.
Reviewed-by: Vlastimil Babka <[email protected]>
Signed-off-by: Chengming Zhou <[email protected]>
Reviewed-by: Christoph Lameter (Ampere) <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
|
|
Take care of the inverse polarity of the BLK_FEAT_ROTATIONAL flag
vs the old nonrot helper.
Fixes: bd4a633b6f7c ("block: move the nonrot flag to queue_limits")
Reported-by: kernel test robot <[email protected]>
Reported-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add support for forcing the WMI driver core to bind
a certain WMI driver to a WMI device. This will be
necessary to support generic WMI drivers without an
ID table
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Armin Wolf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
|
|
The distributor and PMR/RPR can present different views of the interrupt
priority space dependent upon the values of GICD_CTLR.DS and
SCR_EL3.FIQ. Currently we treat the distributor's view of the priority
space as canonical, and when the two differ we change the way we handle
values in the PMR/RPR, using the `gic_nonsecure_priorities` static key
to decide what to do.
This approach works, but it's sub-optimal. When using pseudo-NMI we
manipulate the distributor rarely, and we manipulate the PMR/RPR
registers very frequently in code spread out throughout the kernel (e.g.
local_irq_{save,restore}()). It would be nicer if we could use fixed
values for the PMR/RPR, and dynamically choose the values programmed
into the distributor.
This patch changes the GICv3 driver and arm64 code accordingly. PMR
values are chosen at compile time, and the GICv3 driver determines the
appropriate values to program into the distributor at boot time. This
removes the need for the `gic_nonsecure_priorities` static key and
results in smaller and better generated code for saving/restoring the
irqflags.
Before this patch, local_irq_disable() compiles to:
| 0000000000000000 <outlined_local_irq_disable>:
| 0: d503201f nop
| 4: d50343df msr daifset, #0x3
| 8: d65f03c0 ret
| c: d503201f nop
| 10: d2800c00 mov x0, #0x60 // #96
| 14: d5184600 msr icc_pmr_el1, x0
| 18: d65f03c0 ret
| 1c: d2801400 mov x0, #0xa0 // #160
| 20: 17fffffd b 14 <outlined_local_irq_disable+0x14>
After this patch, local_irq_disable() compiles to:
| 0000000000000000 <outlined_local_irq_disable>:
| 0: d503201f nop
| 4: d50343df msr daifset, #0x3
| 8: d65f03c0 ret
| c: d2801800 mov x0, #0xc0 // #192
| 10: d5184600 msr icc_pmr_el1, x0
| 14: d65f03c0 ret
... with 3 fewer instructions per call.
For defconfig + CONFIG_PSEUDO_NMI=y, this results in a minor saving of
~4K of text, and will make it easier to make further improvements to the
way we manipulate irqflags and DAIF bits.
Signed-off-by: Mark Rutland <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Tested-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
|
|
In subsequent patches the GICv3 driver will choose the regular interrupt
priority at boot time.
In preparation for using dynamic priorities, place the priorities in
variables and update the code to pass these as parameters. Users of
GICD_INT_DEF_PRI_X4 are modified to replicate the priority byte using
REPEAT_BYTE_U32().
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Tested-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
|
|
In some cases it's necessary to replicate a byte across a u32 value, for
which REPEAT_BYTE() would be helpful. Currently this requires explicit
masking of the result to avoid sparse warnings, as e.g.
(u32)REPEAT_BYTE(0xa0))
... will result in a warning:
cast truncates bits from constant value (a0a0a0a0a0a0a0a0 becomes a0a0a0a0)
Add a new REPEAT_BYTE_U32() which does the necessary masking internally,
so that we don't need to duplicate this for every usage.
Signed-off-by: Mark Rutland <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
|
|
The old ftruncate() syscall, using the 32-bit off_t misses a sign
extension when called in compat mode on 64-bit architectures. As a
result, passing a negative length accidentally succeeds in truncating
to file size between 2GiB and 4GiB.
Changing the type of the compat syscall to the signed compat_off_t
changes the behavior so it instead returns -EINVAL.
The native entry point, the truncate() syscall and the corresponding
loff_t based variants are all correct already and do not suffer
from this mistake.
Fixes: 3f6d078d4acc ("fix compat truncate/ftruncate")
Reviewed-by: Christian Brauner <[email protected]>
Cc: [email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Commit ca78476e4888f1f1 ("mfd: Remove toshiba tmio drivers") removed the
last users of the .set_pwr() callback in the tmio_mmc_data structure.
Remove the callback, and all related infrastructure.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Lee Jones <[email protected]>
Reviewed-by: Wolfram Sang <[email protected]>
Tested-by: Wolfram Sang <[email protected]>
Reviewed-by: Lad Prabhakar <[email protected]>
Link: https://lore.kernel.org/r/fbbc13ddd19df2c40933ffa3b82fb14841bf1d4c.1718897545.git.geert+renesas@glider.be
|
|
Commit bef64d2908e825c5 ("mmc: remove tmio_mmc driver") removed the last
user of the .set_clk_div() callback in the tmio_mmc_data structure.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Wolfram Sang <[email protected]>
Tested-by: Wolfram Sang <[email protected]>
Reviewed-by: Lad Prabhakar <[email protected]>
Acked-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/e0fa98f138a7b2836128178f8b3a757978517307.1718897545.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <[email protected]>
|
|
With slab accounting, allocating and freeing memory has considerable
overhead. Add a basic alloc cache for the io_kiocb allocations that
msg_ring needs to do. Unlike other caches, this one is used by the
sender, grabbing it from the remote ring. When the remote ring gets
the posted completion, it'll free it locally. Hence it is separately
locked, using ctx->msg_lock.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Analogue to uart_port_tx_flags() introduced in commit 3ee07964d407
("serial: core: introduce uart_port_tx_flags()"), add a _flags variant
for uart_port_tx_limited().
Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()")
Cc: [email protected]
Signed-off-by: Jonas Gorski <[email protected]>
Signed-off-by: Doug Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This reverts commit 7bfb915a597a301abb892f620fe5c283a9fdbd77.
This commit broke pxa and omap-serial, because it inhibited them from
calling stop_tx() if their TX FIFOs weren't completely empty. This
resulted in these two drivers hanging during transmits because the TX
interrupt would stay enabled, and a new TX interrupt would never fire.
Cc: [email protected]
Fixes: 7bfb915a597a ("serial: core: only stop transmit when HW fifo is empty")
Signed-off-by: Doug Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Add serial support for RZ/V2H(P) SoC with earlycon.
The SCIF interface in the Renesas RZ/V2H(P) is similar to that available
in the RZ/G2L (R9A07G044) SoC, with the following differences:
- RZ/V2H(P) SoC has three additional interrupts: one for Tx end/Rx ready
and two for Rx and Tx buffer full, all of which are edge-triggered.
- RZ/V2H(P) supports asynchronous mode, whereas RZ/G2L supports both
synchronous and asynchronous modes.
- There are differences in the configuration of certain registers such
as SCSMR, SCFCR, and SCSPTR between the two SoCs.
To handle these differences on RZ/V2H(P) SoC SCIx_RZV2H_SCIF_REGTYPE
is added.
Signed-off-by: Lad Prabhakar <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
SHM Bridge is a safety mechanism allowing to limit the amount of memory
shared between the kernel and the TrustZone to regions explicitly marked
as such.
Add low-level primitives for enabling SHM bridge support as well as
creating and destroying SHM bridges to qcom-scm.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Acked-by: Andrew Halaney <[email protected]>
Tested-by: Andrew Halaney <[email protected]> # sc8280xp-lenovo-thinkpad-x13s
Tested-by: Deepti Jaggi <[email protected]> #sa8775p-ride
Reviewed-by: Elliot Berman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Drop the DMA mapping operations from qcom_scm_qseecom_app_send() and
convert all users of it in the qseecom module to using the TZ allocator
for creating SCM call buffers. As this is largely a module separate from
the SCM driver, let's use a separate memory pool. Set the initial size to
4K and - if we run out - add twice the current amount to the pool.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Reviewed-by: Elliot Berman <[email protected]>
Reviewed-by: Amirreza Zarrabi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
We have several SCM calls that require passing buffers to the TrustZone
on top of the SMC core which allocates memory for calls that require
more than 4 arguments.
Currently every user does their own thing which leads to code
duplication. Many users call dma_alloc_coherent() for every call which
is terribly unperformant (speed- and size-wise).
Provide a set of library functions for creating and managing pools of
memory which is suitable for sharing with the TrustZone, that is:
page-aligned, contiguous and non-cachable as well as provides a way of
mapping of kernel virtual addresses to physical space.
Make the allocator ready for extending with additional modes of operation
which will allow us to support the SHM bridge safety mechanism once all
users convert.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Reviewed-by: Andrew Halaney <[email protected]>
Tested-by: Andrew Halaney <[email protected]> # sc8280xp-lenovo-thinkpad-x13s
Tested-by: Deepti Jaggi <[email protected]> #sa8775p-ride
Reviewed-by: Elliot Berman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"The core gains placeholders for recently added functions when
CONFIG_I2C is not defined as well documentation fixes to start using
inclusive terminology.
The drivers get paths in DT bindings fixed as well as proper interrupt
handling for the ocores driver"
* tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' pairs
docs: i2c: summary: document 'local' and 'remote' targets
docs: i2c: summary: document use of inclusive language
docs: i2c: summary: update speed mode description
docs: i2c: summary: update I2C specification link
docs: i2c: summary: start sentences consistently.
i2c: Add nop fwnode operations
i2c: ocores: set IACK bit after core is enabled
dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema
dt-bindings: i2c: atmel,at91sam: correct path to i2c-controller schema
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
"Fix fragility in checks for unset node ID.
Use numa_valid_node() function to verify that nid is a valid node
ID instead of inconsistent comparisons with either NUMA_NO_NODE or
MAX_NUMNODES"
* tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: use numa_valid_node() helper to check for invalid node ID
|
|
Merge series from David Lechner <[email protected]>:
In the IIO subsystem, we are finding that it is common to call
spi_optimize_message() during driver probe since the SPI message
doesn't change for the lifetime of the driver. This patch adds a
devm_spi_optimize_message() helper to simplify this common pattern.
|
|
Add a dump of the EQ entries headers upon a EQ heartbeat failure.
Signed-off-by: Tomer Tayar <[email protected]>
Reviewed-by: Ofir Bitton <[email protected]>
Signed-off-by: Ofir Bitton <[email protected]>
|
|
Add cpld_timestamp field to cpucp_info structure and return cpld
timestamp as part of cpld version
Signed-off-by: Vitaly Margolin <[email protected]>
Reviewed-by: Ofir Bitton <[email protected]>
Signed-off-by: Ofir Bitton <[email protected]>
|