| Age | Commit message (Collapse) | Author | Files | Lines |
|
This moves all page alloc related sysctls to its own file, as part of the
kernel/sysctl.c spring cleaning, also move some functions declarations
from mm.h into internal.h.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
pm_restrict_gfp_mask()/pm_restore_gfp_mask() only used in power, let's
move them out of page_alloc.c.
Adding a general gfp_has_io_fs() function which return true if gfp with
both __GFP_IO and __GFP_FS flags, then use it inside of
pm_suspended_storage(), also the pm_suspended_storage() is moved into
suspend.h.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The mark_free_page() is only used in kernel/power/snapshot.c, move it out
to reduce a bit of page_alloc.c
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Move DEBUG_PAGEALLOC related functions into a single file to reduce a bit
of page_alloc.c.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
... to a single file to reduce a bit of page_alloc.c.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
set_zone_contiguous() is only used in mm init/hotplug, and
clear_zone_contiguous() only used in hotplug, move them from page_alloc.c
to the more appropriate file.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The memory_failure_attr_group is only called if MEMORY_FAILURE enabled,
move it under this configuration.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
gcc-13 warns about function definitions for builtin interfaces that have a
different prototype, e.g.:
In file included from kasan_test.c:31:
kasan.h:574:6: error: conflicting types for built-in function '__asan_register_globals'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch]
574 | void __asan_register_globals(struct kasan_global *globals, size_t size);
kasan.h:577:6: error: conflicting types for built-in function '__asan_alloca_poison'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch]
577 | void __asan_alloca_poison(unsigned long addr, size_t size);
kasan.h:580:6: error: conflicting types for built-in function '__asan_load1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch]
580 | void __asan_load1(unsigned long addr);
kasan.h:581:6: error: conflicting types for built-in function '__asan_store1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch]
581 | void __asan_store1(unsigned long addr);
kasan.h:643:6: error: conflicting types for built-in function '__hwasan_tag_memory'; expected 'void(void *, unsigned char, long int)' [-Werror=builtin-declaration-mismatch]
643 | void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size);
The two problems are:
- Addresses are passes as 'unsigned long' in the kernel, but gcc-13
expects a 'void *'.
- sizes meant to use a signed ssize_t rather than size_t.
Change all the prototypes to match these. Using 'void *' consistently for
addresses gets rid of a couple of type casts, so push that down to the
leaf functions where possible.
This now passes all randconfig builds on arm, arm64 and x86, but I have
not tested it on the other architectures that support kasan, since they
tend to fail randconfig builds in other ways. This might fail if any of
the 32-bit architectures expect a 'long' instead of 'int' for the size
argument.
The __asan_allocas_unpoison() function prototype is somewhat weird, since
it uses a pointer for 'stack_top' and an size_t for 'stack_bottom'. This
looks like it is meant to be 'addr' and 'size' like the others, but the
implementation clearly treats them as 'top' and 'bottom'.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
page_endio() is not used anymore. Remove it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Pankaj Raghav <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
There is currently no good way to query the page cache state of large file
sets and directory trees. There is mincore(), but it scales poorly: the
kernel writes out a lot of bitmap data that userspace has to aggregate,
when the user really doesn not care about per-page information in that
case. The user also needs to mmap and unmap each file as it goes along,
which can be quite slow as well.
Some use cases where this information could come in handy:
* Allowing database to decide whether to perform an index scan or
direct table queries based on the in-memory cache state of the
index.
* Visibility into the writeback algorithm, for performance issues
diagnostic.
* Workload-aware writeback pacing: estimating IO fulfilled by page
cache (and IO to be done) within a range of a file, allowing for
more frequent syncing when and where there is IO capacity, and
batching when there is not.
* Computing memory usage of large files/directory trees, analogous to
the du tool for disk usage.
More information about these use cases could be found in the following
thread:
https://lore.kernel.org/lkml/[email protected]/
This patch implements a new syscall that queries cache state of a file and
summarizes the number of cached pages, number of dirty pages, number of
pages marked for writeback, number of (recently) evicted pages, etc. in a
given range. Currently, the syscall is only wired in for x86
architecture.
NAME
cachestat - query the page cache statistics of a file.
SYNOPSIS
#include <sys/mman.h>
struct cachestat_range {
__u64 off;
__u64 len;
};
struct cachestat {
__u64 nr_cache;
__u64 nr_dirty;
__u64 nr_writeback;
__u64 nr_evicted;
__u64 nr_recently_evicted;
};
int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
struct cachestat *cstat, unsigned int flags);
DESCRIPTION
cachestat() queries the number of cached pages, number of dirty
pages, number of pages marked for writeback, number of evicted
pages, number of recently evicted pages, in the bytes range given by
`off` and `len`.
An evicted page is a page that is previously in the page cache but
has been evicted since. A page is recently evicted if its last
eviction was recent enough that its reentry to the cache would
indicate that it is actively being used by the system, and that
there is memory pressure on the system.
These values are returned in a cachestat struct, whose address is
given by the `cstat` argument.
The `off` and `len` arguments must be non-negative integers. If
`len` > 0, the queried range is [`off`, `off` + `len`]. If `len` ==
0, we will query in the range from `off` to the end of the file.
The `flags` argument is unused for now, but is included for future
extensibility. User should pass 0 (i.e no flag specified).
Currently, hugetlbfs is not supported.
Because the status of a page can change after cachestat() checks it
but before it returns to the application, the returned values may
contain stale information.
RETURN VALUE
On success, cachestat returns 0. On error, -1 is returned, and errno
is set to indicate the error.
ERRORS
EFAULT cstat or cstat_args points to an invalid address.
EINVAL invalid flags.
EBADF invalid file descriptor.
EOPNOTSUPP file descriptor is of a hugetlbfs file
[[email protected]: replace rounddown logic with the existing helper]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nhat Pham <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "cachestat: a new syscall for page cache state of files",
v13.
There is currently no good way to query the page cache statistics of large
files and directory trees. There is mincore(), but it scales poorly: the
kernel writes out a lot of bitmap data that userspace has to aggregate,
when the user really does not care about per-page information in that
case. The user also needs to mmap and unmap each file as it goes along,
which can be quite slow as well.
Some use cases where this information could come in handy:
* Allowing database to decide whether to perform an index scan or direct
table queries based on the in-memory cache state of the index.
* Visibility into the writeback algorithm, for performance issues
diagnostic.
* Workload-aware writeback pacing: estimating IO fulfilled by page cache
(and IO to be done) within a range of a file, allowing for more
frequent syncing when and where there is IO capacity, and batching
when there is not.
* Computing memory usage of large files/directory trees, analogous to
the du tool for disk usage.
More information about these use cases could be found in this thread:
https://lore.kernel.org/lkml/[email protected]/
This series of patches introduces a new system call, cachestat, that
summarizes the page cache statistics (number of cached pages, dirty pages,
pages marked for writeback, evicted pages etc.) of a file, in a specified
range of bytes. It also include a selftest suite that tests some typical
usage. Currently, the syscall is only wired in for x86 architecture.
This interface is inspired by past discussion and concerns with fincore,
which has a similar design (and as a result, issues) as mincore. Relevant
links:
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04207.html
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04209.html
I have also developed a small tool that computes the memory usage of files
and directories, analogous to the du utility. User can choose between
mincore or cachestat (with cachestat exporting more information than
mincore). To compare the performance of these two options, I benchmarked
the tool on the root directory of a Meta's server machine, each for five
runs:
Using cachestat
real -- Median: 33.377s, Average: 33.475s, Standard Deviation: 0.3602
user -- Median: 4.08s, Average: 4.1078s, Standard Deviation: 0.0742
sys -- Median: 28.823s, Average: 28.8866s, Standard Deviation: 0.2689
Using mincore:
real -- Median: 102.352s, Average: 102.3442s, Standard Deviation: 0.2059
user -- Median: 10.149s, Average: 10.1482s, Standard Deviation: 0.0162
sys -- Median: 91.186s, Average: 91.2084s, Standard Deviation: 0.2046
I also ran both syscalls on a 2TB sparse file:
Using cachestat:
real 0m0.009s
user 0m0.000s
sys 0m0.009s
Using mincore:
real 0m37.510s
user 0m2.934s
sys 0m34.558s
Very large files like this are the pathological case for mincore. In
fact, to compute the stats for a single 2TB file, mincore takes as long as
cachestat takes to compute the stats for the entire tree! This could
easily happen inadvertently when we run it on subdirectories. Mincore is
clearly not suitable for a general-purpose command line tool.
Regarding security concerns, cachestat() should not pose any additional
issues. The caller already has read permission to the file itself (since
they need an fd to that file to call cachestat). This means that the
caller can access the underlying data in its entirety, which is a much
greater source of information (and as a result, a much greater security
risk) than the cache status itself.
The latest API change (in v13 of the patch series) is suggested by Jens
Axboe. It allows for 64-bit length argument, even on 32-bit architecture
(which is previously not possible due to the limit on the number of
syscall arguments). Furthermore, it eliminates the need for compatibility
handling - every user can use the same ABI.
This patch (of 4):
In preparation for computing recently evicted pages in cachestat, refactor
workingset_refault and lru_gen_refault to expose a helper function that
would test if an evicted page is recently evicted.
[[email protected]: add missing rcu_read_unlock() in lru_gen_refault()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nhat Pham <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Previous patches removed the only caller of cgroup_rstat_flush_atomic().
Remove the function and simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yosry Ahmed <[email protected]>
Acked-by: Shakeel Butt <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michal Koutný <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Previous patches removed all callers of mem_cgroup_flush_stats_atomic().
Remove the function and simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yosry Ahmed <[email protected]>
Acked-by: Shakeel Butt <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michal Koutný <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add the accelerator PCIe class and match the
class in amdgpu for 0x1002 devices of that class.
From PCI spec:
"PCI Code and ID Assignment, r1.9, sec 1, 1.19"
Signed-off-by: Shiwu Zhang <[email protected]>
Acked-by: Lijo Lazar <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]> # pci_ids.h
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add i2c_get_match_data() to get match data for I2C, ACPI and
DT-based matching, so that we can optimize the driver code.
Suggested-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Biju Das <[email protected]>
[wsa: simplified var initialization]
Signed-off-by: Wolfram Sang <[email protected]>
|
|
After mmsys and drm change DITHER enum to DDP_COMPONENT_DITHER0,
mmsys header can remove the useless DDP_COMPONENT_DITHER enum.
Signed-off-by: Jason-JH.Lin <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Reviewed-by: Rex-BC Chen <[email protected]>
Acked-by: Matthias Brugger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matthias Brugger <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/drivers
Renesas driver updates for v6.5 (take two)
- Convert the R-Mobile SYSC driver to readl_poll_timeout_atomic().
* tag 'renesas-drivers-for-v6.5-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
soc: renesas: rmobile-sysc: Convert to readl_poll_timeout_atomic()
iopoll: Do not use timekeeping in read_poll_timeout_atomic()
iopoll: Call cpu_relax() in busy loops
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Broadcom PHYs have two LEDs selector registers which allow us to control
the LED assignment, including how to turn them on/off.
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
These registers are common to most PHYs and are not specific to the
BCM5482, renamed the constants accordingly, no functional change.
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add an optional method, ->splice_eof(), to allow splice to indicate the
premature termination of a splice to struct file_operations and struct
proto_ops.
This is called if sendfile() or splice() encounters all of the following
conditions inside splice_direct_to_actor():
(1) the user did not set SPLICE_F_MORE (splice only), and
(2) an EOF condition occurred (->splice_read() returned 0), and
(3) we haven't read enough to fulfill the request (ie. len > 0 still), and
(4) we have already spliced at least one byte.
A further patch will modify the behaviour of SPLICE_F_MORE to always be
passed to the actor if either the user set it or we haven't yet read
sufficient data to fulfill the request.
Suggested-by: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Christoph Hellwig <[email protected]>
cc: Al Viro <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: Jan Kara <[email protected]>
cc: Jeff Layton <[email protected]>
cc: David Hildenbrand <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Chuck Lever <[email protected]>
cc: Boris Pismenny <[email protected]>
cc: John Fastabend <[email protected]>
cc: [email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Replace generic_splice_sendpage() + splice_from_pipe + pipe_to_sendpage()
with a net-specific handler, splice_to_socket(), that calls sendmsg() with
MSG_SPLICE_PAGES set instead of calling ->sendpage().
MSG_MORE is used to indicate if the sendmsg() is expected to be followed
with more data.
This allows multiple pipe-buffer pages to be passed in a single call in a
BVEC iterator, allowing the processing to be pushed down to a loop in the
protocol driver. This helps pave the way for passing multipage folios down
too.
Protocols that haven't been converted to handle MSG_SPLICE_PAGES yet should
just ignore it and do a normal sendmsg() for now - although that may be a
bit slower as it may copy everything.
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
It is necessary to allow MSG_SENDPAGE_* to be passed into ->sendmsg() to
allow sendmsg(MSG_SPLICE_PAGES) to replace ->sendpage(). Unblocking them
in the network protocol, however, allows these flags to be passed in by
userspace too[1].
Fix this by marking MSG_SENDPAGE_NOPOLICY, MSG_SENDPAGE_NOTLAST and
MSG_SENDPAGE_DECRYPTED as internal flags, which causes sendmsg() to object
if they are passed to sendmsg() by userspace. Network protocol ->sendmsg()
implementations can then allow them through.
Note that it should be possible to remove MSG_SENDPAGE_NOTLAST once
sendpage is removed as a whole slew of pages will be passed in in one go by
splice through sendmsg, with MSG_MORE being set if it has more data waiting
in the pipe.
Signed-off-by: David Howells <[email protected]>
cc: Chuck Lever <[email protected]>
cc: Boris Pismenny <[email protected]>
cc: John Fastabend <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/ [1]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-06-06
1) Support 4 ports VF LAG, part 2/2
2) Few extra trivial cleanup patches
Shay Drory Says:
================
Support 4 ports VF LAG, part 2/2
This series continues the series[1] "Support 4 ports VF LAG, part1/2".
This series adds support for 4 ports VF LAG (single FDB E-Switch).
This series of patches refactoring LAG code that make assumptions
about VF LAG supporting only two ports and then enable 4 ports VF LAG.
Patch 1:
- Fix for ib rep code
Patches 2-5:
- Refactors LAG layer.
Patches 6-7:
- Block LAG types which doesn't support 4 ports.
Patch 8:
- Enable 4 ports VF LAG.
This series specifically allows HCAs with 4 ports to create a VF LAG
with only 4 ports. It is not possible to create a VF LAG with 2 or 3
ports using HCAs that have 4 ports.
Currently, the Merged E-Switch feature only supports HCAs with 2 ports.
However, upcoming patches will introduce support for HCAs with 4 ports.
In order to activate VF LAG a user can execute:
devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink dev eswitch set pci/0000:08:00.1 mode switchdev
devlink dev eswitch set pci/0000:08:00.2 mode switchdev
devlink dev eswitch set pci/0000:08:00.3 mode switchdev
ip link add name bond0 type bond
ip link set dev bond0 type bond mode 802.3ad
ip link set dev eth2 master bond0
ip link set dev eth3 master bond0
ip link set dev eth4 master bond0
ip link set dev eth5 master bond0
Where eth2, eth3, eth4 and eth5 are net-interfaces of pci/0000:08:00.0
pci/0000:08:00.1 pci/0000:08:00.2 pci/0000:08:00.3 respectively.
User can verify LAG state and type via debugfs:
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type
[1]
https://lore.kernel.org/netdev/[email protected]/T/#mf1d2083780970ba277bfe721554d4925f03f36d1
================
* tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: simplify condition after napi budget handling change
mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager
net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure
net/mlx5e: TC, refactor access to hash key
net/mlx5e: Remove RX page cache leftovers
net/mlx5e: Expose catastrophic steering error counters
net/mlx5: Enable 4 ports VF LAG
net/mlx5: LAG, block multiport eswitch LAG in case ldev have more than 2 ports
net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports
net/mlx5: LAG, change mlx5_shared_fdb_supported() to static
net/mlx5: LAG, generalize handling of shared FDB
net/mlx5: LAG, check if all eswitches are paired for shared FDB
{net/RDMA}/mlx5: introduce lag_for_each_peer
RDMA/mlx5: Free second uplink ib port
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We no longer need to export lynx_pcs_create() for drivers to use as we
now have all the functionality we need in the two new creation helpers.
Remove the export and prototype, and make it static.
Signed-off-by: Russell King (Oracle) <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add a helper to create a lynx PCS from a fwnode handle.
Signed-off-by: Russell King (Oracle) <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
lynx_get_mdio_device() is no longer necessary, let's remove it so the
lynx PCS code is always managing the lifetime of the mdiodev.
Signed-off-by: Russell King (Oracle) <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Some clock drivers do not want to allow any reparenting on a given
clock, but usually do so by not providing any determine_rate
implementation.
Whenever we call clk_round_rate() or clk_set_rate(), this leads to
clk_core_can_round() returning false and thus the rest of the function
either forwarding the rate request to its current parent if
CLK_SET_RATE_PARENT is set, or just returning the current clock rate.
This behaviour happens implicitly, and as we move forward to making a
determine_rate implementation required for muxes, we need some way to
explicitly opt-in for that behaviour.
Fortunately, this is exactly what the clk_core_determine_rate_no_reparent()
function is doing, so we can simply make it available to drivers.
Cc: Abel Vesa <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: "Andreas Färber" <[email protected]>
Cc: AngeloGioacchino Del Regno <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Charles Keepax <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Chunyan Zhang <[email protected]>
Cc: Claudiu Beznea <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Lechner <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Jaroslav Kysela <[email protected]>
Cc: Jernej Skrabec <[email protected]>
Cc: Jonathan Hunter <[email protected]>
Cc: Kishon Vijay Abraham I <[email protected]>
Cc: Liam Girdwood <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Luca Ceresoli <[email protected]>
Cc: Manivannan Sadhasivam <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Markus Schneider-Pargmann <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Mikko Perttunen <[email protected]>
Cc: Miles Chen <[email protected]>
Cc: Nicolas Ferre <[email protected]>
Cc: Orson Zhai <[email protected]>
Cc: Paul Cercueil <[email protected]>
Cc: Peng Fan <[email protected]>
Cc: Peter De Schrijver <[email protected]>
Cc: Prashant Gaikwad <[email protected]>
Cc: Richard Fitzgerald <[email protected]>
Cc: Samuel Holland <[email protected]>
Cc: Sascha Hauer <[email protected]>
Cc: Sekhar Nori <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Takashi Iwai <[email protected]>
Cc: Thierry Reding <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vinod Koul <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: NXP Linux Team <[email protected]>
Cc: [email protected]
Cc: Pengutronix Kernel Team <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
| Reported-by: kernel test robot <[email protected]>:
|
|
The security keys sysctls are already declared on its own file,
just move the sysctl registration to its own file to help avoid
merge conflicts on sysctls.c, and help with clearing up sysctl.c
further.
This creates a small penalty of 23 bytes:
./scripts/bloat-o-meter vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function old new delta
init_security_keys_sysctls - 33 +33
__pfx_init_security_keys_sysctls - 16 +16
sysctl_init_bases 85 59 -26
Total: Before=21256937, After=21256960, chg +0.00%
But soon we'll be saving tons of bytes anyway, as we modify the
sysctl registrations to use ARRAY_SIZE and so we get rid of all the
empty array elements so let's just clean this up now.
Reviewed-by: Paul Moore <[email protected]>
Acked-by: Jarkko Sakkinen <[email protected]>
Acked-by: David Howells <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
|
|
Move the umh sysctl registration to its own file, the array is
already there. We do this to remove the clutter out of kernel/sysctl.c
to avoid merge conflicts.
This also lets the sysctls not be built at all now when CONFIG_SYSCTL
is not enabled.
This has a small penalty of 23 bytes but soon we'll be removing
all the empty entries on sysctl arrays so just do this cleanup
now:
./scripts/bloat-o-meter vmlinux.base vmlinux.1
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function old new delta
init_umh_sysctls - 33 +33
__pfx_init_umh_sysctls - 16 +16
sysctl_init_bases 111 85 -26
Total: Before=21256914, After=21256937, chg +0.00%
Acked-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
|
|
This change makes function kallsyms_show_value() as
generic function without dependency on CONFIG_KALLSYMS.
Now module address will be displayed with lsmod and /proc/modules.
Earlier:
=======
/ # insmod test.ko
/ # lsmod
test 12288 0 - Live 0x0000000000000000 (O) // No Module Load address
/ #
With change:
==========
/ # insmod test.ko
/ # lsmod
test 12288 0 - Live 0xffff800000fc0000 (O) // Module address
/ # cat /proc/modules
test 12288 0 - Live 0xffff800000fc0000 (O)
Co-developed-by: Onkarnath <[email protected]>
Signed-off-by: Onkarnath <[email protected]>
Signed-off-by: Maninder Singh <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
net/sched/sch_taprio.c
d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")
net/ipv4/sysctl_net_ipv4.c
e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294")
ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear")
https://lore.kernel.org/all/[email protected]/
No adjacent changes.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move struct rt5033_battery from the mfd header into the battery driver because
it's not used by others.
Within struct rt5033_battery, remove the line "struct rt5033_dev *rt5033;"
because it doesn't get used.
In rt5033.h, remove #include <linux/power_supply.h>, it's not necessary
anymore.
In rt5033_battery.c, remove #include <linux/mfd/rt5033.h>, it's not necessary
anymore either. Instead add #include <linux/regmap.h> and
Signed-off-by: Jakob Hauser <[email protected]>
Acked-by: Sebastian Reichel <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/736e1cbee257853cb3d1da6f05c184e9a053263b.1684182964.git.jahau@rocketmail.com
|
|
This patch adds device driver of Richtek RT5033 PMIC. The driver supports
switching charger. rt5033 charger provides three charging modes. The charging
modes are pre-charge mode, fast charge mode and constant voltage mode. They
vary in charge rate, the charge parameters can be controlled by i2c interface.
Tested-by: Raymond Hackley <[email protected]>
Signed-off-by: Jakob Hauser <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Acked-by: Sebastian Reichel <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/9556d4ebb30fd321e37aa0eb343554122e4720c9.1684182964.git.jahau@rocketmail.com
|
|
Order the register blocks to have the masks in descending manner.
Add new defines for constant voltage shift (RT5033_CHGCTRL2_CV_SHIFT),
MIVR mask (RT5033_CHGCTRL4_MIVR_MASK), pre-charge current shift
(RT5033_CHGCTRL4_IPREC_SHIFT), internal timer disable
(RT5033_INT_TIMER_DISABLE), termination disable (RT5033_TE_DISABLE),
CFO disable (RT5033_CFO_DISABLE), UUG disable (RT5033_CHARGER_UUG_DISABLE).
The fast charge timer type needs to be written on mask 0x38
(RT5033_CHGCTRL3_TIMER_MASK). To avoid a bit shift on application, change the
values of the timer types to fit the mask. Added the timout duration as a
comment. And the timer between TIMER8 and TIMER12 is most likely TIMER10, see
e.g. RT5036 [1] page 28 bottom.
Add value options for MIVR (Minimum Input Voltage Regulation).
Move RT5033_TE_ENABLE_MASK to the block "RT5033 CHGCTRL1 register", in order
to have the masks of the register collected there. To fit the naming scheme,
rename it to RT5033_CHGCTRL1_TE_EN_MASK.
Move RT5033_CHG_MAX_CURRENT to the block "RT5033 charger fast-charge current".
Add new defines RT5033_CV_MAX_VOLTAGE and RT5033_CHG_MAX_PRE_CURRENT to the
blocks "RT5033 charger constant charge voltage" and "RT5033 charger pre-charge
current limits".
In include/linux/mfd/rt5033.h, turn power_supply "psy" into a pointer in order
to use it in devm_power_supply_register().
[1] https://media.digikey.com/pdf/Data%20Sheets/Richtek%20PDF/RT5036%20%20Preliminary.pdf
Signed-off-by: Jakob Hauser <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/31c750ae13a1c1896b51d8f0a0d9869f8b85624f.1684182964.git.jahau@rocketmail.com
|
|
The charger state mask RT5033_CHG_STAT_MASK should be 0x30 [1][2].
The high impedance mask RT5033_RT_HZ_MASK is actually value 0x02 [3] and is
assosiated to the RT5033 CHGCTRL1 register [4]. Accordingly also change
RT5033_CHARGER_HZ_ENABLE to 0x02 to avoid the need of a bit shift upon
application.
For input current limiting AICR mode, the define for the 1000 mA step was
missing [5]. Additionally add the define for DISABLE option. Concerning the
mask, remove RT5033_AICR_MODE_MASK because there is already
RT5033_CHGCTRL1_IAICR_MASK further up. They are redundant and the upper one
makes more sense to have the masks of a register colleted there as an
overview.
[1] https://github.com/msm8916-mainline/linux-downstream/blob/GT-I9195I/drivers/battery/rt5033_charger.c#L669-L682
[2] https://github.com/torvalds/linux/blob/v6.0/include/linux/mfd/rt5033-private.h#L59-L62
[3] https://github.com/msm8916-mainline/linux-downstream/blob/GT-I9195I/include/linux/battery/charger/rt5033_charger.h#L44
[4] https://github.com/msm8916-mainline/linux-downstream/blob/GT-I9195I/drivers/battery/rt5033_charger.c#L223
[5] https://github.com/msm8916-mainline/linux-downstream/blob/GT-I9195I/drivers/battery/rt5033_charger.c#L278
Signed-off-by: Jakob Hauser <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/2f17beec3d6c59b41d7e2451d177dc8aaeb7efe2.1684182964.git.jahau@rocketmail.com
|
|
After reading the data from the DEVICE_ID register, mask 0x0f needs to be
applied to extract the revision of the chip [1].
The other part of the DEVICE_ID register, mask 0xf0, is a vendor identification
code. That's how it is set up at similar products of Richtek, e.g. RT9455 [2]
page 21 top.
[1] https://github.com/msm8916-mainline/linux-downstream/blob/GT-I9195I/drivers/mfd/rt5033_core.c#L484
[2] https://www.richtek.com/assets/product_file/RT9455/DS9455-00.pdf
Signed-off-by: Jakob Hauser <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/9a98521ffdf76851d5d344afa6ce65f692ecc024.1684182964.git.jahau@rocketmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from can, wifi, netfilter, bluetooth and ebpf.
Current release - regressions:
- bpf: sockmap: avoid potential NULL dereference in
sk_psock_verdict_data_ready()
- wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()
- phylink: actually fix ksettings_set() ethtool call
- eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3
Current release - new code bugs:
- wifi: mt76: fix possible NULL pointer dereference in
mt7996_mac_write_txwi()
Previous releases - regressions:
- netfilter: fix NULL pointer dereference in nf_confirm_cthelper
- wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS
- openvswitch: fix upcall counter access before allocation
- bluetooth:
- fix use-after-free in hci_remove_ltk/hci_remove_irk
- fix l2cap_disconnect_req deadlock
- nic: bnxt_en: prevent kernel panic when receiving unexpected
PHC_UPDATE event
Previous releases - always broken:
- core: annotate rfs lockless accesses
- sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values
- netfilter: add null check for nla_nest_start_noflag() in
nft_dump_basechain_hook()
- bpf: fix UAF in task local storage
- ipv4: ping_group_range: allow GID from 2147483648 to 4294967294
- ipv6: rpl: fix route of death.
- tcp: gso: really support BIG TCP
- mptcp: fixes for user-space PM address advertisement
- smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT
- can: avoid possible use-after-free when j1939_can_rx_register fails
- batman-adv: fix UaF while rescheduling delayed work
- eth: qede: fix scheduling while atomic
- eth: ice: make writes to /dev/gnssX synchronous"
* tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
bnxt_en: Implement .set_port / .unset_port UDP tunnel callbacks
bnxt_en: Prevent kernel panic when receiving unexpected PHC_UPDATE event
bnxt_en: Skip firmware fatal error recovery if chip is not accessible
bnxt_en: Query default VLAN before VNIC setup on a VF
bnxt_en: Don't issue AP reset during ethtool's reset operation
bnxt_en: Fix bnxt_hwrm_update_rss_hash_cfg()
net: bcmgenet: Fix EEE implementation
eth: ixgbe: fix the wake condition
eth: bnxt: fix the wake condition
lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release()
bpf: Add extra path pointer check to d_path helper
net: sched: fix possible refcount leak in tc_chain_tmplt_add()
net: sched: act_police: fix sparse errors in tcf_police_dump()
net: openvswitch: fix upcall counter access before allocation
net: sched: move rtm_tca_policy declaration to include file
ice: make writes to /dev/gnssX synchronous
net: sched: add rcu annotations around qdisc->qdisc_sleeping
rfs: annotate lockless accesses to RFS sock flow table
rfs: annotate lockless accesses to sk->sk_rxhash
virtio_net: use control_buf for coalesce params
...
|
|
Exposes consumer library functions providing support for interfaces
compatible with the venerable Intel 8254 Programmable Interval Timer
(PIT).
The Intel 8254 PIT first appeared in the early 1980s and was used
initially in IBM PC compatibles. The popularity of the original Intel
825x family of chips led to many subsequent variants and clones of the
interface in various chips and integrated circuits. Although still
popular, interfaces compatible with the Intel 8254 PIT are nowdays
typically found embedded in larger VLSI processing chips and FPGA
components rather than as discrete ICs.
A CONFIG_I8254 Kconfig option is introduced by this patch. Modules
wanting access to these i8254 library functions should select this
Kconfig option, and import the I8254 symbol namespace.
Link: https://lore.kernel.org/r/f6fe32c2db9525d816ab1a01f45abad56c081652.1681665189.git.william.gray@linaro.org/
Signed-off-by: William Breathitt Gray <[email protected]>
|
|
beep_enable, inX_beep, currX_beep, fanX_beep, and tempX_beep
are standard attributes mentioned in the sysfs interface
specification but not implemented in the hwmon core. Since
these are not deprecated, implement them.
Adding beep_mask is not necessary, as it is deprecated and
the drivers already using it are manually defining it.
Signed-off-by: James Seo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Move netfs_extract_iter_to_sg() to lib/scatterlist.c as it's going to be
used by more than just network filesystems (AF_ALG, for example).
Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Rename netfs_extract_iter_to_sg() and its auxiliary functions to drop the
netfs_ prefix.
Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "Matthew Wilcox (Oracle)" <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Add basic support for XPCS using 10GBASE-R interface. This mode will
be extended to use interrupt, so set pcs.poll false. And avoid soft
reset so that the device using this mode is in the default configuration.
Signed-off-by: Jiawen Wu <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The SLAB_TYPESAFE_BY_RCU example code snippet uses a single RCU
read-side critical section for retries.
'Documentation/RCU/rculist_nulls.rst' has similar example code snippet,
and commit da82af04352b ("doc: Update and wordsmith rculist_nulls.rst")
broke it up. Apply the change to SLAB_TYPESAFE_BY_RCU example code
snippet, too.
Signed-off-by: SeongJae Park <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
|
|
An example code snippet for SLAB_TYPESAFE_BY_RCU is missing a semicolon.
Add it.
Signed-off-by: SeongJae Park <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
|
|
Move all of the ARM-specific initialization into one function namely
acpi_arm_init(), so it is not necessary to modify/update bus.c every
time a new piece of it is added.
Cc: Lorenzo Pieralisi <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Suggested-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Reviewed-by: Hanjun Guo <[email protected]>
Link: https://lore.kernel.org/r/CAJZ5v0iBZRZmV_oU+VurqxnVMbFN_ttqrL=cLh0sUH+=u0PYsw@mail.gmail.com
Signed-off-by: Sudeep Holla <[email protected]>
Reviewed-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Shaoqin Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
|
|
Add generated_pkt_steering_fail and handled_pkt_steering_fail to devlink
heatlth reporter.
generated_pkt_steering_fail indicates the number of packets dropped due to
illegal steering operation within the vport steering domain.
handled_pkt_steering_fail indicates the number of packets dropped due to
illegal steering operation, originated by the vport.
Also, update devlink reporter functionality documentation with the newly
exposed counters.
Signed-off-by: Lama Kayal <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Introduce a generic APIs to iterate over all the devices which are part
of the LAG. This API replace mlx5_lag_get_peer_mdev() which retrieve
only a single peer device from the lag.
Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
'nocb.2023.05.11a', 'rcu-tasks.2023.05.10a', 'torture.2023.05.15a' and 'rcu-urgent.2023.06.06a' into HEAD
doc.2023.05.10a: Documentation updates
fixes.2023.05.11a: Miscellaneous fixes
kvfree.2023.05.10a: kvfree_rcu updates
nocb.2023.05.11a: Callback-offloading updates
rcu-tasks.2023.05.10a: Tasks RCU updates
torture.2023.05.15a: Torture-test updates
rcu-urgent.2023.06.06a: Urgent SRCU fix
|
|
In commit 95433f726301 ("srcu: Begin offloading srcu_struct fields to
srcu_update"), a new struct srcu_usage field was added, but was not
properly initialized. This led to a "spinlock bad magic" BUG when the
SRCU notifier was ever used. This was observed in the MediaTek CCI
devfreq driver on next-20230525. The trimmed stack trace is as follows:
BUG: spinlock bad magic on CPU#4, swapper/0/1
lock: 0xffffff80ff529ac0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Call trace:
spin_bug+0xa4/0xe8
do_raw_spin_lock+0xec/0x120
_raw_spin_lock_irqsave+0x78/0xb8
synchronize_srcu+0x3c/0x168
srcu_notifier_chain_unregister+0x5c/0xa0
cpufreq_unregister_notifier+0x94/0xe0
devfreq_passive_event_handler+0x7c/0x3e0
devfreq_remove_device+0x48/0xe8
Add __SRCU_USAGE_INIT() to SRCU_NOTIFIER_INIT() so that srcu_usage gets
initialized properly.
Reported-by: Jon Hunter <[email protected]>
Fixes: 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update")
Signed-off-by: Chen-Yu Tsai <[email protected]>
Tested-by: AngeloGioacchino Del Regno <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: "Michał Mirosław" <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Cc: Sachin Sant <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]
Tested-by: Jon Hunter <[email protected]>
Acked-by: Zqiang <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
We may use traditional dev_*() macros instead of custom ones
provided by the driver.
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|