aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-08-20net: validate veth and vxcan peer ifindexesJakub Kicinski1-2/+2
veth and vxcan need to make sure the ifindexes of the peer are not negative, core does not validate this. Using iproute2 with user-space-level checking removed: Before: # ./ip link add index 10 type veth peer index -1 # ip link show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether 52:54:00:74:b2:03 brd ff:ff:ff:ff:ff:ff 10: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 8a:90:ff:57:6d:5d brd ff:ff:ff:ff:ff:ff -1: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:ed:18:e6:fa:7f brd ff:ff:ff:ff:ff:ff Now: $ ./ip link add index 10 type veth peer index -1 Error: ifindex can't be negative. This problem surfaced in net-next because an explicit WARN() was added, the root cause is older. Fixes: e6f8f1a739b6 ("veth: Allow to create peer link with given ifindex") Fixes: a8f820a380a2 ("can: add Virtual CAN Tunnel driver (vxcan)") Reported-by: [email protected] Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-20RDMA/mlx5: Handles RoCE MACsec steering rules addition and deletionPatrisious Haddad1-1/+3
Add RoCE MACsec rules when a gid is added for the MACsec netdevice and handle their cleanup when the gid is removed or the MACsec SA is deleted. Also support alias IP for the MACsec device, as long as we don't have more ips than what the gid table can hold. In addition handle the case where a gid is added but there are still no SAs added for the MACsec device, so the rules are added later on when the SAs are added. Signed-off-by: Patrisious Haddad <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-08-20net/mlx5: Add RoCE MACsec steering infrastructure in corePatrisious Haddad3-0/+36
Adds all the core steering helper functions that are needed in order to setup RoCE steering rules which includes both the RX and TX rules addition and deletion. As well as exporting the function to be ready to use from the IB driver where we expose functions to allow deletion of all rules, which is needed when a GID is deleted, or a deletion of a specific rule when an SA is deleted, and a similar manner for the rules addition. These functions are used in a later patch by IB driver to trigger the rules addition/deletion when needed. Signed-off-by: Patrisious Haddad <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-08-20net/mlx5: Add MACsec priorities in RDMA namespacesPatrisious Haddad1-0/+2
Add MACsec flow steering priorities in RDMA namespaces. This allows adding tables/rules to forward RoCEv2 traffic to the MACsec crypto tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain. Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-08-20RDMA/mlx5: Implement MACsec gid addition and deletionPatrisious Haddad1-0/+44
Handle MACsec IP ambiguity issue, since mlx5 hw can't support programming both the MACsec and the physical gid when they have the same IP address, because it wouldn't know to whom to steer the traffic. Hence in such case we delete the physical gid from the hw gid table, which would then cause all traffic sent over it to fail, and we'll only be able to send traffic over the MACsec gid. Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Raed Salem <[email protected]> Reviewed-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-08-20net/mlx5e: Move MACsec flow steering and statistics database from ethernet ↵Patrisious Haddad1-0/+3
to core Since now MACsec flow steering (macsec_fs) and MACsec statistics (stats) are maintained by the core driver, move their data as well to be saved inside core structures instead of staying part of ethernet MACsec database. In addition cleanup all MACsec stats functions from the ethernet MACsec code and move what's needed to be part of macsec_fs instead. Signed-off-by: Patrisious Haddad <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-08-20macsec: add functions to get macsec real netdevice and check offloadPatrisious Haddad1-0/+2
Given a macsec net_device add two functions to return the real net_device for that device, and check if that macsec device is offloaded or not. This is needed for auxiliary drivers that implement MACsec offload, but have flows which are triggered over the macsec net_device, this allows the drivers in such cases to verify if the device is offloaded or not, and to access the real device of that macsec device, which would belong to the driver, and would be needed for the offload procedure. Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Raed Salem <[email protected]> Reviewed-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-08-20Merge tag 'tty-6.5-rc7' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small tty and serial core fixes for 6.5-rc7 that resolve a lot of reported issues. Primarily in here are the fixes for the serial bus code from Tony that came in -rc1, as it hit wider testing with the huge number of different types of systems and serial ports. All of the reported issues with duplicate names and other issues with this code are now resolved. Other than that included in here is: - n_gsm fix for a previous fix - 8250 lockdep annotation fix - fsl_lpuart serial driver fix - TIOCSTI documentation update for previous CAP_SYS_ADMIN change All of these have been in linux-next for a while with no reported problems" * tag 'tty-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: core: Fix serial core port id, including multiport devices serial: 8250: drop lockdep annotation from serial8250_clear_IER() tty: n_gsm: fix the UAF caused by race condition in gsm_cleanup_mux serial: core: Revert port_id use TIOCSTI: Document CAP_SYS_ADMIN behaviour in Kconfig serial: 8250: Fix oops for port->pm on uart_change_pm() serial: 8250: Reinit port_id when adding back serial8250_isa_devs serial: core: Fix kmemleak issue for serial core device remove MAINTAINERS: Merge TTY layer and serial drivers serial: core: Fix serial_base_match() after fixing controller port name serial: core: Fix serial core controller port name to show controller id serial: core: Fix serial core port id to not use port->line serial: core: Controller id cannot be negative tty: serial: fsl_lpuart: Clear the error flags by writing 1 for lpuart32 platforms
2023-08-19Merge tag 'fbdev-for-6.5-rc7' of ↵Linus Torvalds1-12/+0
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes and cleanups from Helge Deller: - various code cleanups in amifb, atmel_lcdfb, ssd1307fb, kyro and goldfishfb * tag 'fbdev-for-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: goldfishfb: Do not check 0 for platform_get_irq() fbdev: atmel_lcdfb: Remove redundant of_match_ptr() fbdev: kyro: Remove unused declarations fbdev: ssd1307fb: Print the PWM's label instead of its number fbdev: mmp: fix value check in mmphw_probe() fbdev: amifb: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper
2023-08-19net: add skb_queue_purge_reason and __skb_queue_purge_reasonEric Dumazet2-4/+22
skb_queue_purge() and __skb_queue_purge() become wrappers around the new generic functions. New SKB_DROP_REASON_QUEUE_PURGE drop reason is added, but users can start adding more specific reasons. Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-19net/smc: Extend SMCR v2 linkgroup netlink attributeGuangguan Wang1-0/+2
Add SMC_NLA_LGR_R_V2_MAX_CONNS and SMC_NLA_LGR_R_V2_MAX_LINKS to SMCR v2 linkgroup netlink attribute SMC_NLA_LGR_R_V2 for linkgroup's detail info showing. Signed-off-by: Guangguan Wang <[email protected]> Reviewed-by: Jan Karcher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-18sock: annotate data-races around prot->memory_pressureEric Dumazet1-3/+4
*prot->memory_pressure is read/writen locklessly, we need to add proper annotations. A recent commit added a new race, it is time to audit all accesses. Fixes: 2d0c88e84e48 ("sock: Fix misuse of sk_under_memory_pressure()") Fixes: 4d93df0abd50 ("[SCTP]: Rewrite of sctp buffer management code") Signed-off-by: Eric Dumazet <[email protected]> Cc: Abel Wu <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-18ASoC: cs42l43: Add support for the cs42l43Charles Keepax1-0/+17
The CS42L43 is an audio CODEC with integrated MIPI SoundWire interface (Version 1.2.1 compliant), I2C, SPI, and I2S/TDM interfaces designed for portable applications. It provides a high dynamic range, stereo DAC for headphone output, two integrated Class D amplifiers for loudspeakers, and two ADCs for wired headset microphone input or stereo line input. PDM inputs are provided for digital microphones. The ASoC component provides the majority of the functionality of the device, all the audio functions. Signed-off-by: Charles Keepax <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-08-18Merge branch '40GbE' of ↵Jakub Kicinski1-45/+82
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== virtchnl: fix fake 1-elem arrays Alexander Lobakin says: 6.5-rc1 started spitting warning splats when composing virtchnl messages, precisely on virtchnl_rss_key and virtchnl_lut: [ 84.167709] memcpy: detected field-spanning write (size 52) of single field "vrk->key" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1095 (size 1) [ 84.169915] WARNING: CPU: 3 PID: 11 at drivers/net/ethernet/intel/ iavf/iavf_virtchnl.c:1095 iavf_set_rss_key+0x123/0x140 [iavf] ... [ 84.191982] Call Trace: [ 84.192439] <TASK> [ 84.192900] ? __warn+0xc9/0x1a0 [ 84.193353] ? iavf_set_rss_key+0x123/0x140 [iavf] [ 84.193818] ? report_bug+0x12c/0x1b0 [ 84.194266] ? handle_bug+0x42/0x70 [ 84.194714] ? exc_invalid_op+0x1a/0x50 [ 84.195149] ? asm_exc_invalid_op+0x1a/0x20 [ 84.195592] ? iavf_set_rss_key+0x123/0x140 [iavf] [ 84.196033] iavf_watchdog_task+0xb0c/0xe00 [iavf] ... [ 84.225476] memcpy: detected field-spanning write (size 64) of single field "vrl->lut" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1127 (size 1) [ 84.227190] WARNING: CPU: 27 PID: 1044 at drivers/net/ethernet/intel/ iavf/iavf_virtchnl.c:1127 iavf_set_rss_lut+0x123/0x140 [iavf] ... [ 84.246601] Call Trace: [ 84.247228] <TASK> [ 84.247840] ? __warn+0xc9/0x1a0 [ 84.248263] ? iavf_set_rss_lut+0x123/0x140 [iavf] [ 84.248698] ? report_bug+0x12c/0x1b0 [ 84.249122] ? handle_bug+0x42/0x70 [ 84.249549] ? exc_invalid_op+0x1a/0x50 [ 84.249970] ? asm_exc_invalid_op+0x1a/0x20 [ 84.250390] ? iavf_set_rss_lut+0x123/0x140 [iavf] [ 84.250820] iavf_watchdog_task+0xb16/0xe00 [iavf] Gustavo already tried to fix those back in 2021[0][1]. Unfortunately, a VM can run a different kernel than the host, meaning that those structures are sorta ABI. However, it is possible to have proper flex arrays + struct_size() calculations and still send the very same messages with the same sizes. The common rule is: elem[1] -> elem[] size = struct_size() + <difference between the old and the new msg size> The "old" size in the current code is calculated 3 different ways for 10 virtchnl structures total. Each commit addresses one of the ways cumulatively instead of per-structure. I was planning to send it to -net initially, but given that virtchnl was renamed from i40evf and got some fat style cleanup commits in the past, it's not very straightforward to even pick appropriate SHAs, not speaking of automatic portability. I may send manual backports for a couple of the latest supported kernels later on if anyone needs it at all. [0] https://lore.kernel.org/all/20210525230912.GA175802@embeddedor [1] https://lore.kernel.org/all/20210525231851.GA176647@embeddedor * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: virtchnl: fix fake 1-elem arrays for structures allocated as `nents` virtchnl: fix fake 1-elem arrays in structures allocated as `nents + 1` virtchnl: fix fake 1-elem arrays in structs allocated as `nents + 1` - 1 ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-18Add cs42l43 PC focused SoundWire CODECMark Brown28-105/+1445
Merge series from Charles Keepax <[email protected]>: This patch chain adds support for the Cirrus Logic cs42l43 PC focused SoundWire CODEC.
2023-08-18regulator: tps65086: Select dedicated regulator config for chip variantAndre Werner1-0/+3
Some configurations differ between chip variants, e,g. the register to control the on of state of LDOA1 and SWB2. Thus, it is necessary to choose the correct configuration for a dedicated device. If the wrong configuration was used, the LDOA1 output that was disabled by the bootloader was enabled in Kernel again. Each chip variant gets its dedicated configuration selected by the chip ID previously collected from MFD probe function. The VTT enum value (tps65086_regulators) is shifted because not all chip variants have a separate SWB2 switch. Sometimes they are merged. So the configuration possibilities differ, thus the regulator configuration arrays have a different length. Signed-off-by: Andre Werner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-08-18regulator: Merge dependency for tps65086Mark Brown1-6/+14
Merge Lee Jones' tag 'ib-mfd-regulator-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into [email protected] so we can build the tps65086 support: Immutable branch between MFD and Regulator due for the v6.6 merge window
2023-08-18mfd: 88pm860x: Remove unused extern declarationsYue Haibing1-6/+0
commit 260a127bfbeb ("mfd: 88pm860x-i2c: Purge unused functions") left behind this. Signed-off-by: Yue Haibing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18mfd: ab8500: Remove unused extern declarationsYue Haibing1-4/+0
commit d28f1db8187d ("mfd: Remove confusing ab8500-i2c file and merge into ab8500-core") left behind this. Signed-off-by: Yue Haibing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18mfd: max77686: Remove unused extern declarationsYue Haibing1-4/+0
max77686_irq_init() and max77686_irq_exit() are not used since commit 6f1c1e71d933 ("mfd: max77686: Convert to use regmap_irq"). And max77686_irq_resume() never be implemented since introduced in commit dae8a969d512 ("mfd: Add Maxim 77686 driver"). Signed-off-by: Yue Haibing <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Chanwoo Choi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18mfd: rz-mtu3: Link time dependenciesArnd Bergmann1-66/+0
The new set of drivers for RZ/G2L MTU3a tries to enable compile-testing the individual client drivers even when the MFD portion is disabled but gets it wrong, causing a link failure when the core is in a loadable module but the other drivers are built-in: x86_64-linux-ld: drivers/pwm/pwm-rz-mtu3.o: in function `rz_mtu3_pwm_apply': pwm-rz-mtu3.c:(.text+0x4bf): undefined reference to `rz_mtu3_8bit_ch_write' x86_64-linux-ld: pwm-rz-mtu3.c:(.text+0x509): undefined reference to `rz_mtu3_disable' arm-linux-gnueabi-ld: drivers/counter/rz-mtu3-cnt.o: in function `rz_mtu3_cascade_counts_enable_get': rz-mtu3-cnt.c:(.text+0xbec): undefined reference to `rz_mtu3_shared_reg_read' It seems better not to add the extra complexity here but instead just use a normal hard dependency, so remove the #else portion in the header along with the "|| COMPILE_TEST". This could also be fixed by having slightly more elaborate Kconfig dependencies or using the cursed 'IS_REACHABLE()' helper, but in practice it's already possible to compile-test all these drivers by enabling the mtd portion. Fixes: 254d3a727421c ("pwm: Add Renesas RZ/G2L MTU3a PWM driver") Fixes: 0be8907359df4 ("counter: Add Renesas RZ/G2L MTU3a counter driver") Fixes: 654c293e1687b ("mfd: Add Renesas RZ/G2L MTU3a core driver") Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Thierry Reding <[email protected]> Reviewed-by: Biju Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18mfd: db8500-prcmu: Remove unused inline functionsYueHaibing1-21/+0
Since commit b0e846248de5 ("mfd: db8500-prcmu: Remove dead code for a non-existing config") these inline helpers also no need any more. Signed-off-by: YueHaibing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18mfd: hi655x-pmic: Convert to devm_platform_ioremap_resource()Yangtao Li1-1/+0
Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18Merge tags 'ib-mfd-pinctrl-soundwire-v6.6' and 'ib-mfd-regulator-v6.6' into ↵Lee Jones1-6/+14
ibs-for-mfd-merged Immutable branch between MFD, Pinctrl and soundwire due for the v6.6 merge window Immutable branch between MFD and Regulator due for the v6.6 merge window
2023-08-18mfd: tps65086: Read DEVICE ID register 1 from deviceAndre Werner1-6/+14
This commit prepares a following commit for the regulator part of the MFD. The driver should support different device chips that differ in their register definitions, for instance to control LDOA1 and SWB2. So it is necessary to use a dedicated regulator description for a specific device variant. Thus, the content from DEVICEID Register 1 is used to choose a dedicated configuration between the different device variants. Signed-off-by: Andre Werner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-08-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski8-17/+17
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/sfc/tc.c fa165e194997 ("sfc: don't unregister flow_indr if it was never registered") 3bf969e88ada ("sfc: add MAE table machinery for conntrack table") https://lore.kernel.org/all/[email protected]/ No adjacent changes. Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-18nmi_backtrace: allow excluding an arbitrary CPUDouglas Anderson1-7/+7
The APIs that allow backtracing across CPUs have always had a way to exclude the current CPU. This convenience means callers didn't need to find a place to allocate a CPU mask just to handle the common case. Let's extend the API to take a CPU ID to exclude instead of just a boolean. This isn't any more complex for the API to handle and allows the hardlockup detector to exclude a different CPU (the one it already did a trace for) without needing to find space for a CPU mask. Arguably, this new API also encourages safer behavior. Specifically if the caller wants to avoid tracing the current CPU (maybe because they already traced the current CPU) this makes it more obvious to the caller that they need to make sure that the current CPU ID can't change. [[email protected]: fix trigger_allbutcpu_cpu_backtrace() stub] Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid Signed-off-by: Douglas Anderson <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: kernel test robot <[email protected]> Cc: Lecopzer Chen <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Pingfan Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18range.h: Move resource API and constant to respective filesAndy Shevchenko2-8/+2
range.h works with struct range data type. The resource_size_t is an alien here. (1) Move cap_resource() implementation into its only user, and (2) rename and move RESOURCE_SIZE_MAX to limits.h. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Cc: Alexander Sverdlin <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18kthread: unexport __kthread_should_park()Greg Kroah-Hartman1-1/+0
There are no in-kernel users of __kthread_should_park() so mark it as static and do not export it. Link: https://lkml.kernel.org/r/2023080450-handcuff-stump-1d6e@gregkh Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: John Stultz <[email protected]> Cc: "Peter Zijlstra (Intel)" <[email protected]> Cc: "Arve Hjønnevåg" <[email protected]> Cc: Valentin Schneider <[email protected]> Cc: Andrew Morton <[email protected]> Cc: "Christian Brauner (Microsoft)" <[email protected]> Cc: Mike Christie <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Zqiang <[email protected]> Cc: Prathu Baronia <[email protected]> Cc: Sami Tolvanen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18drm/i915: Move abs_diff() to math.hAndy Shevchenko1-0/+19
abs_diff() belongs to math.h. Move it there. This will allow others to use it. [[email protected]: add abs_diff() documentation] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix comment, per Randy] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Jiri Slaby <[email protected]> # tty/serial Acked-by: Jani Nikula <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Reviewed-by: Philipp Zabel <[email protected]> # gpu/ipu-v3 Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Helge Deller <[email protected]> Cc: Imre Deak <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18genetlink: replace custom CONCATENATE() implementationAndy Shevchenko2-18/+17
Replace custom implementation of the macros from args.h. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Gow <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18arm64: smccc: replace custom COUNT_ARGS() & CONCATENATE() implementationsAndy Shevchenko1-37/+32
Replace custom implementation of the macros from args.h. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Gow <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18kernel.h: split out COUNT_ARGS() and CONCATENATE() to args.hAndy Shevchenko5-8/+32
Patch series "kernel.h: Split out a couple of macros to args.h", v4. There are macros in kernel.h that can be used outside of that header. Split them to args.h and replace open coded variants. This patch (of 4): kernel.h is being used as a dump for all kinds of stuff for a long time. The COUNT_ARGS() and CONCATENATE() macros may be used in some places without need of the full kernel.h dependency train with it. Here is the attempt on cleaning it up by splitting out these macros(). While at it, include new header where it's being used. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Steven Rostedt (Google) <[email protected]> Cc: Bjorn Helgaas <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> [PCI] Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Gow <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18powerpc/book3s64/mm: enable transparent pud hugepageAneesh Kumar K.V1-0/+10
This is enabled only with radix translation and 1G hugepage size. This will be used with devdax device memory with a namespace alignment of 1G. Anon transparent hugepage is not supported even though we do have helpers checking pud_trans_huge(). We should never find that return true. The only expected pte bit combination is _PAGE_PTE | _PAGE_DEVMAP. Some of the helpers are never expected to get called on hash translation and hence is marked to call BUG() in such a case. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18powerpc/mm/trace: convert trace event to trace event classAneesh Kumar K.V1-7/+16
A follow-up patch will add a pud variant for this same event. Using event class makes that addition simpler. No functional change in this patch. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm/vmemmap optimization: split hugetlb and devdax vmemmap optimizationAneesh Kumar K.V1-1/+1
Arm disabled hugetlb vmemmap optimization [1] because hugetlb vmemmap optimization includes an update of both the permissions (writeable to read-only) and the output address (pfn) of the vmemmap ptes. That is not supported without unmapping of pte(marking it invalid) by some architectures. With DAX vmemmap optimization we don't require such pte updates and architectures can enable DAX vmemmap optimization while having hugetlb vmemmap optimization disabled. Hence split DAX optimization support into a different config. s390, loongarch and riscv don't have devdax support. So the DAX config is not enabled for them. With this change, arm64 should be able to select DAX optimization [1] commit 060a2c92d1b6 ("arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm/huge pud: use transparent huge pud helpers only with ↵Aneesh Kumar K.V1-0/+2
CONFIG_TRANSPARENT_HUGEPAGE pudp_set_wrprotect and move_huge_pud helpers are only used when CONFIG_TRANSPARENT_HUGEPAGE is enabled. Similar to pmdp_set_wrprotect and move_huge_pmd_helpers use architecture override only if CONFIG_TRANSPARENT_HUGEPAGE is set Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: add pud_same similar to __HAVE_ARCH_P4D_SAMEAneesh Kumar K.V1-0/+3
This helps architectures to override pmd_same and pud_same independently. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm/vmemmap: improve vmemmap_can_optimize and allow architectures to overrideAneesh Kumar K.V1-4/+23
dax vmemmap optimization requires a minimum of 2 PAGE_SIZE area within vmemmap such that tail page mapping can point to the second PAGE_SIZE area. Enforce that in vmemmap_can_optimize() function. Architectures like powerpc also want to enable vmemmap optimization conditionally (only with radix MMU translation). Hence allow architecture override. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: change pudp_huge_get_and_clear_full take vm_area_struct as argAneesh Kumar K.V1-2/+2
We will use this in a later patch to do tlb flush when clearing pud entries on powerpc. This is similar to commit 93a98695f2f9 ("mm: change pmdp_huge_get_and_clear_full take vm_area_struct as arg") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm/hugepage pud: allow arch-specific helper function to check huge page pud ↵Aneesh Kumar K.V1-0/+3
support Patch series "Add support for DAX vmemmap optimization for ppc64", v6. This patch series implements changes required to support DAX vmemmap optimization for ppc64. The vmemmap optimization is only enabled with radix MMU translation and 1GB PUD mapping with 64K page size. The patch series also splits the hugetlb vmemmap optimization as a separate Kconfig variable so that architectures can enable DAX vmemmap optimization without enabling hugetlb vmemmap optimization. This should enable architectures like arm64 to enable DAX vmemmap optimization while they can't enable hugetlb vmemmap optimization. More details of the same are in patch "mm/vmemmap optimization: Split hugetlb and devdax vmemmap optimization". With 64K page size for 16384 pages added (1G) we save 14 pages With 4K page size for 262144 pages added (1G) we save 4094 pages With 4K page size for 512 pages added (2M) we save 6 pages This patch (of 13): Architectures like powerpc would like to enable transparent huge page pud support only with radix translation. To support that add has_transparent_pud_hugepage() helper that architectures can override. [[email protected]: use the new has_transparent_pud_hugepage()] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dan Williams <[email protected]> Cc: Joao Martins <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: allow per-VMA locks on file-backed VMAsMatthew Wilcox (Oracle)2-18/+0
Remove the TCP layering violation by allowing per-VMA locks on all VMAs. The fault path will immediately fail in handle_mm_fault(). There may be a small performance reduction from this patch as a little unnecessary work will be done on each page fault. See later patches for the improvement. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Suren Baghdasaryan <[email protected]> Cc: Arjun Roy <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Punit Agrawal <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: remove CONFIG_PER_VMA_LOCK ifdefsMatthew Wilcox (Oracle)1-0/+6
Patch series "Handle most file-backed faults under the VMA lock", v3. This patchset adds the ability to handle page faults on parts of files which are already in the page cache without taking the mmap lock. This patch (of 10): Provide lock_vma_under_rcu() when CONFIG_PER_VMA_LOCK is not defined to eliminate ifdefs in the users. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Suren Baghdasaryan <[email protected]> Cc: Punit Agrawal <[email protected]> Cc: Arjun Roy <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18maple_tree: re-introduce entry to mas_preallocate() argumentsLiam R. Howlett1-1/+1
The current preallocation strategy is to preallocate the absolute worst-case allocation for a tree modification. The entry (or NULL) is needed to know how many nodes are needed to write to the tree. Start by adding the argument to the mas_preallocate() definition. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18maple_tree: introduce __mas_set_range()Liam R. Howlett1-3/+18
mas_set_range() resets the node to MAS_START, which will cause a re-walk of the tree to the range. This is unnecessary when the maple state is already at the correct location of the write. Add a function that only sets the range to avoid unnecessary re-walking of the tree. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: change do_vmi_align_munmap() tracking of VMAs to removeLiam R. Howlett1-2/+2
The majority of the calls to munmap a vm range is within a single vma. The maple tree is able to store a single entry at 0, with a size of 1 as a pointer and avoid any allocations. Change do_vmi_align_munmap() to store the VMAs being munmap()'ed into a tree indexed by the count. This will leverage the ability to store the first entry without a node allocation. Storing the entries into a tree by the count and not the vma start and end means changing the functions which iterate over the entries. Update unmap_vmas() and free_pgtables() to take a maple state and a tree end address to support this functionality. Passing through the same maple state to unmap_vmas() and free_pgtables() means the state needs to be reset between calls. This happens in the static unmap_region() and exit_mmap(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: don't drop VMA locks in mm_drop_all_locks()Jann Horn2-0/+13
Despite its name, mm_drop_all_locks() does not drop _all_ locks; the mmap lock is held write-locked by the caller, and the caller is responsible for dropping the mmap lock at a later point (which will also release the VMA locks). Calling vma_end_write_all() here is dangerous because the caller might have write-locked a VMA with the expectation that it will stay write-locked until the mmap_lock is released, as usual. This _almost_ becomes a problem in the following scenario: An anonymous VMA A and an SGX VMA B are mapped adjacent to each other. Userspace calls munmap() on a range starting at the start address of A and ending in the middle of B. Hypothetical call graph with additional notes in brackets: do_vmi_align_munmap [begin first for_each_vma_range loop] vma_start_write [on VMA A] vma_mark_detached [on VMA A] __split_vma [on VMA B] sgx_vma_open [== new->vm_ops->open] sgx_encl_mm_add __mmu_notifier_register [luckily THIS CAN'T ACTUALLY HAPPEN] mm_take_all_locks mm_drop_all_locks vma_end_write_all [drops VMA lock taken on VMA A before] vma_start_write [on VMA B] vma_mark_detached [on VMA B] [end first for_each_vma_range loop] vma_iter_clear_gfp [removes VMAs from maple tree] mmap_write_downgrade unmap_region mmap_read_unlock In this hypothetical scenario, while do_vmi_align_munmap() thinks it still holds a VMA write lock on VMA A, the VMA write lock has actually been invalidated inside __split_vma(). The call from sgx_encl_mm_add() to __mmu_notifier_register() can't actually happen here, as far as I understand, because we are duplicating an existing SGX VMA, but sgx_encl_mm_add() only calls __mmu_notifier_register() for the first SGX VMA created in a given process. So this could only happen in fork(), not on munmap(). But in my view it is just pure luck that this can't happen. Also, we wouldn't actually have any bad consequences from this in do_vmi_align_munmap(), because by the time the bug drops the lock on VMA A, we've already marked VMA A as detached, which makes it completely ineligible for any VMA-locked page faults. But again, that's just pure luck. So remove the vma_end_write_all(), so that VMA write locks are only ever released on mmap_write_unlock() or mmap_write_downgrade(). Also add comments to document the locking rules established by this patch. Link: https://lkml.kernel.org/r/[email protected] Fixes: eeff9a5d47f8 ("mm/mmap: prevent pagefault handler from racing with mmu_notifier registration") Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm/page_io: introduce bio_first_folio_all()ZhangPeng1-0/+5
Introduce bio_first_folio_all() to return a folio, which makes it easier to use. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: ZhangPeng <[email protected]> Suggested-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Nanyong Sun <[email protected]> Cc: Sidhartha Kumar <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mm: fix obsolete function name above debug_pagealloc_enabled_static()Miaohe Lin1-2/+2
Since commit 04013513cc84 ("mm, page_alloc: do not rely on the order of page_poison and init_on_alloc/free parameters"), init_debug_pagealloc() is converted to init_mem_debugging_and_hardening(). Later it's renamed to mem_debugging_and_hardening_init() via commit f2fc4b44ec2b ("mm: move init_mem_debugging_and_hardening() to mm/mm_init.c"). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-18mmu_notifiers: rename invalidate_range notifierAlistair Popple1-24/+24
There are two main use cases for mmu notifiers. One is by KVM which uses mmu_notifier_invalidate_range_start()/end() to manage a software TLB. The other is to manage hardware TLBs which need to use the invalidate_range() callback because HW can establish new TLB entries at any time. Hence using start/end() can lead to memory corruption as these callbacks happen too soon/late during page unmap. mmu notifier users should therefore either use the start()/end() callbacks or the invalidate_range() callbacks. To make this usage clearer rename the invalidate_range() callback to arch_invalidate_secondary_tlbs() and update documention. Link: https://lkml.kernel.org/r/6f77248cd25545c8020a54b4e567e8b72be4dca1.1690292440.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <[email protected]> Suggested-by: Jason Gunthorpe <[email protected]> Acked-by: Catalin Marinas <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Cc: Andrew Donnellan <[email protected]> Cc: Chaitanya Kumar Borah <[email protected]> Cc: Frederic Barrat <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kevin Tian <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Nicolin Chen <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: SeongJae Park <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zhi Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>