blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2023-12-06	units: add missing header	Andy Shevchenko	1	-0/+1
	BITS_PER_BYTE is defined in bits.h. Link: https://lkml.kernel.org/r/[email protected] Fixes: e8eed5f7366f ("units: Add BYTES_PER_*BIT") Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Cc: Damian Muszynski <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Herbert Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-06	hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write	Mike Kravetz	1	-4/+1
	The routine __vma_private_lock tests for the existence of a reserve map associated with a private hugetlb mapping. A pointer to the reserve map is in vma->vm_private_data. __vma_private_lock was checking the pointer for NULL. However, it is possible that the low bits of the pointer could be used as flags. In such instances, vm_private_data is not NULL and not a valid pointer. This results in the null-ptr-deref reported by syzbot: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef] CPU: 0 PID: 5048 Comm: syz-executor139 Not tainted 6.6.0-rc7-syzkaller-00142-g88 8cf78c29e2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 1 0/09/2023 RIP: 0010:__lock_acquire+0x109/0x5de0 kernel/locking/lockdep.c:5004 ... Call Trace: <TASK> lock_acquire kernel/locking/lockdep.c:5753 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718 down_write+0x93/0x200 kernel/locking/rwsem.c:1573 hugetlb_vma_lock_write mm/hugetlb.c:300 [inline] hugetlb_vma_lock_write+0xae/0x100 mm/hugetlb.c:291 __hugetlb_zap_begin+0x1e9/0x2b0 mm/hugetlb.c:5447 hugetlb_zap_begin include/linux/hugetlb.h:258 [inline] unmap_vmas+0x2f4/0x470 mm/memory.c:1733 exit_mmap+0x1ad/0xa60 mm/mmap.c:3230 __mmput+0x12a/0x4d0 kernel/fork.c:1349 mmput+0x62/0x70 kernel/fork.c:1371 exit_mm kernel/exit.c:567 [inline] do_exit+0x9ad/0x2a20 kernel/exit.c:861 __do_sys_exit kernel/exit.c:991 [inline] __se_sys_exit kernel/exit.c:989 [inline] __x64_sys_exit+0x42/0x50 kernel/exit.c:989 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Mask off low bit flags before checking for NULL pointer. In addition, the reserve map only 'belongs' to the OWNER (parent in parent/child relationships) so also check for the OWNER flag. Link: https://lkml.kernel.org/r/[email protected] Reported-by: [email protected] Closes: https://lore.kernel.org/linux-mm/[email protected]/ Fixes: bf4916922c60 ("hugetlbfs: extend hugetlb_vma_lock to private VMAs") Signed-off-by: Mike Kravetz <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: Edward Adam Davis <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Tom Rix <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-06	bpf: Fix prog_array_map_poke_run map poke update	Jiri Olsa	1	-0/+3
	Lee pointed out issue found by syscaller [0] hitting BUG in prog array map poke update in prog_array_map_poke_run function due to error value returned from bpf_arch_text_poke function. There's race window where bpf_arch_text_poke can fail due to missing bpf program kallsym symbols, which is accounted for with check for -EINVAL in that BUG_ON call. The problem is that in such case we won't update the tail call jump and cause imbalance for the next tail call update check which will fail with -EBUSY in bpf_arch_text_poke. I'm hitting following race during the program load: CPU 0 CPU 1 bpf_prog_load bpf_check do_misc_fixups prog_array_map_poke_track map_update_elem bpf_fd_array_map_update_elem prog_array_map_poke_run bpf_arch_text_poke returns -EINVAL bpf_prog_kallsyms_add After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next poke update fails on expected jump instruction check in bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run. Similar race exists on the program unload. Fixing this by moving the update to bpf_arch_poke_desc_update function which makes sure we call __bpf_arch_text_poke that skips the bpf address check. Each architecture has slightly different approach wrt looking up bpf address in bpf_arch_text_poke, so instead of splitting the function or adding new 'checkip' argument in previous version, it seems best to move the whole map_poke_run update as arch specific code. [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810 Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Reported-by: [email protected] Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Cc: Lee Jones <[email protected]> Cc: Maciej Fijalkowski <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-12-06	thermal: sysfs: Rework the handling of trip point updates	Rafael J. Wysocki	1	-4/+0
	Both trip_point_temp_store() and trip_point_hyst_store() use thermal_zone_set_trip() to update a given trip point, but none of them actually needs to change more than one field in struct thermal_trip representing it. However, each of them effectively calls __thermal_zone_get_trip() twice in a row for the same trip index value, once directly and once via thermal_zone_set_trip(), which is not particularly efficient, and the way in which thermal_zone_set_trip() carries out the update is not particularly straightforward. Moreover, input processing need not be done under the thermal zone lock in any of these functions. Rework trip_point_temp_store() and trip_point_hyst_store() to address the above, move the part of thermal_zone_set_trip() that is still useful to a new function called thermal_zone_trip_updated() and drop the rest of it. While at it, make trip_point_hyst_store() reject negative hysteresis values. Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Daniel Lezcano <[email protected]>
2023-12-06	cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check	Waiman Long	2	-1/+9
	Currently, the cpu_is_isolated() function checks only the statically isolated CPUs specified via the "isolcpus" and "nohz_full" kernel command line options. This function is used by vmstat and memcg to reduce interference with isolated CPUs by not doing stat flushing or scheduling works on those CPUs. Workloads running on isolated CPUs within isolated cpuset partitions should receive the same treatment to reduce unnecessary interference. This patch introduces a new cpuset_cpu_is_isolated() function to be called by cpu_is_isolated() so that the set of dynamically created cpuset isolated CPUs will be included in the check. Assuming that testing a bit in a cpumask is atomic, no synchronization primitive is currently used to synchronize access to the cpuset's isolated_cpus mask. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2023-12-06	bpf,lsm: add BPF token LSM hooks	Andrii Nakryiko	3	-0/+33
	Wire up bpf_token_create and bpf_token_free LSM hooks, which allow to allocate LSM security blob (we add `void *security` field to struct bpf_token for that), but also control who can instantiate BPF token. This follows existing pattern for BPF map and BPF prog. Also add security_bpf_token_allow_cmd() and security_bpf_token_capable() LSM hooks that allow LSM implementation to control and negate (if necessary) BPF token's delegation of a specific bpf_cmd and capability, respectively. Acked-by: Paul Moore <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf,lsm: refactor bpf_map_alloc/bpf_map_free LSM hooks	Andrii Nakryiko	2	-4/+7
	Similarly to bpf_prog_alloc LSM hook, rename and extend bpf_map_alloc hook into bpf_map_create, taking not just struct bpf_map, but also bpf_attr and bpf_token, to give a fuller context to LSMs. Unlike bpf_prog_alloc, there is no need to move the hook around, as it currently is firing right before allocating BPF map ID and FD, which seems to be a sweet spot. But like bpf_prog_alloc/bpf_prog_free combo, make sure that bpf_map_free LSM hook is called even if bpf_map_create hook returned error, as if few LSMs are combined together it could be that one LSM successfully allocated security blob for its needs, while subsequent LSM rejected BPF map creation. The former LSM would still need to free up LSM blob, so we need to ensure security_bpf_map_free() is called regardless of the outcome. Acked-by: Paul Moore <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf,lsm: refactor bpf_prog_alloc/bpf_prog_free LSM hooks	Andrii Nakryiko	2	-7/+10
	Based on upstream discussion ([0]), rework existing bpf_prog_alloc_security LSM hook. Rename it to bpf_prog_load and instead of passing bpf_prog_aux, pass proper bpf_prog pointer for a full BPF program struct. Also, we pass bpf_attr union with all the user-provided arguments for BPF_PROG_LOAD command. This will give LSMs as much information as we can basically provide. The hook is also BPF token-aware now, and optional bpf_token struct is passed as a third argument. bpf_prog_load LSM hook is called after a bunch of sanity checks were performed, bpf_prog and bpf_prog_aux were allocated and filled out, but right before performing full-fledged BPF verification step. bpf_prog_free LSM hook is now accepting struct bpf_prog argument, for consistency. SELinux code is adjusted to all new names, types, and signatures. Note, given that bpf_prog_load (previously bpf_prog_alloc) hook can be used by some LSMs to allocate extra security blob, but also by other LSMs to reject BPF program loading, we need to make sure that bpf_prog_free LSM hook is called after bpf_prog_load/bpf_prog_alloc one even if the hook itself returned error. If we don't do that, we run the risk of leaking memory. This seems to be possible today when combining SELinux and BPF LSM, as one example, depending on their relative ordering. Also, for BPF LSM setup, add bpf_prog_load and bpf_prog_free to sleepable LSM hooks list, as they are both executed in sleepable context. Also drop bpf_prog_load hook from untrusted, as there is no issue with refcount or anything else anymore, that originally forced us to add it to untrusted list in c0c852dd1876 ("bpf: Do not mark certain LSM hook arguments as trusted"). We now trigger this hook much later and it should not be an issue anymore. [0] https://lore.kernel.org/bpf/[email protected]/ Acked-by: Paul Moore <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf: consistently use BPF token throughout BPF verifier logic	Andrii Nakryiko	2	-9/+9
	Remove remaining direct queries to perfmon_capable() and bpf_capable() in BPF verifier logic and instead use BPF token (if available) to make decisions about privileges. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf: take into account BPF token when fetching helper protos	Andrii Nakryiko	1	-2/+3
	Instead of performing unconditional system-wide bpf_capable() and perfmon_capable() calls inside bpf_base_func_proto() function (and other similar ones) to determine eligibility of a given BPF helper for a given program, use previously recorded BPF token during BPF_PROG_LOAD command handling to inform the decision. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf: add BPF token support to BPF_PROG_LOAD command	Andrii Nakryiko	1	-0/+6
	Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf: add BPF token support to BPF_MAP_CREATE command	Andrii Nakryiko	1	-0/+2
	Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf: introduce BPF token object	Andrii Nakryiko	1	-0/+41
	Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a trusted unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to also have necessary combination of capabilities inside that user namespace. So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	bpf: add BPF token delegation mount options to BPF FS	Andrii Nakryiko	1	-0/+10
	Add few new mount options to BPF FS that allow to specify that a given BPF FS instance allows creation of BPF token (added in the next patch), and what sort of operations are allowed under BPF token. As such, we get 4 new mount options, each is a bit mask - `delegate_cmds` allow to specify which bpf() syscall commands are allowed with BPF token derived from this BPF FS instance; - if BPF_MAP_CREATE command is allowed, `delegate_maps` specifies a set of allowable BPF map types that could be created with BPF token; - if BPF_PROG_LOAD command is allowed, `delegate_progs` specifies a set of allowable BPF program types that could be loaded with BPF token; - if BPF_PROG_LOAD command is allowed, `delegate_attachs` specifies a set of allowable BPF program attach types that could be loaded with BPF token; delegate_progs and delegate_attachs are meant to be used together, as full BPF program type is, in general, determined through both program type and program attach type. Currently, these mount options accept the following forms of values: - a special value "any", that enables all possible values of a given bit set; - numeric value (decimal or hexadecimal, determined by kernel automatically) that specifies a bit mask value directly; - all the values for a given mount option are combined, if specified multiple times. E.g., `mount -t bpf nodev /path/to/mount -o delegate_maps=0x1 -o delegate_maps=0x2` will result in a combined 0x3 mask. Ideally, more convenient (for humans) symbolic form derived from corresponding UAPI enums would be accepted (e.g., `-o delegate_progs=kprobe\|tracepoint`) and I intend to implement this, but it requires a bunch of UAPI header churn, so I postponed it until this feature lands upstream or at least there is a definite consensus that this feature is acceptable and is going to make it, just to minimize amount of wasted effort and not increase amount of non-essential code to be reviewed. Attentive reader will notice that BPF FS is now marked as FS_USERNS_MOUNT, which theoretically makes it mountable inside non-init user namespace as long as the process has sufficient namespaced capabilities within that user namespace. But in reality we still restrict BPF FS to be mountable only by processes with CAP_SYS_ADMIN in init userns (extra check in bpf_fill_super()). FS_USERNS_MOUNT is added to allow creating BPF FS context object (i.e., fsopen("bpf")) from inside unprivileged process inside non-init userns, to capture that userns as the owning userns. It will still be required to pass this context object back to privileged process to instantiate and mount it. This manipulation is important, because capturing non-init userns as the owning userns of BPF FS instance (super block) allows to use that userns to constraint BPF token to that userns later on (see next patch). So creating BPF FS with delegation inside unprivileged userns will restrict derived BPF token objects to only "work" inside that intended userns, making it scoped to a intended "container". Also, setting these delegation options requires capable(CAP_SYS_ADMIN), so unprivileged process cannot set this up without involvement of a privileged process. There is a set of selftests at the end of the patch set that simulates this sequence of steps and validates that everything works as intended. But careful review is requested to make sure there are no missed gaps in the implementation and testing. This somewhat subtle set of aspects is the result of previous discussions ([0]) about various user namespace implications and interactions with BPF token functionality and is necessary to contain BPF token inside intended user namespace. [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ Acked-by: Christian Brauner <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-06	ACPI: bus: update acpi_dev_hid_uid_match() to support multiple types	Raag Jadav	1	-6/+1
	Now that we have _UID matching support for both integer and string types, we can support them into acpi_dev_hid_uid_match() helper as well. Signed-off-by: Raag Jadav <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2023-12-06	ACPI: bus: update acpi_dev_uid_match() to support multiple types	Raag Jadav	1	-5/+3
	According to the ACPI specification, a _UID object can evaluate to either a numeric value or a string. Update acpi_dev_uid_match() to support _UID matching for both integer and string types. Suggested-by: Mika Westerberg <[email protected]> Signed-off-by: Raag Jadav <[email protected]> [ rjw: Rename auxiliary macros, relocate kerneldoc comment ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2023-12-06	Merge tag 'ffa-fixes-6.7' of ↵	Arnd Bergmann	1	-0/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fixes for v6.7 A bunch of fixes addressing issues around the notification support that was added this cycle. They address issue in partition IDs handling in ffa_notification_info_get(), notifications cleanup path and the size of the allocation in ffa_partitions_cleanup(). It also adds check for the notification enabled state so that the drivers registering the callbacks can be rejected if not enabled/supported. It also moves the partitions setup operation after the notification initialisation so that the driver has the correct state for notification enabled/supported before the partitions are initialised/setup. It also now allows FF-A initialisation to complete successfully even when the notification initialisation fails as it is an optional support in the specification. Initial support just allowed it only if the firmware didn't support notifications. Finally, it also adds a fix for smatch warning by declaring ffa_bus_type structure in the header. * tag 'ffa-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Fix ffa_notification_info_get() IDs handling firmware: arm_ffa: Fix the size of the allocation in ffa_partitions_cleanup() firmware: arm_ffa: Fix FFA notifications cleanup path firmware: arm_ffa: Add checks for the notification enabled state firmware: arm_ffa: Setup the partitions after the notification initialisation firmware: arm_ffa: Allow FF-A initialisation even when notification fails firmware: arm_ffa: Declare ffa_bus_type structure in the header Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2023-12-06	Merge tag 'asahi-soc-mailbox-6.8' of https://github.com/AsahiLinux/linux ↵	Arnd Bergmann	2	-37/+0
	into soc/drivers Apple SoC mailbox updates for 6.8 This moves the mailbox driver out of the mailbox subsystem and into SoC, next to its only consumer (RTKit). It has been cooking in linux-next for a long while, so it's time to pull it in. * tag 'asahi-soc-mailbox-6.8' of https://github.com/AsahiLinux/linux: soc: apple: mailbox: Add explicit include of platform_device.h soc: apple: mailbox: Rename config symbol to APPLE_MAILBOX mailbox: apple: Delete driver soc: apple: rtkit: Port to the internal mailbox driver soc: apple: mailbox: Add ASC/M3 mailbox driver soc: apple: rtkit: Get rid of apple_rtkit_send_message_wait Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2023-12-06	cpu/hotplug: Remove unused CPU hotplug states	Zenghui Yu	1	-14/+0
	There are unused hotplug states which either have never been used or the removal of the usage did not remove the state constant. Drop them to reduce the size of the cpuhp_hp_states array. Signed-off-by: Zenghui Yu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-12-06	regulator: event: Add regulator netlink event support	Naresh Solanki	1	-46/+1
	This commit introduces netlink event support to the regulator subsystem. Changes: - Introduce event.c and regnl.h for netlink event handling. - Implement reg_generate_netlink_event to broadcast regulator events. - Update Makefile to include the new event.c file. Signed-off-by: Naresh Solanki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-12-06	net/tcp: Don't store TCP-AO maclen on reqsk	Dmitry Safonov	1	-6/+2
	This extra check doesn't work for a handshake when SYN segment has (current_key.maclen != rnext_key.maclen). It could be amended to preserve rnext_key.maclen instead of current_key.maclen, but that requires a lookup on listen socket. Originally, this extra maclen check was introduced just because it was cheap. Drop it and convert tcp_request_sock::maclen into boolean tcp_request_sock::used_tcp_ao. Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets") Signed-off-by: Dmitry Safonov <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-12-06	mm/slab: move the rest of slub_def.h to mm/slab.h	Vlastimil Babka	1	-150/+0
	mm/slab.h is the only place to include include/linux/slub_def.h which has allowed switching between SLAB and SLUB. Now we can simply move the contents over and remove slub_def.h. Use this opportunity to fix up some whitespace (alignment) issues. Reviewed-by: Kees Cook <[email protected]> Acked-by: David Rientjes <[email protected]> Tested-by: David Rientjes <[email protected]> Reviewed-by: Hyeonggon Yoo <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
2023-12-06	mm/slab: move struct kmem_cache_cpu declaration to slub.c	Vlastimil Babka	1	-54/+0
	Nothing outside SLUB itself accesses the struct kmem_cache_cpu fields so it does not need to be declared in slub_def.h. This allows also to move enum stat_item. Reviewed-by: Kees Cook <[email protected]> Acked-by: David Rientjes <[email protected]> Tested-by: David Rientjes <[email protected]> Reviewed-by: Hyeonggon Yoo <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
2023-12-06	mm/slab: remove mm/slab.c and slab_def.h	Vlastimil Babka	1	-124/+0
	Remove the SLAB implementation. Update CREDITS. Also update and properly sort the SLOB entry there. RIP SLAB allocator (1996 - 2024) Reviewed-by: Kees Cook <[email protected]> Acked-by: Christoph Lameter <[email protected]> Acked-by: David Rientjes <[email protected]> Tested-by: David Rientjes <[email protected]> Acked-by: Hyeonggon Yoo <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
2023-12-06	HID: Intel-ish-hid: Ishtp: Add helper functions for client connection	Even Xu	1	-0/+3
	For every ishtp client driver during initialization state, the flow is: 1 - Allocate an ISHTP client instance 2 - Reserve a host id and link the client instance 3 - Search a firmware client using UUID and get related client information 4 - Bind firmware client id to the ISHTP client instance 5 - Set the state the ISHTP client instance to CONNECTING 6 - Send connect request to firmware 7 - Register event callback for messages from the firmware During deinitizalization state, the flow is: 9 - Set the state the ISHTP client instance to ISHTP_CL_DISCONNECTING 10 - Issue disconnect request to firmware 11 - Unlike the client instance 12 - Flush message queue 13 - Free ISHTP client instance Step 2-7 are identical to the steps of client driver initialization and driver reset flow, but reallocation of the RX/TX ring buffers can be avoided in reset flow. Also for step 9-12, they are identical to the steps of client driver failure handling after connect request, driver reset flow and driver removing. So, add two helper functions to simplify client driver code. ishtp_cl_establish_connection() ishtp_cl_destroy_connection() No functional changes are expected. Signed-off-by: Even Xu <[email protected]> Acked-by: Srinivas Pandruvada <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2023-12-06	r8152: add vendor/device ID pair for ASUS USB-C2500	Kelly Kane	1	-0/+1
	The ASUS USB-C2500 is an RTL8156 based 2.5G Ethernet controller. Add the vendor and product ID values to the driver. This makes Ethernet work with the adapter. Signed-off-by: Kelly Kane <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-12-05	net: core: synchronize link-watch when carrier is queried	Johannes Berg	1	-0/+9
	There are multiple ways to query for the carrier state: through rtnetlink, sysfs, and (possibly) ethtool. Synchronize linkwatch work before these operations so that we don't have a situation where userspace queries the carrier state between the driver's carrier off->on transition and linkwatch running and expects it to work, when really (at least) TX cannot work until linkwatch has run. I previously posted a longer explanation of how this applies to wireless [1] but with this wireless can simply query the state before sending data, to ensure the kernel is ready for it. [1] https://lore.kernel.org/all/[email protected]/ Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Link: https://lore.kernel.org/r/20231204214706.303c62768415.I1caedccae72ee5a45c9085c5eb49c145ce1c0dd5@changeid Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-05	tcp: reorganize tcp_sock fast path variables	Coco Li	1	-114/+134
	The variables are organized according in the following way: - TX read-mostly hotpath cache lines - TXRX read-mostly hotpath cache lines - RX read-mostly hotpath cache lines - TX read-write hotpath cache line - TXRX read-write hotpath cache line - RX read-write hotpath cache line Fastpath cachelines end after rcvq_space. Cache line boundaries are enforced only between read-mostly and read-write. That is, if read-mostly tx cachelines bleed into read-mostly txrx cachelines, we do not care. We care about the boundaries between read and write cachelines because we want to prevent false sharing. Fast path variables span cache lines before change: 12 Fast path variables span cache lines after change: 8 Suggested-by: Eric Dumazet <[email protected]> Reviewed-by: Wei Wang <[email protected]> Signed-off-by: Coco Li <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-05	net-device: reorganize net_device fast path variables	Coco Li	1	-53/+64
	Reorganize fast path variables on tx-txrx-rx order Fastpath variables end after npinfo. Below data generated with pahole on x86 architecture. Fast path variables span cache lines before change: 12 Fast path variables span cache lines after change: 4 Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Coco Li <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-06	drivers: base: add arch_cpu_is_hotpluggable()	Russell King (Oracle)	1	-0/+1
	The differences between architecture specific implementations of arch_register_cpu() are down to whether the CPU is hotpluggable or not. Rather than overriding the weak version of arch_register_cpu(), provide a function that can be used to provide this detail instead. Reviewed-by: Shaoqin Huang <[email protected]> Signed-off-by: "Russell King (Oracle)" <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Gavin Shan <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-06	drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden	James Morse	1	-0/+4
	Architectures often have extra per-cpu work that needs doing before a CPU is registered, often to determine if a CPU is hotpluggable. To allow the ACPI architectures to use GENERIC_CPU_DEVICES, move the cpu_register() call into arch_register_cpu(), which is made __weak so architectures with extra work can override it. This aligns with the way x86, ia64 and loongarch register hotplug CPUs when they become present. Signed-off-by: James Morse <[email protected]> Reviewed-by: Shaoqin Huang <[email protected]> Reviewed-by: Gavin Shan <[email protected]> Signed-off-by: "Russell King (Oracle)" <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-05	bpf: support non-r10 register spill/fill to/from stack in precision tracking	Andrii Nakryiko	1	-4/+27
	Use instruction (jump) history to record instructions that performed register spill/fill to/from stack, regardless if this was done through read-only r10 register, or any other register after copying r10 into it and potentially adjusting offset. To make this work reliably, we push extra per-instruction flags into instruction history, encoding stack slot index (spi) and stack frame number in extra 10 bit flags we take away from prev_idx in instruction history. We don't touch idx field for maximum performance, as it's checked most frequently during backtracking. This change removes basically the last remaining practical limitation of precision backtracking logic in BPF verifier. It fixes known deficiencies, but also opens up new opportunities to reduce number of verified states, explored in the subsequent patches. There are only three differences in selftests' BPF object files according to veristat, all in the positive direction (less states). File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) -------------------------------------- ------------- --------- --------- ------------- ---------- ---------- ------------- test_cls_redirect_dynptr.bpf.linked3.o cls_redirect 2987 2864 -123 (-4.12%) 240 231 -9 (-3.75%) xdp_synproxy_kern.bpf.linked3.o syncookie_tc 82848 82661 -187 (-0.23%) 5107 5073 -34 (-0.67%) xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 85116 84964 -152 (-0.18%) 5162 5130 -32 (-0.62%) Note, I avoided renaming jmp_history to more generic insn_hist to minimize number of lines changed and potential merge conflicts between bpf and bpf-next trees. Notice also cur_hist_entry pointer reset to NULL at the beginning of instruction verification loop. This pointer avoids the problem of relying on last jump history entry's insn_idx to determine whether we already have entry for current instruction or not. It can happen that we added jump history entry because current instruction is_jmp_point(), but also we need to add instruction flags for stack access. In this case, we don't want to entries, so we need to reuse last added entry, if it is present. Relying on insn_idx comparison has the same ambiguity problem as the one that was fixed recently in [0], so we avoid that. [0] https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ Acked-by: Eduard Zingerman <[email protected]> Reported-by: Tao Lyu <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-05	drivers: perf: arm_pmu: Drop 'pmu_lock' element from 'struct pmu_hw_events'	Anshuman Khandual	1	-6/+0
	As 'pmu_lock' element is not being used in any ARM PMU implementation, just drop this from 'struct pmu_hw_events'. Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2023-12-05	iov_iter: replace import_single_range() with import_ubuf()	Jens Axboe	1	-2/+0
	With the removal of the 'iov' argument to import_single_range(), the two functions are now fully identical. Convert the import_single_range() callers to import_ubuf(), and remove the former fully. Signed-off-by: Jens Axboe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-12-05	iov_iter: remove unused 'iov' argument from import_single_range()	Jens Axboe	1	-1/+1
	It is entirely unused, just get rid of it. Signed-off-by: Jens Axboe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-12-05	mm/slab: remove CONFIG_SLAB code from slab common code	Vlastimil Babka	1	-12/+2
	In slab_common.c and slab.h headers, we can now remove all code behind CONFIG_SLAB and CONFIG_DEBUG_SLAB ifdefs, and remove all CONFIG_SLUB ifdefs. Reviewed-by: Kees Cook <[email protected]> Acked-by: David Rientjes <[email protected]> Tested-by: David Rientjes <[email protected]> Reviewed-by: Hyeonggon Yoo <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
2023-12-05	cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks	Vlastimil Babka	2	-9/+0
	The CPUHP_SLAB_PREPARE hooks are only used by SLAB which is removed. SLUB defines them as NULL, so we can remove those altogether. Acked-by: Thomas Gleixner <[email protected]> Acked-by: David Rientjes <[email protected]> Tested-by: David Rientjes <[email protected]> Reviewed-by: Hyeonggon Yoo <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
2023-12-04	net/mlx5e: Tidy up IPsec NAT-T SA discovery	Leon Romanovsky	1	-1/+1
	IPsec NAT-T packets are UDP encapsulated packets over ESP normal ones. In case they arrive to RX, the SPI and ESP are located in inner header, while the check was performed on outer header instead. That wrong check caused to the situation where received rekeying request was missed and caused to rekey timeout, which "compensated" this failure by completing rekeying. Fixes: d65954934937 ("net/mlx5e: Support IPsec NAT-T functionality") Signed-off-by: Leon Romanovsky <[email protected]>
2023-12-04	net/mlx5e: Honor user choice of IPsec replay window size	Leon Romanovsky	1	-0/+7
	Users can configure IPsec replay window size, but mlx5 driver didn't honor their choice and set always 32bits. Fix assignment logic to configure right size from the beginning. Fixes: 7db21ef4566e ("net/mlx5e: Set IPsec replay sequence numbers") Reviewed-by: Patrisious Haddad <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-12-04	net: stmmac: fix FPE events losing	Jianheng Zhang	1	-0/+1
	The status bits of register MAC_FPE_CTRL_STS are clear on read. Using 32-bit read for MAC_FPE_CTRL_STS in dwmac5_fpe_configure() and dwmac5_fpe_send_mpacket() clear the status bits. Then the stmmac interrupt handler missing FPE event status and leads to FPE handshaking failure and retries. To avoid clear status bits of MAC_FPE_CTRL_STS in dwmac5_fpe_configure() and dwmac5_fpe_send_mpacket(), add fpe_csr to stmmac_fpe_cfg structure to cache the control bits of MAC_FPE_CTRL_STS and to avoid reading MAC_FPE_CTRL_STS in those methods. Fixes: 5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure") Reviewed-by: Serge Semin <[email protected]> Signed-off-by: Jianheng Zhang <[email protected]> Link: https://lore.kernel.org/r/CY5PR12MB637225A7CF529D5BE0FBE59CBF81A@CY5PR12MB6372.namprd12.prod.outlook.com Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-04	net: Add NAPI IRQ support	Amritha Nambiar	1	-0/+6
	Add support to associate the interrupt vector number for a NAPI instance. Signed-off-by: Amritha Nambiar <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Link: https://lore.kernel.org/r/170147334728.5260.13221803396905901904.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-04	net: Add queue and napi association	Amritha Nambiar	1	-0/+8
	Add the napi pointer in netdev queue for tracking the napi instance for each queue. This achieves the queue<->napi mapping. Signed-off-by: Amritha Nambiar <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Link: https://lore.kernel.org/r/170147331483.5260.15723438819994285695.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-04	bpf: Optimize the free of inner map	Hou Tao	1	-0/+2
	When removing the inner map from the outer map, the inner map will be freed after one RCU grace period and one RCU tasks trace grace period, so it is certain that the bpf program, which may access the inner map, has exited before the inner map is freed. However there is no need to wait for one RCU tasks trace grace period if the outer map is only accessed by non-sleepable program. So adding sleepable_refcnt in bpf_map and increasing sleepable_refcnt when adding the outer map into env->used_maps for sleepable program. Although the max number of bpf program is INT_MAX - 1, the number of bpf programs which are being loaded may be greater than INT_MAX, so using atomic64_t instead of atomic_t for sleepable_refcnt. When removing the inner map from the outer map, using sleepable_refcnt to decide whether or not a RCU tasks trace grace period is needed before freeing the inner map. Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-04	bpf: Defer the free of inner map when necessary	Hou Tao	1	-1/+6
	When updating or deleting an inner map in map array or map htab, the map may still be accessed by non-sleepable program or sleepable program. However bpf_map_fd_put_ptr() decreases the ref-counter of the inner map directly through bpf_map_put(), if the ref-counter is the last one (which is true for most cases), the inner map will be freed by ops->map_free() in a kworker. But for now, most .map_free() callbacks don't use synchronize_rcu() or its variants to wait for the elapse of a RCU grace period, so after the invocation of ops->map_free completes, the bpf program which is accessing the inner map may incur use-after-free problem. Fix the free of inner map by invoking bpf_map_free_deferred() after both one RCU grace period and one tasks trace RCU grace period if the inner map has been removed from the outer map before. The deferment is accomplished by using call_rcu() or call_rcu_tasks_trace() when releasing the last ref-counter of bpf map. The newly-added rcu_head field in bpf_map shares the same storage space with work field to reduce the size of bpf_map. Fixes: bba1dc0b55ac ("bpf: Remove redundant synchronize_rcu.") Fixes: 638e4b825d52 ("bpf: Allows per-cpu maps and map-in-map in sleepable programs") Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-04	bpf: Add map and need_defer parameters to .map_fd_put_ptr()	Hou Tao	1	-1/+5
	map is the pointer of outer map, and need_defer needs some explanation. need_defer tells the implementation to defer the reference release of the passed element and ensure that the element is still alive before the bpf program, which may manipulate it, exits. The following three cases will invoke map_fd_put_ptr() and different need_defer values will be passed to these callers: 1) release the reference of the old element in the map during map update or map deletion. The release must be deferred, otherwise the bpf program may incur use-after-free problem, so need_defer needs to be true. 2) release the reference of the to-be-added element in the error path of map update. The to-be-added element is not visible to any bpf program, so it is OK to pass false for need_defer parameter. 3) release the references of all elements in the map during map release. Any bpf program which has access to the map must have been exited and released, so need_defer=false will be OK. These two parameters will be used by the following patches to fix the potential use-after-free problem for map-in-map. Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-04	vfio/migration: Add debugfs to live migration driver	Longfang Liu	1	-0/+7
	There are multiple devices, software and operational steps involved in the process of live migration. An error occurred on any node may cause the live migration operation to fail. This complex process makes it very difficult to locate and analyze the cause when the function fails. In order to quickly locate the cause of the problem when the live migration fails, I added a set of debugfs to the vfio live migration driver. +-------------------------------------------+ \| \| \| \| \| QEMU \| \| \| \| \| +---+----------------------------+----------+ \| ^ \| ^ \| \| \| \| \| \| \| \| v \| v \| +---------+--+ +---------+--+ \|src vfio_dev\| \|dst vfio_dev\| +--+---------+ +--+---------+ \| ^ \| ^ \| \| \| \| v \| \| \| +-----------+----+ +-----------+----+ \|src dev debugfs \| \|dst dev debugfs \| +----------------+ +----------------+ The entire debugfs directory will be based on the definition of the CONFIG_DEBUG_FS macro. If this macro is not enabled, the interfaces in vfio.h will be empty definitions, and the creation and initialization of the debugfs directory will not be executed. vfio \| +---<dev_name1> \| +---migration \| +--state \| +---<dev_name2> +---migration +--state debugfs will create a public root directory "vfio" file. then create a dev_name() file for each live migration device. First, create a unified state acquisition file of "migration" in this device directory. Then, create a public live migration state lookup file "state". Signed-off-by: Longfang Liu <[email protected]> Reviewed-by: Cédric Le Goater <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-12-04	pinctrl: Convert unsigned to unsigned int	Andy Shevchenko	5	-39/+39
	Simple type conversion with no functional change implied. While at it, adjust indentation where it makes sense. Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2023-12-04	usb: typec: tcpci: add vconn over current fault handling to maxim_core	RD Babiera	1	-1/+4
	Add TCPC_FAULT_STATUS_VCONN_OC constant and corresponding mask definition. Maxim TCPC is capable of detecting VConn over current faults, so add fault to alert mask. When a Vconn over current fault is triggered, put the port in an error recovery state via tcpm_port_error_recovery. Signed-off-by: RD Babiera <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Reviewed-by: Heikki Krogerus <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-04	usb: typec: tcpm: add tcpm_port_error_recovery symbol	RD Babiera	1	-0/+1
	Add tcpm_port_error_recovery symbol and corresponding event that runs in tcpm_pd_event handler to set the port to the ERROR_RECOVERY state. tcpci drivers can use the symbol to reset the port when tcpc faults affect port functionality. Signed-off-by: RD Babiera <[email protected]> Reviewed-by: Heikki Krogerus <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-04	usb: core: Allow subclassed USB drivers to override usb_choose_configuration()	Douglas Anderson	1	-0/+6
	For some USB devices we might want to do something different for usb_choose_configuration(). One example here is the r8152 driver where we want to end up using the vendor driver with the preferred interface. The r8152 driver tried to make things work by implementing a USB generic_subclass driver and then overriding the normal config selection after it happened. This is less than ideal and also caused breakage if someone deauthorized and re-authorized the USB device because the USB core ended up going back to it's default logic for choosing the best config. I made an attempt to fix this [1] but it was a bit ugly. Let's do this better and allow USB generic_subclass drivers to override usb_choose_configuration(). [1] https://lore.kernel.org/r/20231130154337.1.Ie00e07f07f87149c9ce0b27ae4e26991d307e14b@changeid Suggested-by: Alan Stern <[email protected]> Signed-off-by: Douglas Anderson <[email protected]> Reviewed-by: Alan Stern <[email protected]> Link: https://lore.kernel.org/r/20231201102946.v2.2.Iade5fa31997f1a0ca3e1dec0591633b02471df12@changeid Signed-off-by: Greg Kroah-Hartman <[email protected]>