path: root/include/linux
Age  Commit message  Author  Files  Lines
2023-03-23vhost_task: Allow vhost layer to use copy_processMike Christie1-0/+23
Qemu will create vhost devices in the kernel which perform network, SCSI, etc IO and management operations from worker threads created by the kthread API. Because the kthread API does a copy_process on the kthreadd thread, the vhost layer has to use kthread_use_mm to access the Qemu thread's memory and cgroup_attach_task_all to add itself to the Qemu thread's cgroups, and it bypasses the RLIMIT_NPROC limit which can result in VMs creating more threads than the admin expected. This patch adds a new struct vhost_task which can be used instead of kthreads. They allow the vhost layer to use copy_process and inherit the userspace process's mm and cgroups, the task is accounted for under the userspace's nproc count and can be seen in its process tree, and other features like namespaces work and are inherited by default. Signed-off-by: Mike Christie <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2023-03-23mmc: core: add helpers mmc_regulator_enable/disable_vqmmcHeiner Kallweit1-0/+3
There's a number of drivers (e.g. dw_mmc, meson-gx, mmci, sunxi) using the same mechanism and a private flag vqmmc_enabled to deal with enabling/disabling the vqmmc regulator. Move this to the core and create new helpers mmc_regulator_enable_vqmmc and mmc_regulator_disable_vqmmc. Signed-off-by: Heiner Kallweit <[email protected]> Acked-by: Martin Blumenstingl <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]>
2023-03-23net/mlx5: Expose bits for enabling out-of-order by defaultOr Har-Toov1-3/+7
Add needed HW bits for enabling out-of-order by default and use go_back_n when out-of-order is not needed. Signed-off-by: Or Har-Toov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/75d6dfe263989a05c08c43406132b336ea12d00a.1679230449.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]>
2023-03-22bpf: Update the struct_ops of a bpf_link.Kui-Feng Lee1-0/+3
By improving the BPF_LINK_UPDATE command of bpf(), it should allow you to conveniently switch between different struct_ops on a single bpf_link. This would enable smoother transitions from one struct_ops to another. The struct_ops maps passing along with BPF_LINK_UPDATE should have the BPF_F_LINK flag. Signed-off-by: Kui-Feng Lee <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-03-22bpf: Create links for BPF struct_ops maps.Kui-Feng Lee1-0/+7
Make bpf_link support struct_ops. Previously, struct_ops were always used alone without any associated links. Upon updating its value, a struct_ops would be activated automatically. Other BPF program types, by contrast, had to create a bpf_link with their instances before they could become active. Now you can create an inactive struct_ops and create a link to activate it later. With bpf_links, struct_ops behaves similarly to other BPF program types. You can pin/unpin them from their links, and the struct_ops will be deactivated when its link is removed, whereas previously someone had to delete the value for it to be deactivated. bpf_links are responsible for registering their associated struct_ops. You can only use a struct_ops that has the BPF_F_LINK flag set to create a bpf_link, while a struct_ops without this flag behaves in the same manner as before and is registered upon updating its value. The BPF_LINK_TYPE_STRUCT_OPS serves a dual purpose. Not only is it used to craft the links for BPF struct_ops programs, but also to create links for BPF struct_ops themselves. Since the links of BPF struct_ops programs are only used to create trampolines internally, they are never seen in other contexts. Thus, they can be reused for struct_ops themselves. To maintain a reference to the map supporting this link, we add bpf_struct_ops_link as an additional type. The pointer to the map is RCU-protected and won't be necessary until later in the patchset. Signed-off-by: Kui-Feng Lee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-03-22bpf: Retire the struct_ops map kvalue->refcnt.Kui-Feng Lee1-0/+1
We have replaced kvalue->refcnt with synchronize_rcu() to wait for an RCU grace period. Maintenance of kvalue->refcnt was a complicated task, as we had to simultaneously keep track of two reference counts: kvalue->refcnt itself and the reference count of bpf_map. When kvalue->refcnt reaches zero, we also have to reduce the reference count on bpf_map, yet these steps are not performed atomically and require us to be vigilant when managing them. By eliminating kvalue->refcnt, maintenance becomes more straightforward, as only the refcount of bpf_map has to be managed. To prevent the trampoline image of a struct_ops from being released while it is still in use, we wait for an RCU grace period. The setsockopt(TCP_CONGESTION, "...") command allows you to change your socket's congestion control algorithm and can result in releasing the old struct_ops implementation. That is fine. However, this function is also exposed through bpf_setsockopt(), so it may be called by BPF programs as well. To ensure that the trampoline image belonging to a struct_ops can be safely called while its method is in use, the trampoline safeguards the BPF program with rcu_read_lock(). Doing so prevents any destruction of the associated images before returning from a trampoline and requires us to wait for an RCU grace period. Signed-off-by: Kui-Feng Lee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-03-22net: phylink: remove an_enabledRussell King (Oracle)1-2/+0
The Autoneg bit in the advertising bitmap and state->an_enabled are always identical. state->an_enabled is now no longer used by any drivers, so let's kill this duplication. Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-22netdev: Enforce index cap in netdev_get_tx_queueNick Child1-0/+1
When requesting a TX queue at a given index, warn on out-of-bounds referencing if the index is greater than the allocated number of queues. Specifically, since this function is used heavily in the networking stack, use DEBUG_NET_WARN_ON_ONCE to avoid executing a new branch on every packet. Signed-off-by: Nick Child <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
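The debug-only check described above can be sketched in userspace C. This is a rough model, not the kernel code: `MODEL_DEBUG_NET`, `MODEL_DEBUG_WARN_ON_ONCE`, `struct model_dev` and `model_get_tx_queue()` are hypothetical names, and the `index >= num_tx_queues` condition is an assumption about what counts as out of bounds. The point it illustrates is that the check compiles to nothing unless the debug option is enabled, so the hot path gains no branch.

```c
#include <stdio.h>

/* Hypothetical stand-in for DEBUG_NET_WARN_ON_ONCE(): warns once when the
 * debug option is defined, otherwise only type-checks the condition. */
#ifdef MODEL_DEBUG_NET
#define MODEL_DEBUG_WARN_ON_ONCE(cond)                           \
	do {                                                     \
		static int warned;                               \
		if ((cond) && !warned) {                         \
			warned = 1;                              \
			fprintf(stderr, "warning: %s\n", #cond); \
		}                                                \
	} while (0)
#else
/* sizeof() does not evaluate its operand, so no code is generated. */
#define MODEL_DEBUG_WARN_ON_ONCE(cond) ((void)sizeof(cond))
#endif

struct model_queue {
	int id;
};

struct model_dev {
	unsigned int num_tx_queues;
	struct model_queue tx[4];
};

/* Return the TX queue at index; the bounds check exists only in debug builds. */
static struct model_queue *model_get_tx_queue(struct model_dev *dev,
					      unsigned int index)
{
	/* Assumption: an index equal to the queue count is already out of bounds. */
	MODEL_DEBUG_WARN_ON_ONCE(index >= dev->num_tx_queues);
	return &dev->tx[index];
}
```

Building the sketch with `-DMODEL_DEBUG_NET` turns the warning on; without it the lookup is a plain array index.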
2023-03-23ata: drop unused ata_id_is_lba_capacity_ok()Sergey Shtylyov1-50/+0
This function was renamed from lba_capacity_is_ok() and moved from drivers/ide/ to <linux/ata.h> but it never got used by libata, thus it became useless after drivers/ide/ removal... Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: drop unused ata_id_to_hd_driveid()Sergey Shtylyov1-21/+0
This function was renamed from ide_id_to_hd_driveid() and moved from drivers/ide/ to <linux/ata.h> but it never got used by libata, thus it became useless after drivers/ide/ removal... Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: move pata_parport.h to drivers/ata/pata_parportOndrej Zary1-96/+0
Now that paride is gone, pata_parport.h does not need to be in include/linux. Move it to drivers/ata/pata_parport. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove scratch parameter from test_proto()Ondrej Zary1-1/+1
Don't pass around a pointer to scratch buffer. Use local buffers in protocols that need it. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove verbose parameter from test_proto()Ondrej Zary1-1/+1
verbose parameter of test_proto() is now unused, remove it. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove scratch parameter from log_adapter()Ondrej Zary1-1/+1
scratch parameter of log_adapter() is only used by bpck driver. Remove it. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove verbose parameter from log_adapter()Ondrej Zary1-1/+1
verbose parameter of log_adapter() is unused, remove it. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove typedef struct PIAOndrej Zary1-2/+0
Remove typedef struct PIA and use struct pi_adapter directly. Fix formatting (excessive spaces) while at it. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove device from struct pi_adapterOndrej Zary1-1/+0
device is never set in pata_parport, remove it. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: remove devtype from struct pi_adapterOndrej Zary1-3/+0
Only bpck driver uses devtype but it never gets set in pata_parport. Remove it. As most bpck devices are CD-ROMs, always run the code that depends on devtype == PI_PCD. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: Introduce module_pata_parport_driver macroOndrej Zary1-3/+11
Introduce module_pata_parport_driver macro and use it in protocol drivers to reduce boilerplate code. Remove paride_(un)register compatibility defines. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-23ata: pata_parport: Remove pi_swab16 and pi_swab32Ondrej Zary1-17/+0
Convert comm and kbic drivers to use standard swab16. Remove pi_swab16 and pi_swab32. Reviewed-by: Sergey Shtylyov <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-03-22bpf: return long from bpf_map_ops funcsJP Kobryn2-10/+10
This patch changes the return types of bpf_map_ops functions to long, where previously int was returned. Using long allows for bpf programs to maintain the sign bit in the absence of sign extension during situations where inlined bpf helper funcs make calls to the bpf_map_ops funcs and a negative error is returned. The definitions of the helper funcs are generated from comments in the bpf uapi header at `include/uapi/linux/bpf.h`. The return type of these helpers was previously changed from int to long in commit bdb7b79b4ce8. For any case where one of the map helpers calls the bpf_map_ops funcs that are still returning 32-bit int, a compiler might not include sign extension instructions to properly convert the 32-bit negative value to a 64-bit negative value.
For example, bpf assembly excerpt of an inlined helper calling a kernel function and checking for a specific error:
    ; err = bpf_map_update_elem(&mymap, &key, &val, BPF_NOEXIST);
    ...
    46: call 0xffffffffe103291c ; htab_map_update_elem
    ; if (err && err != -EEXIST) {
    4b: cmp $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax
kernel function assembly excerpt of return value from `htab_map_update_elem` returning 32-bit int:
    movl $0xffffffef, %r9d
    ...
    movl %r9d, %eax
...results in the comparison:
    cmp $0xffffffffffffffef, $0x00000000ffffffef
Fixes: bdb7b79b4ce8 ("bpf: Switch most helper return values from 32-bit int to 64-bit long") Tested-by: Eduard Zingerman <[email protected]> Signed-off-by: JP Kobryn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
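The failed comparison in the excerpt above can be reproduced in plain C. This is a minimal userspace sketch, not kernel code: the function names are hypothetical, and the zero-extension is modelled explicitly with a cast, on the assumption that the caller reads the full 64-bit register after a 32-bit `movl`.

```c
#include <errno.h>
#include <stdint.h>

/* Models a callee returning a 32-bit int: the caller reads the whole 64-bit
 * register, whose upper half was cleared by the 32-bit move. */
static int64_t seen_after_int_return(int32_t err)
{
	return (int64_t)(uint32_t)err;   /* zero-extended */
}

/* Models a callee returning long: the value is sign-extended to 64 bits. */
static int64_t seen_after_long_return(int32_t err)
{
	return (int64_t)err;             /* sign-extended */
}

/* The check from the commit message: err && err != -EEXIST. */
static int is_unexpected_error(int64_t err)
{
	return err && err != -EEXIST;
}
```

With the zero-extended value, `-EEXIST` is seen as `0x00000000ffffffef`, so `is_unexpected_error()` reports an expected error as unexpected; with the sign-extended value the check behaves as intended.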
2023-03-22Merge tag 'regmap-no-status' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into gpio/for-nextBartosz Golaszewski1-0/+2
regmap: Add no_status support This patch adds support for devices which don't support readback of individual interrupt statuses; we report all interrupts as firing and hope the consumers do the right thing.
2023-03-22livepatch,sched: Add livepatch task switching to cond_resched()Josh Poimboeuf3-5/+45
There have been reports [1][2] of live patches failing to complete within a reasonable amount of time due to CPU-bound kthreads. Fix it by patching tasks in cond_resched(). There are four different flavors of cond_resched(), depending on the kernel configuration. Hook into all of them. A more elegant solution might be to use a preempt notifier. However, non-ORC unwinders can't unwind a preempted task reliably. [1] https://lore.kernel.org/lkml/[email protected]/ [2] https://lkml.kernel.org/lkml/[email protected] Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Tested-by: Seth Forshee (DigitalOcean) <[email protected]> Link: https://lore.kernel.org/r/4ae981466b7814ec221014fc2554b2f86f3fb70b.1677257135.git.jpoimboe@kernel.org
2023-03-22thermal: core: Introduce thermal_cooling_device_update()Rafael J. Wysocki1-0/+1
Introduce a core thermal API function, thermal_cooling_device_update(), for updating the max_state value for a cooling device and rearranging its statistics in sysfs after a possible change of its ->get_max_state() callback return value. That callback is now invoked only once, during cooling device registration, to populate the max_state field in the cooling device object, so if its return value changes, it needs to be invoked again and the new return value needs to be stored as max_state. Moreover, the statistics presented in sysfs need to be rearranged in general, because there may not be enough room in them to store data for all of the possible states (in the case when max_state grows). The new function takes care of that (and some other minor things related to it), but some extra locking and lockdep annotations are added in several places too to protect against crashes in the cases when the statistics are not present or when a stale max_state value might be used by sysfs attributes. Note that the actual user of the new function will be added separately. Link: https://lore.kernel.org/linux-pm/[email protected]/ Signed-off-by: Rafael J. Wysocki <[email protected]> Tested-by: Zhang Rui <[email protected]> Reviewed-by: Zhang Rui <[email protected]>
2023-03-22nvme-tcp: fix nvme_tcp_term_pdu to match specCaleb Sander1-2/+3
The FEI field of C2HTermReq/H2CTermReq is 4 bytes but not 4-byte-aligned in the NVMe/TCP specification (it is located at offset 10 in the PDU). Split it into two 16-bit integers in struct nvme_tcp_term_pdu so no padding is inserted. There should also be 10 reserved bytes after. There are currently no users of this type. Fixes: fc221d05447aa6db ("nvme-tcp: Add protocol header") Reported-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Caleb Sander <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
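The padding problem can be checked with `offsetof`. The structs below are a userspace model under stated assumptions: fixed-width integer types stand in for the kernel's `__le16`/`__le32`, the 8-byte header is modelled on `struct nvme_tcp_hdr`, and the "before" layout (a 32-bit FEI followed by 8 reserved bytes) is reconstructed from the description rather than copied from the source.

```c
#include <stddef.h>
#include <stdint.h>

/* 8-byte common PDU header (model). */
struct model_tcp_hdr {
	uint8_t  type;
	uint8_t  flags;
	uint8_t  hlen;
	uint8_t  pdo;
	uint32_t plen;
};

/* Before (assumed layout): on ABIs where uint32_t is 4-byte aligned, the
 * compiler inserts 2 bytes of padding after fes, moving fei to offset 12. */
struct model_term_pdu_before {
	struct model_tcp_hdr hdr;
	uint16_t fes;
	uint32_t fei;
	uint8_t  rsvd[8];
};

/* After: two 16-bit halves need no padding, so FEI starts at offset 10. */
struct model_term_pdu_after {
	struct model_tcp_hdr hdr;
	uint16_t fes;
	uint16_t feil;      /* FEI, first 16 bits on the wire */
	uint16_t feiu;      /* FEI, second 16 bits on the wire */
	uint8_t  rsvd[10];
};

/* Compile-time check that the new layout matches the described offsets. */
_Static_assert(offsetof(struct model_term_pdu_after, feil) == 10,
	       "FEI must start at offset 10");
_Static_assert(sizeof(struct model_term_pdu_after) == 24,
	       "term PDU model must be 24 bytes");
```

Both layouts have the same total size on common 64-bit ABIs, which is probably why the mismatch went unnoticed while the type had no users.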
2023-03-21net: remove rcu_dereference_bh_rtnl()Eric Dumazet1-10/+0
This helper is no longer used in the tree. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-21cxl/pci: Fix CDAT retrieval on big endianLukas Wunner1-2/+6
The CDAT exposed in sysfs differs between little endian and big endian arches: On big endian, every 4 bytes are byte-swapped. PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors such as pci_read_config_dword() implicitly swap bytes on big endian. That way, the macros in include/uapi/linux/pci_regs.h work regardless of the arch's endianness. For an example of implicit byte-swapping, see ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx (Load Word Byte-Reverse Indexed). DOE Read/Write Data Mailbox Registers are unlike other registers in Configuration Space in that they contain or receive a 4 byte portion of an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f). They need to be copied to or from the request/response buffer verbatim. So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit byte-swapping. The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume implicit byte-swapping. Byte-swap requests after constructing them with those macros and byte-swap responses before parsing them. Change the request and response type to __le32 to avoid sparse warnings. Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for consistency. Fixes: c97006046c79 ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Dan Williams <[email protected]> Cc: [email protected] # v6.0+ Reviewed-by: Jonathan Cameron <[email protected]> Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
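The "undo the implicit byte-swapping" step can be illustrated in portable C. This is a sketch under assumptions: `model_cpu_to_le32()` is a hypothetical stand-in for the kernel's `cpu_to_le32()`, and `copy_mailbox_dwords()` is an invented helper, not `pci_doe_recv_resp()`. It assumes each dword arrives as a CPU-endian value, the way a config-space accessor returns it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns a value whose in-memory bytes are the little-endian encoding of v,
 * regardless of host endianness (model of cpu_to_le32()). */
static uint32_t model_cpu_to_le32(uint32_t v)
{
	uint8_t b[4];
	uint32_t out;

	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Copy n dwords read from the mailbox into buf so that buf holds the
 * original byte stream on both little and big endian hosts. */
static void copy_mailbox_dwords(uint8_t *buf, const uint32_t *dwords, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint32_t le = model_cpu_to_le32(dwords[i]);

		memcpy(buf + 4 * i, &le, sizeof(le));
	}
}
```

On a little endian host the conversion is a no-op; on a big endian host it reverses the swap the accessor applied, so the buffer contents are the same on both.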
2023-03-21i3c: Make i3c_master_unregister() return voidUwe Kleine-König1-1/+1
The function returned zero unconditionally. Switch the return type to void and simplify the callers accordingly. Signed-off-by: Uwe Kleine-König <[email protected]> Reviewed-by: Miquel Raynal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-03-21ftrace: Show a list of all functions that have ever been enabledSteven Rostedt (Google)1-1/+4
When debugging a crash that appears to be related to ftrace, but not for sure, it is useful to know if a function was ever enabled by ftrace or not. It could be that a BPF program was attached to it, or possibly a live patch. We are having crashes in the field where this information is not always known. But having ftrace set a flag if a function has ever been attached since boot up helps tremendously in trying to know if a crash had to do with something using ftrace. For analyzing crashes, the use of a kdump image can have access to the flags. When looking at issues where the kernel did not panic, the touched_functions file can simply be used. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Catalin Marinas <[email protected]> Tested-by: Mark Rutland <[email protected]> Tested-by: Chris Li <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21ftrace: selftest: remove broken trace_direct_trampMark Rutland1-0/+2
The ftrace selftest code has a trace_direct_tramp() function which it uses as a direct call trampoline. This happens to work on x86, since the direct call's return address is in the usual place, and can be returned to via a RET, but in general the calling convention for direct calls is different from regular function calls, and requires a trampoline written in assembly. On s390, regular function calls place the return address in %r14, and an ftrace patch-site in an instrumented function places the trampoline's return address (which is within the instrumented function) in %r0, preserving the original %r14 value in-place. As a regular C function will return to the address in %r14, using a C function as the trampoline results in the trampoline returning to the caller of the instrumented function, skipping the body of the instrumented function. Note that the s390 issue is not detected by the ftrace selftest code, as the instrumented function is trivial, and returning back into the caller happens to be equivalent. On arm64, regular function calls place the return address in x30, and an ftrace patch-site in an instrumented function saves this into x9 and places the trampoline's return address (within the instrumented function) in x30. A regular C function will return to the address in x30, but will not restore x9 into x30. Consequently, using a C function as the trampoline results in returning to the trampoline's return address having corrupted x30, such that when the instrumented function returns, it will return back into itself. To avoid future issues in this area, remove the trace_direct_tramp() function, and require that each architecture with direct calls provides a stub trampoline, named ftrace_stub_direct_tramp. This can be written to handle the architecture's trampoline calling convention, and in future could be used elsewhere (e.g. in the ftrace ops sample, to measure the overhead of direct calls), so we may as well always build it in.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mark Rutland <[email protected]> Cc: Li Huafei <[email protected]> Cc: Xu Kuohai <[email protected]> Signed-off-by: Florent Revest <[email protected]> Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGSFlorent Revest1-0/+6
Direct called trampolines can be called in two ways:
- either from the ftrace callsite. In this case, they do not access any struct ftrace_regs nor pt_regs
- or, if a ftrace ops is also attached, from the end of a ftrace trampoline. In this case, the call_direct_funcs ops is in charge of setting the direct call trampoline's address in a struct ftrace_regs
Since commit 9705bc709604 ("ftrace: pass fregs to arch_ftrace_set_direct_caller()"), the latter case no longer requires a full pt_regs. It only needs a struct ftrace_regs so DIRECT_CALLS can work with both WITH_ARGS or WITH_REGS. With architectures like arm64 already abandoning WITH_REGS in favor of WITH_ARGS, it's important to have DIRECT_CALLS work WITH_ARGS only. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Florent Revest <[email protected]> Co-developed-by: Mark Rutland <[email protected]> Signed-off-by: Mark Rutland <[email protected]> Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21ftrace: Store direct called addresses in their opsFlorent Revest1-0/+3
All direct calls are now registered using the register_ftrace_direct API so each ops can jump to only one direct-called trampoline. By storing the direct called trampoline address directly in the ops we can save one hashmap lookup in the direct call ops and implement arm64 direct calls on top of call ops. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Florent Revest <[email protected]> Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIsFlorent Revest1-10/+10
Now that the original _ftrace_direct APIs are gone, the "_multi" suffixes only add confusion. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Florent Revest <[email protected]> Acked-by: Mark Rutland <[email protected]> Tested-by: Mark Rutland <[email protected]> Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21ftrace: Remove the legacy _ftrace_direct APIFlorent Revest1-32/+0
This API relies on a single global ops, used for all direct calls registered with it. However, to implement arm64 direct calls, we need each ops to point to a single direct call trampoline. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Florent Revest <[email protected]> Acked-by: Mark Rutland <[email protected]> Tested-by: Mark Rutland <[email protected]> Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21ftrace: Let unregister_ftrace_direct_multi() call ftrace_free_filter()Florent Revest1-2/+4
A common pattern when using the ftrace_direct_multi API is to unregister the ops and also immediately free its filter. We've noticed it's very easy for users to miss calling ftrace_free_filter(). This adds a "free_filters" argument to unregister_ftrace_direct_multi() to both remind the user they should free filters and also to make their life easier. Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Steven Rostedt <[email protected]> Signed-off-by: Florent Revest <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-03-21firmware: cs_dsp: Introduce no_core_startstop for self-booting DSPsSimon Trimmer1-0/+1
There are devices containing Halo Core DSPs that self-boot, cs_dsp is used to manage the running firmware but the host does not have direct control over starting and stopping the DSP and so cs_dsp should consider the DSP to be always running. Signed-off-by: Simon Trimmer <[email protected]> Signed-off-by: Richard Fitzgerald <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-03-21entry: Fix noinstr warning in __enter_from_user_mode()Josh Poimboeuf2-0/+3
__enter_from_user_mode() is triggering noinstr warnings with CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via ct_state(). The preemption disable isn't needed as interrupts are already disabled. And the context_tracking_enabled() check in ct_state() also isn't needed as that's already being done by the CT_WARN_ON(). Just use __ct_state() instead. Fixes the following warnings:
    vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
    vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
    vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
    vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t") Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org
2023-03-20mm: hugetlb: move hugeltb sysctls to its own fileKefeng Wang1-8/+0
This moves all hugetlb sysctls to their own file, and also removes a useless hugetlb_treat_movable_handler() definition. Signed-off-by: Kefeng Wang <[email protected]> Reviewed-by: Luis Chamberlain <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Reviewed-by: Muchun Song <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>
2023-03-20userfaultfd: move unprivileged_userfaultfd sysctl to its own fileZhangPeng1-2/+0
The sysctl_unprivileged_userfaultfd is part of userfaultfd, move it to its own file. Signed-off-by: ZhangPeng <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>
2023-03-20net: skbuff: move the fields BPF cares about directly next to the offset markerJakub Kicinski1-9/+9
To avoid more possible BPF dependencies with moving bitfields around keep the fields BPF cares about right next to the offset marker. Signed-off-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-03-20net: skbuff: reorder bytes 2 and 3 of the bitfieldJakub Kicinski1-10/+10
BPF needs to know the offsets of fields it tries to access. Zero-length fields are added to make offsetof() work. This unfortunately partitions the bitfield (fields across the zero-length members can't be coalesced). Reorder bytes 2 and 3, BPF needs to know the offset of fields previously in byte 3 and some fields in byte 2 should really be optional. The two bytes are always in the same cacheline so it should not matter. Signed-off-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
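The zero-length marker technique can be sketched with a small struct. This is a model, not `struct sk_buff`: `struct pkt_model`, its field names and `read_mono_tc_byte()` are hypothetical, zero-length arrays are a GNU C extension, and the offsets shown assume GCC or Clang bitfield layout.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt_model {
	uint16_t len;                 /* bytes 0-1 */
	uint8_t  flags_offset[0];     /* marker: offset of the first flag byte */
	uint8_t  cloned:1,
		 nohdr:1;
	uint8_t  mono_tc_offset[0];   /* marker: starts a new byte, splitting the bitfield */
	uint8_t  mono_delivery_time:1,
		 tc_at_ingress:1;
};

/* Read the whole byte that follows the second marker, the way generated
 * code would when it only knows the marker's offset. */
static uint8_t read_mono_tc_byte(const struct pkt_model *p)
{
	return *((const uint8_t *)p + offsetof(struct pkt_model, mono_tc_offset));
}
```

Because a bitfield member has no address, `offsetof()` cannot be applied to it directly; the marker gives the byte a name. The cost is visible in the model: the two flag groups occupy two bytes even though their four bits would fit in one.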
2023-03-20net: skbuff: rename __pkt_vlan_present_offset to __mono_tc_offsetJakub Kicinski1-2/+2
vlan_present is gone since commit 354259fa73e2 ("net: remove skb->vlan_present") rename the offset field to what BPF is currently looking for in this byte - mono_delivery_time and tc_at_ingress. Signed-off-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-03-20net: pcs: add driver for MediaTek SGMII PCSDaniel Golle1-0/+13
The SGMII core found in several MediaTek SoCs is identical to what can also be found in MediaTek's MT7531 Ethernet switch IC. As this has not always been clear, both drivers developed different implementations to deal with the PCS. Recently Alexander Couzens pointed out this fact, which led to the development of this shared driver. Add a dedicated driver, mostly by copying the code now found in the Ethernet driver. The now redundant code will be removed by a follow-up commit. Suggested-by: Alexander Couzens <[email protected]> Suggested-by: Russell King (Oracle) <[email protected]> Signed-off-by: Daniel Golle <[email protected]> Tested-by: Frank Wunderlich <[email protected]> Reviewed-by: Russell King (Oracle) <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-20block/io_uring: pass in issue_flags for uring_cmd task_work handlingJens Axboe1-5/+6
io_uring_cmd_done() currently assumes that the uring_lock is held when invoked, and while it generally is, this is not guaranteed. Pass in the issue_flags associated with it, so that we have IO_URING_F_UNLOCKED available to be able to lock the CQ ring appropriately when completing events. Cc: [email protected] Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Signed-off-by: Jens Axboe <[email protected]>
2023-03-20clk: Add Sunplus SP7021 clock driverQin Jian1-0/+19
Add clock driver for Sunplus SP7021 SoC. Signed-off-by: Qin Jian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stephen Boyd <[email protected]>
2023-03-20blk-mq: remove hybrid pollingKeith Busch2-14/+0
io_uring provides the only way user space can poll completions, and that always sets BLK_POLL_NOSLEEP. This effectively makes hybrid polling dead code, so remove it and everything supporting it. Hybrid polling was effectively killed off with 9650b453a3d4b1, "block: ignore RWF_HIPRI hint for sync dio", but still potentially reachable through io_uring until d729cf9acb93119, "io_uring: don't sleep when polling for I/O", but hybrid polling probably should not have been reachable through that async interface from the beginning. Fixes: 9650b453a3d4 ("block: ignore RWF_HIPRI hint for sync dio") Fixes: d729cf9acb93 ("io_uring: don't sleep when polling for I/O") Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-03-20Merge tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds1-0/+1
Pull NFS client fixes from Anna Schumaker:
- Fix /proc/PID/io read_bytes accounting
- Fix setting NLM file_lock start and end during decoding testargs
- Fix timing for setting access cache timestamps
* tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Correct timing for assigning access cache timestamp
  lockd: set file_lock start and end when decoding nlm4 testargs
  NFS: Fix /proc/PID/io read_bytes for buffered reads
2023-03-20selinux: remove the runtime disable functionalityPaul Moore1-7/+0
After working with the larger SELinux-based distros for several years, we're finally at a place where we can disable the SELinux runtime disable functionality. The existing kernel deprecation notice explains the functionality and why we want to remove it: The selinuxfs "disable" node allows SELinux to be disabled at runtime prior to a policy being loaded into the kernel. If disabled via this mechanism, SELinux will remain disabled until the system is rebooted. The preferred method of disabling SELinux is via the "selinux=0" boot parameter, but the selinuxfs "disable" node was created to make it easier for systems with primitive bootloaders that did not allow for easy modification of the kernel command line. Unfortunately, allowing for SELinux to be disabled at runtime makes it difficult to secure the kernel's LSM hooks using the "__ro_after_init" feature. It is that last sentence, mentioning the '__ro_after_init' hardening, which is the real motivation for this change, and if you look at the diffstat you'll see that the impact of this patch reaches across all the different LSMs, helping prevent tampering at the LSM hook level. From a SELinux perspective, it is important to note that if you continue to disable SELinux via "/etc/selinux/config" it may appear that SELinux is disabled, but it is simply in an uninitialized state. If you load a policy with `load_policy -i`, you will see SELinux come alive just as if you had loaded the policy during early-boot. It is also worth noting that the "/sys/fs/selinux/disable" file is always writable now, regardless of the Kconfig settings, but writing to the file has no effect on the system, other than to display an error on the console if a non-zero/true value is written. Finally, in the several years where we have been working on deprecating this functionality, there has only been one instance of someone mentioning any user visible breakage. 
In this particular case it was an individual's kernel test system, and the workaround documented in the deprecation notice ("selinux=0" on the kernel command line) resolved the issue without problem. Acked-by: Casey Schaufler <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2023-03-20interconnect: drop racy registration APIJohan Hovold1-11/+0
Now that all interconnect drivers have been converted to the new provider registration API, the old racy interface can be removed. Reviewed-by: Konrad Dybcio <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Georgi Djakov <[email protected]>
2023-03-20net: phy: smsc: export functions for use by meson-gxl PHY driverHeiner Kallweit1-0/+6
The Amlogic Meson internal PHYs have the same register layout as certain SMSC PHYs (also for non-c22-standard registers). This seems to be more than just coincidence. Apparently they also need the same workaround for EDPD mode (energy detect power down). Therefore let's export SMSC PHY driver functionality for use by the meson-gxl PHY driver. Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Chris Healy <[email protected]> Signed-off-by: David S. Miller <[email protected]>