aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2024-04-03bpf: add special internal-only MOV instruction to resolve per-CPU addrsAndrii Nakryiko1-0/+20
Add a new BPF instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. We use a special BPF_MOV | BPF_ALU64 | BPF_X form with insn->off field set to BPF_ADDR_PERCPU = -1. I used negative offset value to distinguish them from positive ones used by user-exposed instructions. Such instruction performs a resolution of a per-CPU offset stored in a register to a valid kernel address which can be dereferenced. It is useful in any use case where absolute address of a per-CPU data has to be resolved (e.g., in inlining bpf_map_lookup_elem()). BPF disassembler is also taught to recognize them to support dumping final BPF assembly code (non-JIT'ed version). Add arch-specific way for BPF JITs to mark support for this instructions. This patch also adds support for these instructions in x86-64 BPF JIT. Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-04-03firmware: cs_dsp: Add locked wrappers for coeff read and writeSimon Trimmer1-0/+4
It is a common pattern for functions to take and release the DSP pwr_lock over the cs_dsp calls to read and write firmware controls. Add wrapper functions to do this sequence so that the calling code can be simplified to a single function call.. Signed-off-by: Simon Trimmer <[email protected]> Signed-off-by: Richard Fitzgerald <[email protected]> Reviewed-by: Takashi Iwai <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-04-03PM: wakeup: Remove unnecessary else from device_init_wakeup()Dhruva Gole1-4/+3
Checkpatch warns that else is generally not necessary after a return condition which exists in the if part of this function. Hence, just to abide by what checkpatch recommends, follow it's guidelines. Signed-off-by: Dhruva Gole <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-03PM: wakeup: make device_wakeup_disable() return voidDhruva Gole1-3/+2
The device_wakeup_disable() call only returns an error if no dev exists, but there's not much a user can do at that point. Rather, make this function return void. Signed-off-by: Dhruva Gole <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-03tee: optee: Move pool_op helper functionsBalint Dobszay1-0/+10
Move the pool alloc and free helper functions from the OP-TEE driver to the TEE subsystem, since these could be reused in other TEE drivers. This patch is not supposed to change behavior, it's only reorganizing the code. Reviewed-by: Sumit Garg <[email protected]> Suggested-by: Jens Wiklander <[email protected]> Signed-off-by: Balint Dobszay <[email protected]> Signed-off-by: Jens Wiklander <[email protected]>
2024-04-03gpiolib: legacy: Remove unused gpio_request_array() and gpio_free_array()Andy Shevchenko1-15/+0
No more users. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Yanteng Si <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
2024-04-03netdevice: add DEFINE_FREE() for dev_putJohannes Berg1-0/+2
For short netdev holds within a function there are still a lot of users of dev_put() rather than netdev_put(). Add DEFINE_FREE() to allow making those safer. Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03rtnetlink: add guard for RTNLJohannes Berg1-0/+3
The new guard/scoped_gard can be useful for the RTNL as well, so add a guard definition for it. It gets used like { guard(rtnl)(); // RTNL held until end of block } or scoped_guard(rtnl) { // RTNL held in this block } as with any other guard/scoped_guard. Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03tee: Refactor TEE subsystem header filesSumit Garg2-254/+327
Since commit 25559c22cef8 ("tee: add kernel internal client interface"), it has been a common include/linux/tee_drv.h header file which is shared to hold TEE subsystem internal bits along with the APIs exposed to the TEE client drivers. However, this practice is prone to TEE subsystem internal APIs abuse and especially so with the new TEE implementation drivers being added to reuse existing functionality. In order to address this split TEE subsystem internal bits as a separate header file: include/linux/tee_core.h which should be the one used by TEE implementation drivers. With that include/linux/tee_drv.h lists only APIs exposed by TEE subsystem to the TEE client drivers. Signed-off-by: Sumit Garg <[email protected]> Signed-off-by: Balint Dobszay <[email protected]> Signed-off-by: Jens Wiklander <[email protected]>
2024-04-02page_pool: check for PP direct cache locality laterAlexander Lobakin1-6/+6
Since we have pool->p.napi (Jakub) and pool->cpuid (Lorenzo) to check whether it's safe to use direct recycling, we can use both globally for each page instead of relying solely on @allow_direct argument. Let's assume that @allow_direct means "I'm sure it's local, don't waste time rechecking this" and when it's false, try the mentioned params to still recycle the page directly. If neither is true, we'll lose some CPU cycles, but then it surely won't be hotpath. On the other hand, paths where it's possible to use direct cache, but not possible to safely set @allow_direct, will benefit from this move. The whole propagation of @napi_safe through a dozen of skb freeing functions can now go away, which saves us some stack space. Signed-off-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-02rhashtable: Improve grammarJonathan Neuschäfer1-5/+5
Change "a" to "an" according to the usual rules, fix an "if" that was mistyped as "in", improve grammar in "considerable slow" -> "considerably slower". Signed-off-by: Jonathan Neuschäfer <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-02io_uring/kbuf: get rid of lower BGID listsJens Axboe1-1/+0
Just rely on the xarray for any kind of bgid. This simplifies things, and it really doesn't bring us much, if anything. Cc: [email protected] # v6.4+ Signed-off-by: Jens Axboe <[email protected]>
2024-04-02counter: stm32-timer-cnt: add support for capture eventsFabrice Gasnier1-0/+13
Add support for capture events. Captured counter value for each channel can be retrieved through CCRx register. STM32 timers can have up to 4 capture channels (on input channel 1 to channel 4), hence need to check the number of channels before reading the capture data. The capture configuration is hard-coded to capture signals on both edges (non-inverted). Interrupts are used to report events independently for each channel. Reviewed-by: William Breathitt Gray <[email protected]> Acked-by: Lee Jones <[email protected]> Signed-off-by: Fabrice Gasnier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: William Breathitt Gray <[email protected]>
2024-04-02counter: Introduce the COUNTER_COMP_FREQUENCY() macroFabrice Gasnier1-0/+3
Now that there are two users for the "frequency" extension, introduce a new COUNTER_COMP_FREQUENCY() macro. This extension is intended to be a read-only signal attribute. Suggested-by: William Breathitt Gray <[email protected]> Signed-off-by: Fabrice Gasnier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: William Breathitt Gray <[email protected]>
2024-04-02counter: linux/counter.h: fix Excess kernel-doc description warningRandy Dunlap1-1/+0
Remove the @priv: line to prevent the kernel-doc warning: include/linux/counter.h:400: warning: Excess struct member 'priv' description in 'counter_device' Signed-off-by: Randy Dunlap <[email protected]> Fixes: f2ee4759fb70 ("counter: remove old and now unused registration API") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: William Breathitt Gray <[email protected]>
2024-04-02bpf: Improve program stats run-time calculationJose Fernandez1-2/+4
This patch improves the run-time calculation for program stats by capturing the duration as soon as possible after the program returns. Previously, the duration included u64_stats_t operations. While the instrumentation overhead is part of the total time spent when stats are enabled, distinguishing between the program's native execution time and the time spent due to instrumentation is crucial for accurate performance analysis. By making this change, the patch facilitates more precise optimization of BPF programs, enabling users to understand their performance in environments without stats enabled. I used a virtualized environment to measure the run-time over one minute for a basic raw_tracepoint/sys_enter program, which just increments a local counter. Although the virtualization introduced some performance degradation that could affect the results, I observed approximately a 16% decrease in average run-time reported by stats with this change (310 -> 260 nsec). Signed-off-by: Jose Fernandez <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-04-02mmc: sdio: store owner from modules with sdio_register_driver()Krzysztof Kozlowski1-1/+4
Modules registering driver with sdio_register_driver() might forget to set .owner field. The field is used by some of other kernel parts for reference counting (try_module_get()), so it is expected that drivers will set it. Solve the problem by moving this task away from the drivers to the core code, just like we did for platform_driver in commit 9447057eaff8 ("platform_device: use a macro instead of platform_driver_register"). Since many drivers forget to set the .owner, this effectively will fix them. Examples of fixed drivers are: ath6kl, b43, btsdio.c, ks7010, libertas, MediaTek WiFi drivers, Realtek WiFi drivers, rsi, siano, wilc1000, wl1251 and more. Signed-off-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Francesco Dolcini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]>
2024-04-02Merge drm/drm-next into drm-misc-nextThomas Zimmermann348-2914/+6981
Backmerging to get v6.9-rc2 changes into drm-misc-next. Signed-off-by: Thomas Zimmermann <[email protected]>
2024-04-01genetlink: remove linux/genetlink.hJakub Kicinski2-15/+1
genetlink.h is a shell of what used to be a combined uAPI and kernel header over a decade ago. It has fewer than 10 lines of code. Merge it into net/genetlink.h. In some ways it'd be better to keep the combined header under linux/ but it would make looking through git history harder. Acked-by: Sven Eckelmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-01netlink: create a new header for internal genetlink symbolsJakub Kicinski1-5/+0
There are things in linux/genetlink.h which are only used under net/netlink/. Move them to a new local header. A new header with just 2 externs isn't great, but alternative would be to include af_netlink.h in genetlink.c which feels even worse. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-02crypto: qat - add interface for live migrationXin Zeng1-0/+31
Extend the driver with a new interface to be used for VF live migration. This allows to create and destroy a qat_mig_dev object that contains a set of methods to allow to save and restore the state of QAT VF. This interface will be used by the qat-vfio-pci module. Signed-off-by: Xin Zeng <[email protected]> Reviewed-by: Giovanni Cabiddu <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-01block: add a bio_list_merge_init helperChristoph Hellwig1-0/+7
This is a simple combination of bio_list_merge + bio_list_init similar to list_splice_init. While it only saves a single line in a callers, it makes the move all bios from one list to another and reinitialize the original pattern a lot more obvious in the callers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Matthew Sakai <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-04-01Merge 6.9-rc2 into usb-nextGreg Kroah-Hartman8-12/+36
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-04-01bus: mhi: host: Add mhi_power_down_keep_dev() API to support system ↵Baochen Qiang1-1/+17
suspend/hibernation Currently, ath11k fails to resume from system suspend/hibernation on some the x86 host machines with below error message: ``` ath11k_pci 0000:06:00.0: timeout while waiting for restart complete ``` This happens because, ath11k powers down the MHI stack during suspend and that leads to destruction of the struct device associated with the MHI channels. And during resume, ath11k calls calling mhi_sync_power_up() to power up the MHI subsystem and that eventually calls the driver framework's device_add() API from mhi_create_devices(). But the PM framework blocks the struct device creation during device_add() and this leads to probe deferral as below: ``` mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral ``` The reason for deferring device creation during resume is explained in dpm_prepare(): /* * It is unsafe if probing of devices will happen during suspend or * hibernation and system behavior will be unpredictable in this * case. So, let's prohibit device's probing here and defer their * probes instead. The normal behavior will be restored in * dpm_complete(). */ Due to the device probe deferral, qcom_mhi_qrtr_probe() API is not getting called during resume and thus MHI channels are not prepared. So this blocks the QMI messages from being transferred between ath11k and firmware, resulting in a firmware initialization failure. After consulting with Rafael, it was decided to not destroy the struct device for the MHI channels during system suspend/hibernation because the device is bound to appear again during resume. So to achieve this, a new API called mhi_power_down_keep_dev() is introduced for MHI controllers to keep the struct device when required. This API is similar to the existing mhi_power_down() API, except that it keeps the struct device associated with MHI channels instead of destroying them. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30 Signed-off-by: Baochen Qiang <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Reviewed-by: Jeff Johnson <[email protected]> Link: https://lore.kernel.org/r/[email protected] [mani: reworded the commit message and subject] Signed-off-by: Manivannan Sadhasivam <[email protected]>
2024-04-01power: supply: bq27xxx: Move health reading out of update loopAndrew Davis1-1/+0
Most of the functions that read values return a status and put the value itself in an a function parameter. Update health reading to match. As health is not checked for changes as part of the update loop, remove the read of this from the periodic update loop. This saves I2C/1W bandwidth. It also means we do not have to cache it, fresh values are read when requested. Signed-off-by: Andrew Davis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-04-01power: supply: bq27xxx: Move cycle count reading out of update loopAndrew Davis1-1/+0
Most of the functions that read values return a status and put the value itself in an a function parameter. Update cycle count reading to match. As cycle count is not checked for changes as part of the update loop, remove the read of this from the periodic update loop. This saves I2C/1W bandwidth. It also means we do not have to cache it, fresh values are read when requested. Signed-off-by: Andrew Davis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-04-01power: supply: bq27xxx: Move energy reading out of update loopAndrew Davis1-1/+0
Most of the functions that read values return a status and put the value itself in an a function parameter. Update energy reading to match. As energy is not checked for changes as part of the update loop, remove the read of this from the periodic update loop. This saves I2C/1W bandwidth. It also means we do not have to cache it, fresh values are read when requested. Signed-off-by: Andrew Davis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-04-01power: supply: bq27xxx: Move charge reading out of update loopAndrew Davis1-1/+0
Most of the functions that read values return a status and put the value itself in an a function parameter. Update charge reading to match. As charge state is not checked for changes as part of the update loop, remove the read of this from the periodic update loop. This saves I2C/1W bandwidth. It also means we do not have to cache it, fresh values are read when requested. Signed-off-by: Andrew Davis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-04-01power: supply: bq27xxx: Move time reading out of update loopAndrew Davis1-3/+0
Most of the functions that read values return a status and put the value itself in an a function parameter. Update time reading to match. As time is not checked for changes as part of the update loop, remove the read of the this from the periodic update loop. This saves I2C/1W bandwidth. It also means we do not have to cache it, fresh values are read when requested. Signed-off-by: Andrew Davis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-04-01power: supply: bq27xxx: Move temperature reading out of update loopAndrew Davis1-1/+0
Most of the functions that read values return a status and put the value itself in an a function parameter. Update temperature reading to match. As temp is not checked for changes as part of the update loop, remove the read of the temperature from the periodic update loop. This saves I2C/1W bandwidth. It also means we do not have to cache it, fresh values are read when requested. Signed-off-by: Andrew Davis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-04-01net: rps: move received_rps field to a better locationEric Dumazet1-1/+1
Commit 14d898f3c1b3 ("dev: Move received_rps counter next to RPS members in softnet data") was unfortunate: received_rps is dirtied by a cpu and never read by other cpus in fast path. Its presence in the hot RPS cache line (shared by many cpus) is hurting RPS/RFS performance. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01net: rps: change input_queue_tail_incr_save()Eric Dumazet1-15/+0
input_queue_tail_incr_save() is incrementing the sd queue_tail and save it in the flow last_qtail. Two issues here : - no lock protects the write on last_qtail, we should use appropriate annotations. - We can perform this write after releasing the per-cpu backlog lock, to decrease this lock hold duration (move away the cache line miss) Also move input_queue_head_incr() and rps helpers to include/net/rps.h, while adding rps_ prefix to better reflect their role. v2: Fixed a build issue (Jakub and kernel build bots) Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01net: make softnet_data.dropped an atomic_tEric Dumazet1-1/+2
If under extreme cpu backlog pressure enqueue_to_backlog() has to drop a packet, it could do this without dirtying a cache line and potentially slowing down the target cpu. Move sd->dropped into a separate cache line, and make it atomic. In non pressure mode, this field is not touched, no need to consume valuable space in a hot cache line. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01net: move dev_xmit_recursion() helpers to net/core/dev.hEric Dumazet1-17/+0
Move dev_xmit_recursion() and friends to net/core/dev.h They are only used from net/core/dev.c and net/core/filter.c. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01net: move kick_defer_list_purge() to net/core/dev.hEric Dumazet1-1/+0
kick_defer_list_purge() is defined in net/core/dev.c and used from net/core/skubff.c Because we need softnet_data, include <linux/netdevice.h> from net/core/dev.h Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01ip_tunnel: use a separate struct to store tunnel params in the kernelAlexander Lobakin1-3/+4
Unlike IPv6 tunnels which use purely-kernel __ip6_tnl_parm structure to store params inside the kernel, IPv4 tunnel code uses the same ip_tunnel_parm which is being used to talk with the userspace. This makes it difficult to alter or add any fields or use a different format for whatever data. Define struct ip_tunnel_parm_kern, a 1:1 copy of ip_tunnel_parm for now, and use it throughout the code. Define the pieces, where the copy user <-> kernel happens, as standalone functions, and copy the data there field-by-field, so that the kernel-side structure could be easily modified later on and the users wouldn't have to care about this. Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}()Alexander Lobakin1-33/+5
Now that we have generic bitmap_read() and bitmap_write(), which are inline and try to take care of non-bound-crossing and aligned cases to keep them optimized, collapse bitmap_{get,set}_value8() into simple wrappers around the former ones. bloat-o-meter shows no difference in vmlinux and -2 bytes for gpio-pca953x.ko, which says the optimization didn't suffer due to that change. The converted helpers have the value width embedded and always compile-time constant and that helps a lot. Suggested-by: Yury Norov <[email protected]> Signed-off-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01bitmap: introduce generic optimized bitmap_size()Alexander Lobakin2-4/+6
The number of times yet another open coded `BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge. Some generic helper is long overdue. Add one, bitmap_size(), but with one detail. BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both divident and divisor are compile-time constants or when the divisor is not a pow-of-2. When it is however, the compilers sometimes tend to generate suboptimal code (GCC 13): 48 83 c0 3f add $0x3f,%rax 48 c1 e8 06 shr $0x6,%rax 48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx %BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does full division of `nbits + 63` by it and then multiplication by 8. Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC: 8d 50 3f lea 0x3f(%rax),%edx c1 ea 03 shr $0x3,%edx 81 e2 f8 ff ff 1f and $0x1ffffff8,%edx Now it shifts `nbits + 63` by 3 positions (IOW performs fast division by 8) and then masks bits[2:0]. bloat-o-meter: add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617) Clang does it better and generates the same code before/after starting from -O1, except that with the ALIGN() approach it uses %edx and thus still saves some bytes: add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520) Note that we can't expand DIV_ROUND_UP() by adding a check and using this approach there, as it's used in array declarations where expressions are not allowed. Add this helper to tools/ as well. Reviewed-by: Przemek Kitszel <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01linkmode: convert linkmode_{test,set,clear,mod}_bit() to macrosAlexander Lobakin1-23/+4
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops on compile-time constants"), the non-atomic bitops are macros which can be expanded by the compilers into compile-time expressions, which will result in better optimized object code. Unfortunately, turned out that passing `volatile` to those macros discards any possibility of optimization, as the compilers then don't even try to look whether the passed bitmap is known at compilation time. In addition to that, the mentioned linkmode helpers are marked with `inline`, not `__always_inline`, meaning that it's not guaranteed some compiler won't uninline them for no reason, which will also effectively prevent them from being optimized (it's a well-known thing the compilers sometimes uninline `2 + 2`). Convert linkmode_*_bit() from inlines to macros. Their calling convention are 1:1 with the corresponding bitops, so that it's not even needed to enumerate and map the arguments, only the names. No changes in vmlinux' object code (compiled by LLVM for x86_64) whatsoever, but that doesn't necessarily means the change is meaningless. Reviewed-by: Przemek Kitszel <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01bitops: let the compiler optimize {__,}assign_bit()Alexander Lobakin1-16/+4
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops on compile-time constants"), the compilers are able to expand inline bitmap operations to compile-time initializers when possible. However, during the round of replacement if-__set-else-__clear with __assign_bit() as per Andy's advice, bloat-o-meter showed +1024 bytes difference in object code size for one module (even one function), where the pattern: DECLARE_BITMAP(foo) = { }; // on the stack, zeroed if (a) __set_bit(const_bit_num, foo); if (b) __set_bit(another_const_bit_num, foo); ... is heavily used, although there should be no difference: the bitmap is zeroed, so the second half of __assign_bit() should be compiled-out as a no-op. I either missed the fact that __assign_bit() has bitmap pointer marked as `volatile` (as we usually do for bitops) or was hoping that the compilers would at least try to look past the `volatile` for __always_inline functions. Anyhow, due to that attribute, the compilers were always compiling the whole expression and no mentioned compile-time optimizations were working. Convert __assign_bit() to a macro since it's a very simple if-else and all of the checks are performed inside __set_bit() and __clear_bit(), thus that wrapper has to be as transparent as possible. After that change, despite it showing only -20 bytes change for vmlinux (due to that it's still relatively unpopular), no drastic code size changes happen when replacing if-set-else-clear for onstack bitmaps with __assign_bit(), meaning the compiler now expands them to the actual operations will all the expected optimizations. Atomic assign_bit() is less affected due to its nature, but let's convert it to a macro as well to keep the code consistent and not leave a place for possible suboptimal codegen. Moreover, with certain kernel configuration it actually gives some saves (x86): do_ip_setsockopt 4154 4099 -55 Suggested-by: Yury Norov <[email protected]> # assign_bit(), too Cc: Andy Shevchenko <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01bitops: make BYTES_TO_BITS() treewide-availableAlexander Lobakin1-0/+2
Avoid open-coding that simple expression each time by moving BYTES_TO_BITS() from the probes code to <linux/bitops.h> to export it to the rest of the kernel. Simplify the macro while at it. `BITS_PER_LONG / sizeof(long)` always equals to %BITS_PER_BYTE, regardless of the target architecture. Do the same for the tools ecosystem as well (incl. its version of bitops.h). The previous implementation had its implicit type of long, while the new one is int, so adjust the format literal accordingly in the perf code. Suggested-by: Andy Shevchenko <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01bitops: add missing prototype checkAlexander Lobakin1-0/+1
Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added a new bitop, test_bit_acquire(), with proper wrapping in order to try to optimize it at compile-time, but missed the list of bitops used for checking their prototypes a bit below. The functions added have consistent prototypes, so that no more changes are required and no functional changes take place. Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01lib/bitmap: add bitmap_{read,write}()Syed Nayyar Waris1-0/+77
The two new functions allow reading/writing values of length up to BITS_PER_LONG bits at arbitrary position in the bitmap. The code was taken from "bitops: Introduce the for_each_set_clump macro" by Syed Nayyar Waris with a number of changes and simplifications: - instead of using roundup(), which adds an unnecessary dependency on <linux/math.h>, we calculate space as BITS_PER_LONG-offset; - indentation is reduced by not using else-clauses (suggested by checkpatch for bitmap_get_value()); - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read() and bitmap_write(); - some redundant computations are omitted. Cc: Arnd Bergmann <[email protected]> Signed-off-by: Syed Nayyar Waris <[email protected]> Signed-off-by: William Breathitt Gray <[email protected]> Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/ Suggested-by: Yury Norov <[email protected]> Co-developed-by: Alexander Potapenko <[email protected]> Signed-off-by: Alexander Potapenko <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Acked-by: Yury Norov <[email protected]> Signed-off-by: Yury Norov <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-01timers: Fix kernel-doc format and add Return valuesRandy Dunlap1-2/+10
Fix kernel-doc format and warnings: timer.h:26: warning: Cannot understand * @TIMER_DEFERRABLE: A deferrable timer will work normally when the on line 26 - I thought it was a doc line timer.h:146: warning: No description found for return value of 'timer_pending' timer.h:180: warning: No description found for return value of 'del_timer_sync' timer.h:193: warning: No description found for return value of 'del_timer' Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-01time/timekeeping: Fix kernel-doc warnings and typosRandy Dunlap1-7/+42
Fix punctuation, spellos, and kernel-doc warnings: timekeeping.h:79: warning: No description found for return value of 'ktime_get_real' timekeeping.h:95: warning: No description found for return value of 'ktime_get_boottime' timekeeping.h:108: warning: No description found for return value of 'ktime_get_clocktai' timekeeping.h:149: warning: Function parameter or struct member 'mono' not described in 'ktime_mono_to_real' timekeeping.h:149: warning: No description found for return value of 'ktime_mono_to_real' timekeeping.h:255: warning: Function parameter or struct member 'cs_id' not described in 'system_time_snapshot' Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-01time/timecounter: Fix inline documentationRandy Dunlap1-2/+9
Fix kernel-doc warnings, text punctuation, and a kernel-doc marker (change '%' to '&' to indicate a struct): timecounter.h:72: warning: No description found for return value of 'cyclecounter_cyc2ns' timecounter.h:85: warning: Function parameter or member 'tc' not described in 'timecounter_adjtime' timecounter.h:111: warning: No description found for return value of 'timecounter_read' timecounter.h:128: warning: No description found for return value of 'timecounter_cyc2time' Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-31Merge tag 'irq_urgent_for_v6.9_rc2' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix an unused function warning on irqchip/irq-armada-370-xp - Fix the IRQ sharing with pinctrl-amd and ACPI OSL * tag 'irq_urgent_for_v6.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/armada-370-xp: Suppress unused-function warning genirq: Introduce IRQF_COND_ONESHOT and use it in pinctrl-amd
2024-03-31fpga: bridge: add owner module and take its refcountMarco Pagani1-3/+7
The current implementation of the fpga bridge assumes that the low-level module registers a driver for the parent device and uses its owner pointer to take the module's refcount. This approach is problematic since it can lead to a null pointer dereference while attempting to get the bridge if the parent device does not have a driver. To address this problem, add a module owner pointer to the fpga_bridge struct and use it to take the module's refcount. Modify the function for registering a bridge to take an additional owner module parameter and rename it to avoid conflicts. Use the old function name for a helper macro that automatically sets the module that registers the bridge as the owner. This ensures compatibility with existing low-level control modules and reduces the chances of registering a bridge without setting the owner. Also, update the documentation to keep it consistent with the new interface for registering an fpga bridge. Other changes: opportunistically move put_device() from __fpga_bridge_get() to fpga_bridge_get() and of_fpga_bridge_get() to improve code clarity since the bridge device is taken in these functions. Fixes: 21aeda950c5f ("fpga: add fpga bridge framework") Suggested-by: Greg Kroah-Hartman <[email protected]> Suggested-by: Xu Yilun <[email protected]> Reviewed-by: Russ Weight <[email protected]> Signed-off-by: Marco Pagani <[email protected]> Acked-by: Xu Yilun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xu Yilun <[email protected]>
2024-03-31fpga: manager: add owner module and take its refcountMarco Pagani1-6/+20
The current implementation of the fpga manager assumes that the low-level module registers a driver for the parent device and uses its owner pointer to take the module's refcount. This approach is problematic since it can lead to a null pointer dereference while attempting to get the manager if the parent device does not have a driver. To address this problem, add a module owner pointer to the fpga_manager struct and use it to take the module's refcount. Modify the functions for registering the manager to take an additional owner module parameter and rename them to avoid conflicts. Use the old function names for helper macros that automatically set the module that registers the manager as the owner. This ensures compatibility with existing low-level control modules and reduces the chances of registering a manager without setting the owner. Also, update the documentation to keep it consistent with the new interface for registering an fpga manager. Other changes: opportunistically move put_device() from __fpga_mgr_get() to fpga_mgr_get() and of_fpga_mgr_get() to improve code clarity since the manager device is taken in these functions. Fixes: 654ba4cc0f3e ("fpga manager: ensure lifetime with of_fpga_mgr_get") Suggested-by: Greg Kroah-Hartman <[email protected]> Suggested-by: Xu Yilun <[email protected]> Signed-off-by: Marco Pagani <[email protected]> Acked-by: Xu Yilun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xu Yilun <[email protected]>
2024-03-30Merge tag 'scsi-fixes' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes and updates from James Bottomley: "Fully half this pull is updates to lpfc and qla2xxx which got committed just as the merge window opened. A sizeable fraction of the driver updates are simple bug fixes (and lock reworks for bug fixes in the case of lpfc), so rather than splitting the few actual enhancements out, we're just adding the drivers to the -rc1 pull. The enhancements for lpfc are log message removals, copyright updates and three patches redefining types. For qla2xxx it's just removing a debug message on module removal and the manufacturer detail update. The two major fixes are the sg teardown race and a core error leg problem with the procfs directory not being removed if we destroy a created host that never got to the running state. The rest are minor fixes and constifications" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (41 commits) scsi: bnx2fc: Remove spin_lock_bh while releasing resources after upload scsi: core: Fix unremoved procfs host directory regression scsi: mpi3mr: Avoid memcpy field-spanning write WARNING scsi: sd: Fix TCG OPAL unlock on system resume scsi: sg: Avoid sg device teardown race scsi: lpfc: Copyright updates for 14.4.0.1 patches scsi: lpfc: Update lpfc version to 14.4.0.1 scsi: lpfc: Define types in a union for generic void *context3 ptr scsi: lpfc: Define lpfc_dmabuf type for ctx_buf ptr scsi: lpfc: Define lpfc_nodelist type for ctx_ndlp ptr scsi: lpfc: Use a dedicated lock for ras_fwlog state scsi: lpfc: Release hbalock before calling lpfc_worker_wake_up() scsi: lpfc: Replace hbalock with ndlp lock in lpfc_nvme_unregister_port() scsi: lpfc: Update lpfc_ramp_down_queue_handler() logic scsi: lpfc: Remove IRQF_ONESHOT flag from threaded IRQ handling scsi: lpfc: Move NPIV's transport unregistration to after resource clean up scsi: lpfc: Remove unnecessary log message in queuecommand path scsi: qla2xxx: Update version to 10.02.09.200-k scsi: qla2xxx: Delay I/O Abort on PCI error scsi: qla2xxx: Change debug message during driver unload ...