aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-08-08net: fs_enet: Move struct fs_platform_info into fs_enet.hChristophe Leroy1-16/+0
struct fs_platform_info is only used in fs_enet ethernet driver since commit 3dd82a1ea724 ("[POWERPC] CPM: Always use new binding."). Stale prototypes using fs_platform_info were left over in arch/powerpc/sysdev/fsl_soc.c but they are now removed by previous patch. Move struct fs_platform_info into fs_enet.h Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/f882d6b0b7075d0d8435310634ceaa2cc8e9938f.1691155347.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-08net: fs_enet: Remove has_phy field in fs_platform_info structChristophe Leroy1-1/+0
Since commit 3dd82a1ea724 ("[POWERPC] CPM: Always use new binding.") has_phy field is never set. Remove dead code and remove the field. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/bb5264e09e18f0ce8a0dbee399926a59f33cb248.1691155346.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-08net: fs_enet: Remove unused fields in fs_platform_info structChristophe Leroy1-19/+0
Since commit 3dd82a1ea724 ("[POWERPC] CPM: Always use new binding.") many fields of fs_platform_info structure are not used anymore. Remove them. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/2e584fcd75e21a0f7e7d5f942eebdc067b2f82f9.1691155346.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-08net: fs_enet: Remove fs_get_id()Christophe Leroy1-11/+0
fs_get_id() hasn't been used since commit b219108cbace ("fs_enet: Remove !CONFIG_PPC_CPM_NEW_BINDING code") Remove it. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/7a53b88cc40302fcbea59554f5e7067e3594ad53.1691155346.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-08block: get rid of unused plug->nowait flagJens Axboe1-1/+0
This was introduced to add a plug based way of signaling nowait issues, but we have since moved on from that. Kill the old dead code, nobody is setting it anymore. Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2023-08-08ublk: enable zoned storage supportAndreas Hindborg1-9/+54
Add zoned storage support to ublk: report_zones and operations: - REQ_OP_ZONE_OPEN - REQ_OP_ZONE_CLOSE - REQ_OP_ZONE_FINISH - REQ_OP_ZONE_RESET - REQ_OP_ZONE_APPEND The zone append feature uses the `addr` field of `struct ublksrv_io_cmd` to communicate ALBA back to the kernel. Therefore ublk must be used with the user copy feature (UBLK_F_USER_COPY) for zoned storage support to be available. Without this feature, ublk will not allow zoned storage support. Signed-off-by: Andreas Hindborg <[email protected]> Reviewed-by: Ming Lei <[email protected]> Tested-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-08-08lsm: constify the 'target' parameter in security_capget()Khadija Kamran2-4/+5
Three LSMs register the implementations for the "capget" hook: AppArmor, SELinux, and the normal capability code. Looking at the function implementations we may observe that the first parameter "target" is not changing. Mark the first argument "target" of LSM hook security_capget() as "const" since it will not be changing in the LSM hook. cap_capget() LSM hook declaration exceeds the 80 characters per line limit. Split the function declaration to multiple lines to decrease the line length. Signed-off-by: Khadija Kamran <[email protected]> Acked-by: John Johansen <[email protected]> [PM: align the cap_capget() declaration, spelling fixes] Signed-off-by: Paul Moore <[email protected]>
2023-08-08kunit: Allow kunit test modules to use test filteringJanusz Krzysztofik1-0/+11
External tools, e.g., Intel GPU tools (IGT), support execution of individual selftests provided by kernel modules. That could be also applicable to kunit test modules if they provided test filtering. But test filtering is now possible only when kunit code is built into the kernel. Moreover, a filter can be specified only at boot time, then reboot is required each time a different filter is needed. Build the test filtering code also when kunit is configured as a module, expose test filtering functions to other kunit source files, and use them in kunit module notifier callback functions. Userspace can then reload the kunit module with a value of the filter_glob parameter tuned to a specific kunit test module every time it wants to limit the scope of tests executed on that module load. Make the kunit.filter* parameters visible in sysfs for user convenience. v5: Refresh on tpp of attributes filtering fix v4: Refresh on top of newly applied attributes patches and changes introdced by new versions of other patches submitted in series with this one. v3: Fix CONFIG_GLOB, required by filtering functions, not selected when building as a module ([email protected]). v2: Fix new name of a structure moved to kunit namespace not updated across all uses ([email protected]). Signed-off-by: Janusz Krzysztofik <[email protected]> Reviewed-by: David Gow <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2023-08-08kunit: Make 'list' action available to kunit test modulesJanusz Krzysztofik1-0/+2
Results from kunit tests reported via dmesg may be interleaved with other kernel messages. When parsing dmesg for modular kunit results in real time, external tools, e.g., Intel GPU tools (IGT), may want to insert their own test name markers into dmesg at the start of each test, before any kernel message related to that test appears there, so existing upper level test result parsers have no doubt which test to blame for a specific kernel message. Unfortunately, kunit reports names of tests only at their completion (with the exeption of a not standarized "# Subtest: <name>" header above a test plan of each test suite or parametrized test). External tools could be able to insert their own "start of the test" markers with test names included if they new those names in advance. Test names could be learned from a list if provided by a kunit test module. There exists a feature of listing kunit tests without actually executing them, but it is now limited to configurations with the kunit module built in and covers only built-in tests, already available at boot time. Moreover, switching from list to normal mode requires reboot. If that feature was also available when kunit is built as a module, userspace could load the module with action=list parameter, load some kunit test modules they are interested in and learn about the list of tests provided by those modules, then unload them, reload the kunit module in normal mode and execute the tests with their lists already known. Extend kunit module notifier initialization callback with a processing path for only listing the tests provided by a module if the kunit action parameter is set to "list" or "list_attr". For user convenience, make the kunit.action parameter visible in sysfs. v2: Don't use a different format, use kunit_exec_list_tests() (Rae), - refresh on top of new attributes patches, handle newly introduced kunit.action=list_attr case (Rae). Signed-off-by: Janusz Krzysztofik <[email protected]> Cc: Rae Moar <[email protected]> Reviewed-by: David Gow <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2023-08-08kunit: Report the count of test suites in a moduleJanusz Krzysztofik1-0/+8
According to KTAP specification[1], results should always start from a header that provides a TAP protocol version, followed by a test plan with a count of items to be executed. That pattern should be followed at each nesting level. In the current implementation of the top-most, i.e., test suite level, those rules apply only for test suites built into the kernel, executed and reported on boot. Results submitted to dmesg from kunit test modules loaded later are missing those top-level headers. As a consequence, if a kunit test module provides more than one test suite then, without the top level test plan, external tools that are parsing dmesg for kunit test output are not able to tell how many test suites should be expected and whether to continue parsing after complete output from the first test suite is collected. Submit the top-level headers also from the kunit test module notifier initialization callback. v3: Fix new name of a structure moved to kunit namespace not updated in executor_test functions ([email protected]). v2: Use kunit_exec_run_tests() (Mauro, Rae), but prevent it from emitting the headers when called on load of non-test modules. [1] https://docs.kernel.org/dev-tools/ktap.html# Signed-off-by: Janusz Krzysztofik <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Rae Moar <[email protected]> Reviewed-by: Rae Moar <[email protected]> Reviewed-by: David Gow <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2023-08-08dt-bindings: clk: axg-audio-clkc: expose all clock idsNeil Armstrong1-0/+65
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every axg-audio-clkc ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-14-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: amlogic,a1-pll-clkc: expose all clock idsNeil Armstrong1-0/+5
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every A1 pll ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Reviewed-by: Dmitry Rokosov <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-13-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: amlogic,a1-peripherals-clkc: expose all clock idsNeil Armstrong1-0/+53
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every A1 peripherals ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Reviewed-by: Dmitry Rokosov <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-12-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: meson8b-clkc: expose all clock idsNeil Armstrong1-0/+97
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every meson8b-clkc ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-11-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: g12a-aoclkc: expose all clock idsNeil Armstrong1-0/+7
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every g12a-aoclkc ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-10-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: g12a-clks: expose all clock idsNeil Armstrong1-0/+130
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every g12a-clkc ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-9-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: axg-clkc: expose all clock idsNeil Armstrong1-0/+48
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every axg-clkc ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-8-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08dt-bindings: clk: gxbb-clkc: expose all clock idsNeil Armstrong1-0/+65
Due to a policy change in clock ID bindings handling, expose all the "private" clock IDs to the public clock dt-bindings to move out of the previous maintenance scheme. This refers to a discussion at [1] & [2] with Krzysztof about the issue with the current maintenance. It was decided to move every gxbb-clkc ID to the public clock dt-bindings headers to be merged in a single tree so we can safely add new clocks without having merge issues. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20230607-topic-amlogic-upstream-clkid-public-migration-v2-7-38172d17c27a@linaro.org Signed-off-by: Jerome Brunet <[email protected]>
2023-08-08virtio: Remove PM #ifdef guards to fix i2c driverArnd Bergmann1-2/+0
A cleanup in the virtio i2c caused a build failure: drivers/i2c/busses/i2c-virtio.c:270:10: error: 'struct virtio_driver' has no member named 'freeze' drivers/i2c/busses/i2c-virtio.c:271:10: error: 'struct virtio_driver' has no member named 'restore' Change the structure definition to allow this cleanup to be applied everywhere. Fixes: 73d546c76235b ("i2c: virtio: Remove #ifdef guards for PM related functions") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Paul Cercueil <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Andi Shyti <[email protected]>
2023-08-08ALSA: info: Remove unused function declarationsYue Haibing1-2/+0
These declarations is never used since beginning of git history. Signed-off-by: Yue Haibing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-08-08ASoC: SOF: Intel: add LunarLake supportMark Brown4-19/+120
Merge series from Pierre-Louis Bossart <[email protected]>: This patchset first fixes a number of errors made in the hda-mlink support, then adds Lunar Lake definitions. The main contribution is the hda-dai changes where the HDaudio DMA is now used for SSP, DMIC and SoundWire. In previous hardware the GPDMA (aka DesignWare) was used and controlled by the audio firmware. The volume of code is minimized with the abstraction added in previous kernel cycles. Due to cross-dependencies between ASoC and SoundWire trees, the full support for jack detection will be deferred to the next kernel cycle. There's not much point to ask for a sync of the two trees to support one patch for each tree - we are at -rc5 already.
2023-08-08dt-bindings: clock: add Intel Agilex5 clock managerNiravkumar L Rabara1-0/+100
Add clock ID definitions for Intel Agilex5 SoCFPGA. The registers in Agilex5 handling the clock is named as clock manager. Signed-off-by: Teh Wen Ping <[email protected]> Reviewed-by: Dinh Nguyen <[email protected]> Reviewed-by: Conor Dooley <[email protected]> Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Dinh Nguyen <[email protected]>
2023-08-08dt-bindings: reset: add reset IDs for Agilex5Niravkumar L Rabara1-1/+4
Add reset ID definitions required for Intel Agilex5 SoCFPGA, re-use altr,rst-mgr-s10.h as common header file similar S10 & Agilex. Acked-by: Conor Dooley <[email protected]> Reviewed-by: Dinh Nguyen <[email protected]> Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Dinh Nguyen <[email protected]>
2023-08-08netfilter: h323: Remove unused function declarationsYue Haibing1-4/+0
Commit f587de0e2feb ("[NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port") declared but never implemented these. Signed-off-by: Yue Haibing <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-08netfilter: conntrack: Remove unused function declarationsYue Haibing3-7/+0
Commit 1015c3de23ee ("netfilter: conntrack: remove extension register api") leave nf_conntrack_acct_fini() and nf_conntrack_labels_init() unused, remove it. And commit a0ae2562c6c4 ("netfilter: conntrack: remove l3proto abstraction") leave behind nf_ct_l3proto_try_module_get() and nf_ct_l3proto_module_put(). Signed-off-by: Yue Haibing <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-08netfilter: helper: Remove unused function declarationsYue Haibing1-3/+0
Commit b118509076b3 ("netfilter: remove nf_conntrack_helper sysctl and modparam toggles") leave these unused declarations. Signed-off-by: Yue Haibing <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-08netfilter: gre: Remove unused function declaration nf_ct_gre_keymap_flush()Yue Haibing1-1/+0
Commit a23f89a99906 ("netfilter: conntrack: nf_ct_gre_keymap_flush() removal") leave this unused, remove it. Signed-off-by: Yue Haibing <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-08wifi: cfg80211: fix sband iftype data lookup for AP_VLANFelix Fietkau1-0/+3
AP_VLAN interfaces are virtual, so doesn't really exist as a type for capabilities. When passed in as a type, AP is the one that's really intended. Fixes: c4cbaf7973a7 ("cfg80211: Add support for HE") Signed-off-by: Felix Fietkau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2023-08-08Documentation: core-api/cpuhotplug: Fix state namesAnna-Maria Behnsen1-1/+1
Dynamic allocated hotplug states in documentation and the comment above cpuhp_state enum do not match the code. To not get confused by wrong documentation, change to proper state names. Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-08-07workqueue: Make default affinity_scope dynamically updatableTejun Heo1-2/+1
While workqueue.default_affinity_scope is writable, it only affects workqueues which are created afterwards and isn't very useful. Instead, let's introduce explicit "default" scope and update the effective scope dynamically when workqueue.default_affinity_scope is changed. Signed-off-by: Tejun Heo <[email protected]>
2023-08-07workqueue: Implement non-strict affinity scope for unbound workqueuesTejun Heo1-0/+11
An unbound workqueue can be served by multiple worker_pools to improve locality. The segmentation is achieved by grouping CPUs into pods. By default, the cache boundaries according to cpus_share_cache() define the CPUs are grouped. Let's a workqueue is allowed to run on all CPUs and the system has two L3 caches. The workqueue would be mapped to two worker_pools each serving one L3 cache domains. While this improves locality, because the pod boundaries are strict, it limits the total bandwidth a given issuer can consume. For example, let's say there is a thread pinned to a CPU issuing enough work items to saturate the whole machine. With the machine segmented into two pods, no matter how many work items it issues, it can only use half of the CPUs on the system. While this limitation has existed for a very long time, it wasn't very pronounced because the affinity grouping used to be always by NUMA nodes. With cache boundaries as the default and support for even finer grained scopes (smt and cpu), it is now an a lot more pressing problem. This patch implements non-strict affinity scope where the pod boundaries aren't enforced strictly. Going back to the previous example, the workqueue would still be mapped to two worker_pools; however, the affinity enforcement would be soft. The workers in both pools would have their cpus_allowed set to the whole machine thus allowing the scheduler to migrate them anywhere on the machine. However, whenever an idle worker is woken up, the workqueue code asks the scheduler to bring back the task within the pod if the worker is outside. ie. work items start executing within its affinity scope but can be migrated outside as the scheduler sees fit. This removes the hard cap on utilization while maintaining the benefits of affinity scopes. After the earlier ->__pod_cpumask changes, the implementation is pretty simple. When non-strict which is the new default: * pool_allowed_cpus() returns @pool->attrs->cpumask instead of ->__pod_cpumask so that the workers are allowed to run on any CPU that the associated workqueues allow. * If the idle worker task's ->wake_cpu is outside the pod, kick_pool() sets the field to a CPU within the pod. This would be the first use of task_struct->wake_cpu outside scheduler proper, so it isn't clear whether this would be acceptable. However, other methods of migrating tasks are significantly more expensive and are likely prohibitively so if we want to do this on every work item. This needs discussion with scheduler folks. There is also a race window where setting ->wake_cpu wouldn't be effective as the target task is still on CPU. However, the window is pretty small and this being a best-effort optimization, it doesn't seem to warrant more complexity at the moment. While the non-strict cache affinity scopes seem to be the best option, the performance picture interacts with the affinity scope and is a bit complicated to fully discuss in this patch, so the behavior is made easily selectable through wqattrs and sysfs and the next patch will add documentation to discuss performance implications. v2: pool->attrs->affn_strict is set to true for per-cpu worker_pools. Signed-off-by: Tejun Heo <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Linus Torvalds <[email protected]>
2023-08-07workqueue: Add workqueue_attrs->__pod_cpumaskTejun Heo1-0/+16
workqueue_attrs has two uses: * to specify the required unouned workqueue properties by users * to match worker_pool's properties to workqueues by core code For example, if the user wants to restrict a workqueue to run only CPUs 0 and 2, and the two CPUs are on different affinity scopes, the workqueue's attrs->cpumask would contains CPUs 0 and 2, and the workqueue would be associated with two worker_pools, one with attrs->cpumask containing just CPU 0 and the other CPU 2. Workqueue wants to support non-strict affinity scopes where work items are started in their matching affinity scopes but the scheduler is free to migrate them outside the starting scopes, which can enable utilizing the whole machine while maintaining most of the locality benefits from affinity scopes. To enable that, worker_pools need to distinguish the strict affinity that it has to follow (because that's the restriction coming from the user) and the soft affinity that it wants to apply when dispatching work items. Note that two worker_pools with different soft dispatching requirements have to be separate; otherwise, for example, we'd be ping-ponging worker threads across NUMA boundaries constantly. This patch adds workqueue_attrs->__pod_cpumask. The new field is double underscored as it's only used internally to distinguish worker_pools. A worker_pool's ->cpumask is now always the same as the online subset of allowed CPUs of the associated workqueues, and ->__pod_cpumask is the pod's subset of that ->cpumask. Going back to the example above, both worker_pools would have ->cpumask containing both CPUs 0 and 2 but one's ->__pod_cpumask would contain 0 while the other's 2. * pool_allowed_cpus() is added. It returns the worker_pool's strict cpumask that the pool's workers must stay within. This is currently always ->__pod_cpumask as all boundaries are still strict. * As a workqueue_attrs can now track both the associated workqueues' cpumask and its per-pod subset, wq_calc_pod_cpumask() no longer needs an external out-argument. Drop @cpumask and instead store the result in ->__pod_cpumask. * The above also simplifies apply_wqattrs_prepare() as the same workqueue_attrs can be used to create all pods associated with a workqueue. tmp_attrs is dropped. * wq_update_pod() is updated to use wqattrs_equal() to test whether a pwq update is needed instead of only comparing ->cpumask so that ->__pod_cpumask is compared too. It can directly compare ->__pod_cpumaks but the code is easier to understand and more robust this way. The only user-visible behavior change is that two workqueues with different cpumasks no longer can share worker_pools even when their pod subsets coincide. Going back to the example, let's say there's another workqueue with cpumask 0, 2, 3, where 2 and 3 are in the same pod. It would be mapped to two worker_pools - one with CPU 0, the other with 2 and 3. The former has the same cpumask as the first pod of the earlier example and would have shared the same worker_pool but that's no longer the case after this patch. The worker_pools would have the same ->__pod_cpumask but their ->cpumask's wouldn't match. While this is necessary to support non-strict affinity scopes, there can be further optimizations to maintain sharing among strict affinity scopes. However, non-strict affinity scopes are going to be preferable for most use cases and we don't see very diverse mixture of unbound workqueue cpumasks anyway, so the additional overhead doesn't seem to justify the extra complexity. v2: - wq_update_pod() was incorrectly comparing target_attrs->__pod_cpumask to pool->attrs->cpumask instead of its ->__pod_cpumask. Fix it by using wqattrs_equal() for comparison instead. - Per-cpu worker pools weren't initializing ->__pod_cpumask which caused a subtle problem later on. Set it to cpumask_of(cpu) like ->cpumask. Signed-off-by: Tejun Heo <[email protected]>
2023-08-07workqueue: Add multiple affinity scopes and interface to select themTejun Heo1-1/+4
Add three more affinity scopes - WQ_AFFN_CPU, SMT and CACHE - and make CACHE the default. The code changes to actually add the additional scopes are trivial. Also add module parameter "workqueue.default_affinity_scope" to override the default scope and "affinity_scope" sysfs file to configure it per workqueue. wq_dump.py and documentations are updated accordingly. This enables significant flexibility in configuring how unbound workqueues behave. If affinity scope is set to "cpu", it'll behave close to a per-cpu workqueue. On the other hand, "system" removes all locality boundaries. Many modern machines have multiple L3 caches often while being mostly uniform in terms of memory access. Thus, workqueue's previous behavior of spreading work items in each NUMA node had negative performance implications from unncessarily crossing L3 boundaries between issue and execution. However, picking a finer grained affinity scope also has a downside in that an issuer in one group can't utilize CPUs in other groups. While dependent on the specifics of workload, there's usually a noticeable penalty in crossing L3 boundaries, so let's default to CACHE. This issue will be further addressed and documented with examples in future patches. Signed-off-by: Tejun Heo <[email protected]>
2023-08-07workqueue: Generalize unbound CPU podsTejun Heo1-4/+27
While renamed to pod, the code still assumes that the pods are defined by NUMA boundaries. Let's generalize it: * workqueue_attrs->affn_scope is added. Each enum represents the type of boundaries that define the pods. There are currently two scopes - WQ_AFFN_NUMA and WQ_AFFN_SYSTEM. The former is the same behavior as before - one pod per NUMA node. The latter defines one global pod across the whole system. * struct wq_pod_type is added which describes how pods are configured for each affnity scope. For each pod, it lists the member CPUs and the preferred NUMA node for memory allocations. The reverse mapping from CPU to pod is also available. * wq_pod_enabled is dropped. Pod is now always enabled. The previously disabled behavior is now implemented through WQ_AFFN_SYSTEM. * get_unbound_pool() wants to determine the NUMA node to allocate memory from for the new pool. The variables are renamed from node to pod but the logic still assumes they're one and the same. Clearly distinguish them - walk the WQ_AFFN_NUMA pods to find the matching pod and then use the pod's NUMA node. * wq_calc_pod_cpumask() was taking @pod but assumed that it was the NUMA node. Take @cpu instead and determine the cpumask to use from the pod_type matching @attrs. * apply_wqattrs_prepare() is update to return ERR_PTR() on error instead of NULL so that it can indicate -EINVAL on invalid affinity scopes. This patch allows CPUs to be grouped into pods however desired per type. While this patch causes some internal behavior changes, nothing material should change for workqueue users. v2: Trigger WARN_ON_ONCE() in wqattrs_pod_type() if affn_scope is WQ_AFFN_NR_TYPES which indicates that the function is called with a worker_pool's attrs instead of a workqueue's. Signed-off-by: Tejun Heo <[email protected]>
2023-08-07workqueue: Initialize unbound CPU pods later in the bootTejun Heo1-0/+1
During boot, to initialize unbound CPU pods, wq_pod_init() was called from workqueue_init(). This is early enough for NUMA nodes to be set up but before SMP is brought up and CPU topology information is populated. Workqueue is in the process of improving CPU locality for unbound workqueues and will need access to topology information during pod init. This adds a new init function workqueue_init_topology() which is called after CPU topology information is available and replaces wq_pod_init(). As unbound CPU pods are now initialized after workqueues are activated, we need to revisit the workqueues to apply the pod configuration. Workqueues which are created before workqueue_init_topology() are set up so that they always use the default worker pool. After pods are set up in workqueue_init_topology(), wq_update_pod() is called on all existing workqueues to update the pool associations accordingly. Note that wq_update_pod_attrs_buf allocation is moved to workqueue_init_early(). This isn't necessary right now but enables further generalization of pod handling in the future. This patch changes the initialization sequence but the end result should be the same. Signed-off-by: Tejun Heo <[email protected]>
2023-08-07workqueue: Rename workqueue_attrs->no_numa to ->orderedTejun Heo1-3/+3
With the recent removal of NUMA related module param and sysfs knob, workqueue_attrs->no_numa is now only used to implement ordered workqueues. Let's rename the field so that it's less confusing especially with the planned CPU affinity awareness improvements. Just a rename. No functional changes. Signed-off-by: Tejun Heo <[email protected]>
2023-08-07workqueue: Make unbound workqueues to use per-cpu pool_workqueuesTejun Heo1-6/+2
A pwq (pool_workqueue) represents an association between a workqueue and a worker_pool. When a work item is queued, the workqueue selects the pwq to use, which in turn determines the pool, and queues the work item to the pool through the pwq. pwq is also what implements the maximum concurrency limit - @max_active. As a per-cpu workqueue should be assocaited with a different worker_pool on each CPU, it always had per-cpu pwq's that are accessed through wq->cpu_pwq. However, unbound workqueues were sharing a pwq within each NUMA node by default. The sharing has several downsides: * Because @max_active is per-pwq, the meaning of @max_active changes depending on the machine configuration and whether workqueue NUMA locality support is enabled. * Makes per-cpu and unbound code deviate. * Gets in the way of making workqueue CPU locality awareness more flexible. This patch makes unbound workqueues use per-cpu pwq's the same way per-cpu workqueues do by making the following changes: * wq->numa_pwq_tbl[] is removed and unbound workqueues now use wq->cpu_pwq just like per-cpu workqueues. wq->cpu_pwq is now RCU protected for unbound workqueues. * numa_pwq_tbl_install() is renamed to install_unbound_pwq() and installs the specified pwq to the target CPU's wq->cpu_pwq. * apply_wqattrs_prepare() now always allocates a separate pwq for each CPU unless the workqueue is ordered. If ordered, all CPUs use wq->dfl_pwq. This makes the return value of wq_calc_node_cpumask() unnecessary. It now returns void. * @max_active now means the same thing for both per-cpu and unbound workqueues. WQ_UNBOUND_MAX_ACTIVE now equals WQ_MAX_ACTIVE and documentation is updated accordingly. WQ_UNBOUND_MAX_ACTIVE is no longer used in workqueue implementation and will be removed later. * All unbound pwq operations which used to be per-numa-node are now per-cpu. For most unbound workqueue users, this shouldn't cause noticeable changes. Work item issue and completion will be a small bit faster, flush_workqueue() would become a bit more expensive, and the total concurrency limit would likely become higher. All @max_active==1 use cases are currently being audited for conversion into alloc_ordered_workqueue() and they shouldn't be affected once the audit and conversion is complete. One area where the behavior change may be more noticeable is workqueue_congested() as the reported congestion state is now per CPU instead of NUMA node. There are only two users of this interface - drivers/infiniband/hw/hfi1 and net/smc. Maintainers of both subsystems are cc'd. Inputs on the behavior change would be very much appreciated. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Dennis Dalessandro <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Leon Romanovsky <[email protected]> Cc: Karsten Graul <[email protected]> Cc: Wenjia Zhang <[email protected]> Cc: Jan Karcher <[email protected]>
2023-08-07scsi: ufs: core: Export ufshcd_is_hba_active()Nitin Rawat1-0/+1
Export ufshcd_is_hba_active() to allow driver modules to check the state of the host controller. Signed-off-by: Nitin Rawat <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07bpf: Add support for bpf_get_func_ip helper for uprobe programJiri Olsa2-3/+13
Adding support for bpf_get_func_ip helper for uprobe program to return probed address for both uprobe and return uprobe. We discussed this in [1] and agreed that uprobe can have special use of bpf_get_func_ip helper that differs from kprobe. The kprobe bpf_get_func_ip returns: - address of the function if probe is attach on function entry for both kprobe and return kprobe - 0 if the probe is not attach on function entry The uprobe bpf_get_func_ip returns: - address of the probe for both uprobe and return uprobe The reason for this semantic change is that kernel can't really tell if the probe user space address is function entry. The uprobe program is actually kprobe type program attached as uprobe. One of the consequences of this design is that uprobes do not have its own set of helpers, but share them with kprobes. As we need different functionality for bpf_get_func_ip helper for uprobe, I'm adding the bool value to the bpf_trace_run_ctx, so the helper can detect that it's executed in uprobe context and call specific code. The is_uprobe bool is set as true in bpf_prog_run_array_sleepable, which is currently used only for executing bpf programs in uprobe. Renaming bpf_prog_run_array_sleepable to bpf_prog_run_array_uprobe to address that it's only used for uprobes and that it sets the run_ctx.is_uprobe as suggested by Yafang Shao. Suggested-by: Andrii Nakryiko <[email protected]> Tested-by: Alan Maguire <[email protected]> [1] https://lore.kernel.org/bpf/CAEf4BzZ=xLVkG5eurEuvLU79wAMtwho7ReR+XJAgwhFF4M-7Cg@mail.gmail.com/ Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Viktor Malik <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-08-07Merge tag 'x86_bugs_srso' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/srso fixes from Borislav Petkov: "Add a mitigation for the speculative RAS (Return Address Stack) overflow vulnerability on AMD processors. In short, this is yet another issue where userspace poisons a microarchitectural structure which can then be used to leak privileged information through a side channel" * tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/srso: Tie SBPB bit setting to microcode patch detection x86/srso: Add a forgotten NOENDBR annotation x86/srso: Fix return thunks in generated code x86/srso: Add IBPB on VMEXIT x86/srso: Add IBPB x86/srso: Add SRSO_NO support x86/srso: Add IBPB_BRTYPE support x86/srso: Add a Speculative RAS Overflow mitigation x86/bugs: Increase the x86 bugs vector size to two u32s
2023-08-07ASoC: SOF: Intel: hda-mlink: add helper to get sublink LSDIID registerPierre-Louis Bossart1-0/+4
We need to retrieve the current value to deal with the HDAudio WAKEEN/WAKESTS setup. Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Rander Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-08-07drm/amdgpu: add UAPI for allocating doorbell memoryAlex Deucher1-1/+6
This patch adds flags for a new gem domain AMDGPU_GEM_DOMAIN_DOORBELL in the UAPI layer. V2: Drop 'memory' from description (Christian) Cc: Alex Deucher <[email protected]> Cc: Christian Koenig <[email protected]> Reviewed-by: Christian Koenig <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Shashank Sharma <[email protected]>
2023-08-07page_pool: add a lockdep check for recycling in hardirqJakub Kicinski1-0/+7
Page pool use in hardirq is prohibited, add debug checks to catch misuses. IIRC we previously discussed using DEBUG_NET_WARN_ON_ONCE() for this, but there were concerns that people will have DEBUG_NET enabled in perf testing. I don't think anyone enables lockdep in perf testing, so use lockdep to avoid pushback and arguing :) Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-07page_pool: place frag_* fields in one cachelineAlexander Lobakin1-5/+5
On x86_64, frag_* fields of struct page_pool are scattered across two cachelines despite the summary size of 24 bytes. All three fields are used in pretty much the same places, but the last field, ::frag_users, is pushed out to the next CL, provoking unwanted false-sharing on hotpath (frags allocation code). There are some holes and cold members to move around. Move frag_* one block up, placing them right after &page_pool_params perfectly at the beginning of CL2. This doesn't do any meaningful to the second block, as those are some destroy-path cold structures, and doesn't do anything to ::alloc_stats, which still starts at 200-byte offset, 8 bytes after CL3 (still fitting into 1 cacheline). On my setup, this yields 1-2% of Mpps when using PP frags actively. When it comes to 32-bit architectures with 32-byte CL: &page_pool_params plus ::pad is 44 bytes, the block taken care of is 16 bytes within one CL, so there should be at least no regressions from the actual change. ::pages_state_hold_cnt is not related directly to that triple, but is paired currently with ::frags_offset and decoupling them would mean either two 4-byte holes or more invasive layout changes. Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-07net: skbuff: don't include <net/page_pool/types.h> to <linux/skbuff.h>Alexander Lobakin2-4/+3
Currently, touching <net/page_pool/types.h> triggers a rebuild of more than half of the kernel. That's because it's included in <linux/skbuff.h>. And each new include to page_pool/types.h adds more [useless] data for the toolchain to process per each source file from that pile. In commit 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling"), Matteo included it to be able to call a couple of functions defined there. Then, in commit 57f05bc2ab24 ("page_pool: keep pp info as long as page pool owns the page") one of the calls was removed, so only one was left. It's the call to page_pool_return_skb_page() in napi_frag_unref(). The function is external and doesn't have any dependencies. Having very niche page_pool_types.h included only for that looks like an overkill. As %PP_SIGNATURE is not local to page_pool.c (was only in the early submissions), nothing holds this function there. Teleport page_pool_return_skb_page() to skbuff.c, just next to the main consumer, skb_pp_recycle(), and rename it to napi_pp_put_page(), as it doesn't work with skbs at all and the former name tells nothing. The #if guards here are only to not compile and have it in the vmlinux when not needed -- both call sites are already guarded. Now, touching page_pool_types.h only triggers rebuilding of the drivers using it and a couple of core networking files. Suggested-by: Jakub Kicinski <[email protected]> # make skbuff.h less heavy Suggested-by: Alexander Duyck <[email protected]> # move to skbuff.c Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-07page_pool: split types and declarations from page_pool.hYunsheng Lin4-239/+245
Split types and pure function declarations from page_pool.h and add them in page_page/types.h, so that C sources can include page_pool.h and headers should generally only include page_pool/types.h as suggested by jakub. Rename page_pool.h to page_pool/helpers.h to have both in one place. Signed-off-by: Yunsheng Lin <[email protected]> Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Link: https://lore.kernel.org/r/[email protected] [Jakub: change microsoft/mana, fix kdoc paths in Documentation] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-07net: stmmac: correct MAC propagation delayJohannes Zink1-0/+1
The IEEE1588 Standard specifies that the timestamps of Packets must be captured when the PTP message timestamp point (leading edge of first octet after the start of frame delimiter) crosses the boundary between the node and the network. As the MAC latches the timestamp at an internal point, the captured timestamp must be corrected for the additional data transmission latency, as described in the publicly available datasheet [1]. This patch only corrects for the MAC-Internal delay, which can be read out from the MAC_Ingress_Timestamp_Latency register on DWMAC version 5, since the Phy framework currently does not support querying the Phy ingress and egress latency. The Closs Domain Crossing Circuits errors as indicated in [1] are already being accounted in the stmmac_get_tx_hwtstamp() function and are not corrected here. As the Latency varies for different link speeds and MII modes of operation, the correction value needs to be updated on each link state change. As the delay also causes a phase shift in the timestamp counter compared to the rest of the network, this correction will also reduce phase error when generating PPS outputs from the timestamp counter. Since the correction registers may be unavailable on some hardware and no feature bits are documented for dynamically detection of the MAC propagation delay readout, introduce a feature bit to explicitely enable MAC delay Correction in the gluecode driver. [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp correction" Signed-off-by: Johannes Zink <[email protected]> Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-1-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-07decompress: Use 8 byte alignmentArd Biesheuvel1-1/+1
The ZSTD decompressor requires malloc() allocations to be 8 byte aligned, so ensure that this the case. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-08-07cgroup/rstat: Record the cumulative per-cpu time of cgroup and its descendantsHao Jia1-0/+14
The member variable bstat of the structure cgroup_rstat_cpu records the per-cpu time of the cgroup itself, but does not include the per-cpu time of its descendants. The per-cpu time including descendants is very useful for calculating the per-cpu usage of cgroups. Although we can indirectly obtain the total per-cpu time of the cgroup and its descendants by accumulating the per-cpu bstat of each descendant of the cgroup. But after a child cgroup is removed, we will lose its bstat information. This will cause the cumulative value to be non-monotonic, thus affecting the accuracy of cgroup per-cpu usage. So we add the subtree_bstat variable to record the total per-cpu time of this cgroup and its descendants, which is similar to "cpuacct.usage*" in cgroup v1. And this is also helpful for the migration from cgroup v1 to cgroup v2. After adding this variable, we can obtain the per-cpu time of cgroup and its descendants in user mode through eBPF/drgn, etc. And we are still trying to determine how to expose it in the cgroupfs interface. Suggested-by: Tejun Heo <[email protected]> Signed-off-by: Hao Jia <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2023-08-07tpm: Disable RNG for all AMD fTPMsMario Limonciello1-0/+1
The TPM RNG functionality is not necessary for entropy when the CPU already supports the RDRAND instruction. The TPM RNG functionality was previously disabled on a subset of AMD fTPM series, but reports continue to show problems on some systems causing stutter root caused to TPM RNG functionality. Expand disabling TPM RNG use for all AMD fTPMs whether they have versions that claim to have fixed or not. To accomplish this, move the detection into part of the TPM CRB registration and add a flag indicating that the TPM should opt-out of registration to hwrng. Cc: [email protected] # 6.1.y+ Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources") Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs") Reported-by: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719 Reported-by: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212 Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>