Age | Commit message (Collapse) | Author | Files | Lines |
|
Set a flag when a cdrom_device_info is opened for writing, instead of
trying to figure out this at release time. This will allow to eventually
remove the mode argument to the ->release block_device_operation as
nothing but the CDROM drivers uses that argument.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Phillip Potter <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
cdrom_close_write is empty, and the for_data flag it is keyed off is
never set. Remove all this clutter.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Phillip Potter <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Phillip Potter <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Phillip Potter <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The fix for maple tree RCU locking on sync is a dependency for the
block sync code for the maple tree.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
mlx5-updates-2023-06-09
1) Embedded CPU Virtual Functions
2) Lightweight local SFs
Daniel Jurgens says:
====================
Embedded CPU Virtual Functions
This series enables the creation of virtual functions on Bluefield (the
embedded CPU platform). Embedded CPU virtual functions (EC VFs). EC VF
creation, deletion and management interfaces are the same as those for
virtual functions in a server with a Connect-X NIC.
When using EC VFs on the ARM the creation of virtual functions on the
host system is still supported. Host VFs eswitch vports occupy a range
of 1..max_vfs, the EC VF vport range is max_vfs+1..max_ec_vfs.
Every function (PF, ECPF, VF, EC VF, and subfunction) has a function ID
associated with it. Prior to this series the function ID and the eswitch
vport were the same. That is no longer the case, the EC VF function ID
range is 1..max_ec_vfs. When querying or setting the capabilities of an
EC VF function an new bit must be set in the query/set HCA cap
structure.
This is a high level overview of the changes made:
- Allocate vports for EC VFs if they are enabled.
- Create representors and devlink ports for the EC VF vports.
- When querying/setting HCA caps by vport break the assumption
that function ID is the same a vport number and adjust
accordingly.
- Create a new type of page, so that when SRIOV on the ARM is
disabled, but remains enabled on the host, the driver can
wait for the correct pages.
- Update SRIOV code to support EC VF creation/deletion.
===================
Lightweight local SFs:
Last 3 patches form Shay Drory:
SFs are heavy weight and by default they come with the full package of
ConnectX features. Usually users want specialized SFs for one specific
purpose and using devlink users will almost always override the set of
advertises features of an SF and reload it.
Shay Drory says:
================
In order to avoid the wasted time and resources on the reload, local SFs
will probe without any auxiliary sub-device, so that the SFs can be
configured prior to its full probe.
The defaults of the enable_* devlink params of these SFs are set to
false.
Usage example:
Create SF:
$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
$ devlink port function set pci/0000:08:00.0/32768 \
hw_addr 00:00:00:00:00:11 state active
Enable ETH auxiliary device:
$ devlink dev param set auxiliary/mlx5_core.sf.1 \
name enable_eth value true cmode driverinit
Now, in order to fully probe the SF, use devlink reload:
$ devlink dev reload auxiliary/mlx5_core.sf.1
At this point the user have SF devlink instance with auxiliary device
for the Ethernet functionality only.
================
|
|
Now all tcp_stream_alloc_skb() callers pass @size == 0, we can
remove this parameter.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
tcp_send_syn_data() is the last component in TCP transmit
path to put payload in skb->head.
Switch it to use page frags, so that we can remove dead
code later.
This allows to put more payload than previous implementation.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 4a19edb60d02 ("netlink: Pass extack to dump handlers")
added extack support to netlink dumps. It was focused on rtnl
and since rtnl does not use ->start(), ->done() callbacks
it ignored those. Genetlink on the other hand uses ->start()
extensively, for parsing and input validation.
Pass the extact in via struct netlink_dump_control and link
it to cb for the time of ->start(). Both struct netlink_dump_control
and extack itself live on the stack so we can't keep the same
extack for the duration of the dump. This means that the extack
visible in ->start() and each ->dump() callbacks will be different.
Corner cases like reporting a warning message in DONE across dump
calls are still not supported.
We could put the extack (for dumps) in the socket struct,
but layering makes it slightly awkward (extack pointer is decided
before the DO / DUMP split).
The genetlink dump error extacks are now surfaced:
$ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
lib.ynl.NlError: Netlink error: Invalid argument
nl_len = 64 (48) nl_flags = 0x300 nl_type = 2
error: -22 extack: {'msg': 'request header missing'}
Previously extack was missing:
$ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
lib.ynl.NlError: Netlink error: Invalid argument
nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
error: -22
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
This thing is direct analog of SO_PEERCRED which allows to get plain PID.
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Luca Boccassi <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Christian Brauner <[email protected]>
Acked-by: Stanislav Fomichev <[email protected]>
Tested-by: Luca Boccassi <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
but it contains pidfd instead of plain pid, which allows programmers not
to care about PID reuse problem.
We mask SO_PASSPIDFD feature if CONFIG_UNIX is not builtin because
it depends on a pidfd_prepare() API which is not exported to the kernel
modules.
Idea comes from UAPI kernel group:
https://uapi-group.org/kernel-features/
Big thanks to Christian Brauner and Lennart Poettering for productive
discussions about this.
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Luca Boccassi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Tested-by: Luca Boccassi <[email protected]>
Reviewed-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Since its introduction, the ovs module execute_hash action allowed
hash algorithms other than the skb->l4_hash to be used. However,
additional hash algorithms were not implemented. This means flows
requiring different hash distributions weren't able to use the
kernel datapath.
Now, introduce support for symmetric hashing algorithm as an
alternative hash supported by the ovs module using the flow
dissector.
Output of flow using l4_sym hash:
recirc_id(0),in_port(3),eth(),eth_type(0x0800),
ipv4(dst=64.0.0.0/192.0.0.0,proto=6,frag=no), packets:30473425,
bytes:45902883702, used:0.000s, flags:SP.,
actions:hash(sym_l4(0)),recirc(0xd)
Some performance testing with no GRO/GSO, two veths, single flow:
hash(l4(0)): 4.35 GBits/s
hash(l4_sym(0)): 4.24 GBits/s
Signed-off-by: Aaron Conole <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The taprio Qdisc creates child classes per netdev TX queue, but
taprio_dump_class_stats() currently reports offload statistics per
traffic class. Traffic classes are groups of TXQs sharing the same
dequeue priority, so this is incorrect and we shouldn't be bundling up
the TXQ stats when reporting them, as we currently do in enetc.
Modify the API from taprio to drivers such that they report TXQ offload
stats and not TC offload stats.
There is no change in the UAPI or in the global Qdisc stats.
Fixes: 6c1adb650c8d ("net/sched: taprio: add netlink reporting for offload statistics counters")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Delete duplicated word in comment.
Signed-off-by: Mao Zhu <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
When compiling YNL generated code compiler complains about
array-initializer-out-of-bounds. Turns out the MAX value
for STATS_GRP uses the value for STATS.
This may lead to random corruptions in user space (kernel
itself doesn't use this value as it never parses stats).
Fixes: f09ea6fb1272 ("ethtool: add a new command for reading standard stats")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Now that the preparation of an rq_vec has been removed from the
generic read path, nfsd_splice_read() no longer needs to reset
rq_next_page.
nfsd4_encode_read() calls nfsd_splice_read() directly. As far as I
can ascertain, resetting rq_next_page for NFSv4 splice reads is
unnecessary because rq_next_page is already set correctly.
Moreover, resetting it might even be incorrect if previous
operations in the COMPOUND have already consumed at least a page of
the send buffer. I would expect that the result would be encoding
the READ payload over previously-encoded results.
Signed-off-by: Chuck Lever <[email protected]>
|
|
The cited commit aimed to ensure that Virtual Functions (VFs) assign a
queue affinity to a Queue Pair (QP) to distribute traffic when
the LAG master creates a hardware LAG. If the affinity was set while
the hardware was not in LAG, the firmware would ignore the affinity value.
However, this commit unintentionally assigned an affinity to QPs on the LAG
master's VPORT even if the RDMA device was not marked as LAG-enabled.
In most cases, this was not an issue because when the hardware entered
hardware LAG configuration, the RDMA device of the LAG master would be
destroyed and a new one would be created, marked as LAG-enabled.
The problem arises when a user configures Equal-Cost Multipath (ECMP).
In ECMP mode, traffic can be directed to different physical ports based on
the queue affinity, which is intended for use by VPORTS other than the
E-Switch manager. ECMP mode is supported only if both E-Switch managers are
in switchdev mode and the appropriate route is configured via IP. In this
configuration, the RDMA device is not destroyed, and we retain the RDMA
device that is not marked as LAG-enabled.
To ensure correct behavior, Send Queues (SQs) opened by the E-Switch
manager through verbs should be assigned strict affinity. This means they
will only be able to communicate through the native physical port
associated with the E-Switch manager. This will prevent the firmware from
assigning affinity and will not allow the SQs to be remapped in case of
failover.
Fixes: 802dcc7fc5ec ("RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode")
Reviewed-by: Maor Gottlieb <[email protected]>
Signed-off-by: Mark Bloch <[email protected]>
Link: https://lore.kernel.org/r/425b05f4da840bc684b0f7e8ebf61aeb5cef09b0.1685960567.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Set static rate to 0 as it should be discovered by path query and
has no meaning for RoCE.
This also avoid of using the rtnl lock and ethtool API, which is
a bottleneck when try to setup many rdma-cm connections at the same
time, especially with multiple processes.
Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices")
Signed-off-by: Mark Zhang <[email protected]>
Link: https://lore.kernel.org/r/f72a4f8b667b803aee9fa794069f61afb5839ce4.1685960567.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Most of the changes this time are for the Qualcomm Snapdragon
platforms.
There are bug fixes for error handling in Qualcomm icc-bwmon,
rpmh-rsc, ramp_controller and rmtfs driver as well as the AMD tee
firmware driver and a missing initialization in the Arm ff-a firmware
driver. The Qualcomm RPMh and EDAC drivers need some rework to work
correctly on all supported chips.
The DT fixes include:
- i.MX8 fixes for gpio, pinmux and clock settings
- ADS touchscreen gpio polarity settings in several machines
- Address dtb warnings for caches, panel and input-enable properties
on Qualcomm platforms
- Incorrect data on qualcomm platforms fir SA8155P power domains,
SM8550 LLCC, SC7180-lite SDRAM frequencies and SM8550 soundwire
- Remoteproc firmware paths are corrected for Sony Xperia 10 IV"
* tag 'arm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (36 commits)
firmware: arm_ffa: Set handle field to zero in memory descriptor
ARM: dts: Fix erroneous ADS touchscreen polarities
arm64: dts: imx8mn-beacon: Fix SPI CS pinmux
arm64: dts: imx8-ss-dma: assign default clock rate for lpuarts
arm64: dts: imx8qm-mek: correct GPIOs for USDHC2 CD and WP signals
EDAC/qcom: Get rid of hardcoded register offsets
EDAC/qcom: Remove superfluous return variable assignment in qcom_llcc_core_setup()
arm64: dts: qcom: sm8550: Use the correct LLCC register scheme
dt-bindings: cache: qcom,llcc: Fix SM8550 description
arm64: dts: qcom: sc7180-lite: Fix SDRAM freq for misidentified sc7180-lite boards
arm64: dts: qcom: sm8550: use uint16 for Soundwire interval
soc: qcom: rpmhpd: Add SA8155P power domains
arm64: dts: qcom: Split out SA8155P and use correct RPMh power domains
dt-bindings: power: qcom,rpmpd: Add SA8155P
soc: qcom: Rename ice to qcom_ice to avoid module name conflict
soc: qcom: rmtfs: Fix error code in probe()
soc: qcom: ramp_controller: Fix an error handling path in qcom_ramp_controller_probe()
ARM: dts: at91: sama7g5ek: fix debounce delay property for shdwc
ARM: at91: pm: fix imbalanced reference counter for ethernet devices
arm64: dts: qcom: sm6375-pdx225: Fix remoteproc firmware paths
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
netfilter pull request 23-06-08
Pablo Neira Ayuso says:
====================
The following patchset contains Netfilter fixes for net:
1) Add commit and abort set operation to pipapo set abort path.
2) Bail out immediately in case of ENOMEM in nfnetlink batch.
3) Incorrect error path handling when creating a new rule leads to
dangling pointer in set transaction list.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Move declarations into include/net/gso.h and code into net/core/gso.c
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.5
The second pull request for v6.5. We have support for three new
Realtek chipsets, all from different generations. Shows how active
Realtek development is right now, even older generations are being
worked on.
Note: We merged wireless into wireless-next to avoid complex conflicts
between the trees.
Major changes:
rtl8xxxu
- RTL8192FU support
rtw89
- RTL8851BE support
rtw88
- RTL8723DS support
ath11k
- Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID
Advertisement (EMA) support in AP mode
iwlwifi
- support for segmented PNVM images and power tables
- new vendor entries for PPAG (platform antenna gain) feature
cfg80211/mac80211
- more Multi-Link Operation (MLO) support such as hardware restart
- fixes for a potential work/mutex deadlock and with it beginnings of
the previously discussed locking simplifications
* tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (162 commits)
wifi: rtlwifi: remove misused flag from HAL data
wifi: rtlwifi: remove unused dualmac control leftovers
wifi: rtlwifi: remove unused timer and related code
wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown
wifi: rsi: Do not configure WoWlan in shutdown hook if not enabled
wifi: brcmfmac: Detect corner error case earlier with log
wifi: rtw89: 8852c: update RF radio A/B parameters to R63
wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (3 of 3)
wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (2 of 3)
wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (1 of 3)
wifi: rtw89: process regulatory for 6 GHz power type
wifi: rtw89: regd: update regulatory map to R64-R40
wifi: rtw89: regd: judge 6 GHz according to chip and BIOS
wifi: rtw89: refine clearing supported bands to check 2/5 GHz first
wifi: rtw89: 8851b: configure CRASH_TRIGGER feature for 8851B
wifi: rtw89: set TX power without precondition during setting channel
wifi: rtw89: debug: txpwr table access only valid page according to chip
wifi: rtw89: 8851b: enable hw_scan support
wifi: cfg80211: move scan done work to wiphy work
wifi: cfg80211: move sched scan stop to wiphy work
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When the embedded cpu supports SRIOV it can be enabled and disabled
independently from the host SRIOV. Track the pages separately so we can
properly wait for returned VF pages.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: William Tu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
These functions are for query/set by vport, there was an underlying
assumption that vport was equal to function ID. That's not the case for
EC VF functions. Set the ec_vf_function bit accordingly.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: William Tu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Enable creation of a devlink port for EC VF vports.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: William Tu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Add ec_vf_vport_base to HCA Capabilities 2. This indicates the base vport
of embedded CPU virtual functions that are connected to the eswitch.
Add ec_vf_function to query/set_hca_caps. If set this indicates
accessing a virtual function on the embedded CPU by function ID. This
should only be used with other_function set to 1.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Bodong Wang <[email protected]>
Reviewed-by: William Tu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The sys_ni_posix_timers() definition causes a warning when the declaration
is missing, so this needs to be added along with the normal syscalls,
outside of the #ifdef.
kernel/time/posix-stubs.c:26:17: error: no previous prototype for 'sys_ni_posix_timers' [-Werror=missing-prototypes]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
mult_frac() evaluates _all_ arguments multiple times in the body.
Clarify comment while I'm at it.
Link: https://lkml.kernel.org/r/f9f9fdbb-ec8e-4f5e-a998-2a58627a1a43@p183
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
With the recent feature added to enable perf events to use pseudo NMIs as
interrupts on platforms which support GICv3 or later, its now been
possible to enable hard lockup detector (or NMI watchdog) on arm64
platforms. So enable corresponding support.
One thing to note here is that normally lockup detector is initialized
just after the early initcalls but PMU on arm64 comes up much later as
device_initcall(). To cope with that, override
arch_perf_nmi_is_available() to let the watchdog framework know PMU not
ready, and inform the framework to re-initialize lockup detection once PMU
has been initialized.
[[email protected]: only HAVE_HARDLOCKUP_DETECTOR_PERF if the PMU config is enabled]
Link: https://lkml.kernel.org/r/20230523073952.1.I60217a63acc35621e13f10be16c0cd7c363caf8c@changeid
Link: https://lkml.kernel.org/r/20230519101840.v5.18.Ia44852044cdcb074f387e80df6b45e892965d4a1@changeid
Co-developed-by: Sumit Garg <[email protected]>
Signed-off-by: Sumit Garg <[email protected]>
Co-developed-by: Pingfan Liu <[email protected]>
Signed-off-by: Pingfan Liu <[email protected]>
Signed-off-by: Lecopzer Chen <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When lockup_detector_init()->watchdog_hardlockup_probe(), PMU may be not
ready yet. E.g. on arm64, PMU is not ready until
device_initcall(armv8_pmu_driver_init). And it is deeply integrated with
the driver model and cpuhp. Hence it is hard to push this initialization
before smp_init().
But it is easy to take an opposite approach and try to initialize the
watchdog once again later. The delayed probe is called using workqueues.
It need to allocate memory and must be proceed in a normal context. The
delayed probe is able to use if watchdog_hardlockup_probe() returns
non-zero which means the return code returned when PMU is not ready yet.
Provide an API - lockup_detector_retry_init() for anyone who needs to
delayed init lockup detector if they had ever failed at
lockup_detector_init().
The original assumption is: nobody should use delayed probe after
lockup_detector_check() which has __init attribute. That is, anyone uses
this API must call between lockup_detector_init() and
lockup_detector_check(), and the caller must have __init attribute
Link: https://lkml.kernel.org/r/20230519101840.v5.16.If4ad5dd5d09fb1309cebf8bcead4b6a5a7758ca7@changeid
Reviewed-by: Petr Mladek <[email protected]>
Co-developed-by: Pingfan Liu <[email protected]>
Signed-off-by: Pingfan Liu <[email protected]>
Signed-off-by: Lecopzer Chen <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
Suggested-by: Petr Mladek <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
On arm64, NMI support needs to be detected at runtime. Add a weak
function to the perf hardlockup detector so that an architecture can
implement it to detect whether NMIs are available.
Link: https://lkml.kernel.org/r/20230519101840.v5.15.Ic55cb6f90ef5967d8aaa2b503a4e67c753f64d3a@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Implement a hardlockup detector that doesn't doesn't need any extra
arch-specific support code to detect lockups. Instead of using something
arch-specific we will use the buddy system, where each CPU watches out for
another one. Specifically, each CPU will use its softlockup hrtimer to
check that the next CPU is processing hrtimer interrupts by verifying that
a counter is increasing.
NOTE: unlike the other hard lockup detectors, the buddy one can't easily
show what's happening on the CPU that locked up just by doing a simple
backtrace. It relies on some other mechanism in the system to get
information about the locked up CPUs. This could be support for NMI
backtraces like [1], it could be a mechanism for printing the PC of locked
CPUs at panic time like [2] / [3], or it could be something else. Even
though that means we still rely on arch-specific code, this arch-specific
code seems to often be implemented even on architectures that don't have a
hardlockup detector.
This style of hardlockup detector originated in some downstream Android
trees and has been rebased on / carried in ChromeOS trees for quite a long
time for use on arm and arm64 boards. Historically on these boards we've
leveraged mechanism [2] / [3] to get information about hung CPUs, but we
could move to [1].
Although the original motivation for the buddy system was for use on
systems without an arch-specific hardlockup detector, it can still be
useful to use even on systems that _do_ have an arch-specific hardlockup
detector. On x86, for instance, there is a 24-part patch series [4] in
progress switching the arch-specific hard lockup detector from a scarce
perf counter to a less-scarce hardware resource. Potentially the buddy
system could be a simpler alternative to free up the perf counter but
still get hard lockup detection.
Overall, pros (+) and cons (-) of the buddy system compared to an
arch-specific hardlockup detector (which might be implemented using
perf):
+ The buddy system is usable on systems that don't have an
arch-specific hardlockup detector, like arm32 and arm64 (though it's
being worked on for arm64 [5]).
+ The buddy system may free up scarce hardware resources.
+ If a CPU totally goes out to lunch (can't process NMIs) the buddy
system could still detect the problem (though it would be unlikely
to be able to get a stack trace).
+ The buddy system uses the same timer function to pet the hardlockup
detector on the running CPU as it uses to detect hardlockups on
other CPUs. Compared to other hardlockup detectors, this means it
generates fewer interrupts and thus is likely better able to let
CPUs stay idle longer.
- If all CPUs are hard locked up at the same time the buddy system
can't detect it.
- If we don't have SMP we can't use the buddy system.
- The buddy system needs an arch-specific mechanism (possibly NMI
backtrace) to get info about the locked up CPU.
[1] https://lore.kernel.org/r/[email protected]
[2] https://issuetracker.google.com/172213129
[3] https://docs.kernel.org/trace/coresight/coresight-cpu-debug.html
[4] https://lore.kernel.org/lkml/[email protected]/
[5] https://lore.kernel.org/linux-arm-kernel/[email protected]/
Link: https://lkml.kernel.org/r/20230519101840.v5.14.I6bf789d21d0c3d75d382e7e51a804a7a51315f2c@changeid
Signed-off-by: Colin Cross <[email protected]>
Signed-off-by: Matthias Kaehlcke <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Tzung-Bi Shih <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The fact that there watchdog_hardlockup_enable(),
watchdog_hardlockup_disable(), and watchdog_hardlockup_probe() are
declared __weak means that the configured hardlockup detector can define
non-weak versions of those functions if it needs to. Instead of doing
this, the perf hardlockup detector hooked itself into the default __weak
implementation, which was a bit awkward. Clean this up.
From comments, it looks as if the original design was done because the
__weak function were expected to implemented by the architecture and not
by the configured hardlockup detector. This got awkward when we tried to
add the buddy lockup detector which was not arch-specific but wanted to
hook into those same functions.
This is not expected to have any functional impact.
Link: https://lkml.kernel.org/r/20230519101840.v5.13.I847d9ec852449350997ba00401d2462a9cb4302b@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Do a search and replace of:
- NMI_WATCHDOG_ENABLED => WATCHDOG_HARDLOCKUP_ENABLED
- SOFT_WATCHDOG_ENABLED => WATCHDOG_SOFTOCKUP_ENABLED
- watchdog_nmi_ => watchdog_hardlockup_
- nmi_watchdog_available => watchdog_hardlockup_available
- nmi_watchdog_user_enabled => watchdog_hardlockup_user_enabled
- soft_watchdog_user_enabled => watchdog_softlockup_user_enabled
- NMI_WATCHDOG_DEFAULT => WATCHDOG_HARDLOCKUP_DEFAULT
Then update a few comments near where names were changed.
This is specifically to make it less confusing when we want to introduce
the buddy hardlockup detector, which isn't using NMIs. As part of this,
we sanitized a few names for consistency.
[[email protected]: make variables static]
Link: https://lkml.kernel.org/r/20230525162822.1.I0fb41d138d158c9230573eaa37dc56afa2fb14ee@changeid
Link: https://lkml.kernel.org/r/20230519101840.v5.12.I91f7277bab4bf8c0cb238732ed92e7ce7bbd71a6@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: Tom Rix <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In preparation for the buddy hardlockup detector, which wants the same
petting logic as the current perf hardlockup detector, move the code to
watchdog.c. While doing this, rename the global variable to match others
nearby. As part of this change we have to change the code to account for
the fact that the CPU we're running on might be different than the one
we're checking.
Currently the code in watchdog.c is guarded by
CONFIG_HARDLOCKUP_DETECTOR_PERF, which makes this change seem silly.
However, a future patch will change this.
Link: https://lkml.kernel.org/r/20230519101840.v5.11.I00dfd6386ee00da25bf26d140559a41339b53e57@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In preparation for the buddy hardlockup detector where the CPU checking
for lockup might not be the currently running CPU, add a "cpu" parameter
to watchdog_hardlockup_check().
As part of this change, make hrtimer_interrupts an atomic_t since now the
CPU incrementing the value and the CPU reading the value might be
different. Technially this could also be done with just READ_ONCE and
WRITE_ONCE, but atomic_t feels a little cleaner in this case.
While hrtimer_interrupts is made atomic_t, we change
hrtimer_interrupts_saved from "unsigned long" to "int". The "int" is
needed to match the data type backing atomic_t for hrtimer_interrupts.
Even if this changes us from 64-bits to 32-bits (which I don't think is
true for most compilers), it doesn't really matter. All we ever do is
increment it every few seconds and compare it to an old value so 32-bits
is fine (even 16-bits would be). The "signed" vs "unsigned" also doesn't
matter for simple equality comparisons.
hrtimer_interrupts_saved is _not_ switched to atomic_t nor even accessed
with READ_ONCE / WRITE_ONCE. The hrtimer_interrupts_saved is always
consistently accessed with the same CPU. NOTE: with the upcoming "buddy"
detector there is one special case. When a CPU goes offline/online then
we can change which CPU is the one to consistently access a given instance
of hrtimer_interrupts_saved. We still can't end up with a partially
updated hrtimer_interrupts_saved, however, because we end up petting all
affected CPUs to make sure the new and old CPU can't end up somehow
read/write hrtimer_interrupts_saved at the same time.
Link: https://lkml.kernel.org/r/20230519101840.v5.10.I3a7d4dd8c23ac30ee0b607d77feb6646b64825c0@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The perf hardlockup detector works by looking at interrupt counts and
seeing if they change from run to run. The interrupt counts are managed
by the common watchdog code via its watchdog_timer_fn().
Currently the API between the perf detector and the common code is a
function: is_hardlockup(). When the hard lockup detector sees that
function return true then it handles printing out debug info and inducing
a panic if necessary.
Let's change the API a little bit in preparation for the buddy hardlockup
detector. The buddy hardlockup detector wants to print nearly the same
debug info and have nearly the same panic behavior. That means we want to
move all that code to the common file. For now, the code in the common
file will only be there if the perf hardlockup detector is enabled, but
eventually it will be selected by a common config.
Right now, this _just_ moves the code from the perf detector file to the
common file and changes the names. It doesn't make the changes that the
buddy hardlockup detector will need and doesn't do any style cleanups. A
future patch will do cleanup to make it more obvious what changed.
With the above, we no longer have any callers of is_hardlockup() outside
of the "watchdog.c" file, so we can remove it from the header, make it
static, and move it to the same "#ifdef" block as our new
watchdog_hardlockup_check(). While doing this, it can be noted that even
if no hardlockup detectors were configured the existing code used to still
have the code for counting/checking "hrtimer_interrupts" even if the perf
hardlockup detector wasn't configured. We didn't need to do that, so move
all the "hrtimer_interrupts" counting to only be there if the perf
hardlockup detector is configured as well.
This change is expected to be a no-op.
Link: https://lkml.kernel.org/r/20230519101840.v5.8.Id4133d3183e798122dc3b6205e7852601f289071@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In preparation for the buddy hardlockup detector, add comments to
touch_nmi_watchdog() to make it obvious that it touches the configured
hardlockup detector regardless of whether it's backed by an NMI. Also
note that arch_touch_nmi_watchdog() may not be architecture-specific.
Ideally, we'd like to rename these functions but that is a fairly
disruptive change touching a lot of drivers. After discussion [1] the
plan is to defer this until a good time.
[1] https://lore.kernel.org/r/ZFy0TX1tfhlH8gxj@alley
[[email protected]: comment changes, per Petr]
Link: https://lkml.kernel.org/r/ZGyONWPXpE1DcxA5@alley
Link: https://lkml.kernel.org/r/20230519101840.v5.6.I4e47cbfa1bb2ebbcdb5ca16817aa2887f15dc82c@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Nobody cares about the return value of watchdog_nmi_enable(), changing its
prototype to void.
Link: https://lkml.kernel.org/r/20230519101840.v5.4.Ic3a19b592eb1ac4c6f6eade44ffd943e8637b6e5@changeid
Signed-off-by: Pingfan Liu <[email protected]>
Signed-off-by: Lecopzer Chen <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Acked-by: Nicholas Piggin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
config
Patch series "watchdog/hardlockup: Add the buddy hardlockup detector", v5.
This patch series adds the "buddy" hardlockup detector. In brief, the
buddy hardlockup detector can detect hardlockups without arch-level
support by having CPUs checkup on a "buddy" CPU periodically.
Given the new design of this patch series, testing all combinations is
fairly difficult. I've attempted to make sure that all combinations of
CONFIG_ options are good, but it wouldn't surprise me if I missed
something. I apologize in advance and I'll do my best to fix any
problems that are found.
This patch (of 18):
The real watchdog_update_hrtimer_threshold() is defined in
kernel/watchdog_hld.c. That file is included if
CONFIG_HARDLOCKUP_DETECTOR_PERF and the function is defined in that file
if CONFIG_HARDLOCKUP_CHECK_TIMESTAMP.
The dummy version of the function in "nmi.h" didn't get that quite right.
While this doesn't appear to be a huge deal, it's nice to make it
consistent.
It doesn't break builds because CHECK_TIMESTAMP is only defined by x86 so
others don't get a double definition, and x86 uses perf lockup detector,
so it gets the out of line version.
Link: https://lkml.kernel.org/r/20230519101840.v5.18.Ia44852044cdcb074f387e80df6b45e892965d4a1@changeid
Link: https://lkml.kernel.org/r/20230519101840.v5.1.I8cbb2f4fa740528fcfade4f5439b6cdcdd059251@changeid
Fixes: 7edaeb6841df ("kernel/watchdog: Prevent false positives with turbo modes")
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masayoshi Mizuma <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Sumit Garg <[email protected]>
Cc: Tzung-Bi Shih <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Colin Cross <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
A number of internal functions in kcov are only called from generated code
and don't technically need a declaration, but 'make W=1' warns about
global symbols without a prototype:
kernel/kcov.c:199:14: error: no previous prototype for '__sanitizer_cov_trace_pc' [-Werror=missing-prototypes]
kernel/kcov.c:264:14: error: no previous prototype for '__sanitizer_cov_trace_cmp1' [-Werror=missing-prototypes]
kernel/kcov.c:270:14: error: no previous prototype for '__sanitizer_cov_trace_cmp2' [-Werror=missing-prototypes]
kernel/kcov.c:276:14: error: no previous prototype for '__sanitizer_cov_trace_cmp4' [-Werror=missing-prototypes]
kernel/kcov.c:282:14: error: no previous prototype for '__sanitizer_cov_trace_cmp8' [-Werror=missing-prototypes]
kernel/kcov.c:288:14: error: no previous prototype for '__sanitizer_cov_trace_const_cmp1' [-Werror=missing-prototypes]
kernel/kcov.c:295:14: error: no previous prototype for '__sanitizer_cov_trace_const_cmp2' [-Werror=missing-prototypes]
kernel/kcov.c:302:14: error: no previous prototype for '__sanitizer_cov_trace_const_cmp4' [-Werror=missing-prototypes]
kernel/kcov.c:309:14: error: no previous prototype for '__sanitizer_cov_trace_const_cmp8' [-Werror=missing-prototypes]
kernel/kcov.c:316:14: error: no previous prototype for '__sanitizer_cov_trace_switch' [-Werror=missing-prototypes]
Adding prototypes for these in a header solves that problem, but now there
is a mismatch between the built-in type and the prototype on 64-bit
architectures because they expect some functions to take a 64-bit
'unsigned long' argument rather than an 'unsigned long long' u64 type:
include/linux/kcov.h:84:6: error: conflicting types for built-in function '__sanitizer_cov_trace_switch'; expected 'void(long long unsigned int, void *)' [-Werror=builtin-declaration-mismatch]
84 | void __sanitizer_cov_trace_switch(u64 val, u64 *cases);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Avoid this as well with a custom type definition.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Rong Tao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The arch_get_vdso_data() function is defined separately on each
architecture, but only called when CONFIG_TIME_NS is set. If the
definition is a global function, this causes a W=1 warning without
TIME_NS:
arch/x86/entry/vdso/vma.c:35:19: error: no previous prototype for 'arch_get_vdso_data' [-Werror=missing-prototypes]
Move the prototype out of the #ifdef block to reliably turn off that
warning.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
There are a few __weak functions in kernel/fork.c, which architectures
can override. If there is no prototype, the compiler warns about them:
kernel/fork.c:164:13: error: no previous prototype for 'arch_release_task_struct' [-Werror=missing-prototypes]
kernel/fork.c:991:20: error: no previous prototype for 'arch_task_cache_init' [-Werror=missing-prototypes]
kernel/fork.c:1086:12: error: no previous prototype for 'arch_dup_task_struct' [-Werror=missing-prototypes]
There are already prototypes in a number of architecture specific headers
that have addressed those warnings before, but it's much better to have
these in a single place so the warning no longer shows up anywhere.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
cifs_root_data() is defined in cifs and called from early init code, but
lacks a global prototype:
fs/cifs/cifsroot.c:83:12: error: no previous prototype for 'cifs_root_data'
Move the declaration from do_mounts.c into an appropriate header.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The init/main.c file contains some extern declarations for functions
defined in architecture code, and it defines some other functions that are
called from architecture code with a custom prototype. Both of those
result in warnings with 'make W=1':
init/calibrate.c:261:37: error: no previous prototype for 'calibrate_delay_is_known' [-Werror=missing-prototypes]
init/main.c:790:20: error: no previous prototype for 'mem_encrypt_init' [-Werror=missing-prototypes]
init/main.c:792:20: error: no previous prototype for 'poking_init' [-Werror=missing-prototypes]
arch/arm64/kernel/irq.c:122:13: error: no previous prototype for 'init_IRQ' [-Werror=missing-prototypes]
arch/arm64/kernel/time.c:55:13: error: no previous prototype for 'time_init' [-Werror=missing-prototypes]
arch/x86/kernel/process.c:935:13: error: no previous prototype for 'arch_post_acpi_subsys_init' [-Werror=missing-prototypes]
init/calibrate.c:261:37: error: no previous prototype for 'calibrate_delay_is_known' [-Werror=missing-prototypes]
kernel/fork.c:991:20: error: no previous prototype for 'arch_task_cache_init' [-Werror=missing-prototypes]
Add prototypes for all of these in include/linux/init.h or another
appropriate header, and remove the duplicate declarations from
architecture specific code.
[[email protected]: declare time_init_early()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
A few panic() related functions have a global definition but not
declaration, which causes a warning with W=1:
kernel/panic.c:710:6: error: no previous prototype for '__warn_printk' [-Werror=missing-prototypes]
kernel/panic.c:756:24: error: no previous prototype for '__stack_chk_fail' [-Werror=missing-prototypes]
kernel/exit.c:1917:32: error: no previous prototype for 'abort' [-Werror=missing-prototypes]
__warn_printk() is called both as a global function when CONFIG_BUG
is enabled, and as a local function in other configs. The other
two here are called indirectly from generated or assembler code.
Add prototypes for all of these.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The __kernel_map_pages() function is mainly used for
CONFIG_DEBUG_PAGEALLOC, but has a number of architecture specific
definitions that may also be used in other configurations, as well as a
global fallback definition for architectures that do not support
DEBUG_PAGEALLOC.
When the option is disabled, any definitions without the prototype cause a
warning:
mm/page_poison.c:102:6: error: no previous prototype for '__kernel_map_pages' [-Werror=missing-prototypes]
The function is a trivial nop here, so just declare it anyway
to avoid the warning.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "mm/init/kernel: missing-prototypes warnings".
These are patches addressing -Wmissing-prototypes warnings in common
kernel code and memory management code files that usually get merged
through the -mm tree.
This patch (of 12):
This function is called whenever CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK or
CONFIG_HAVE_SETUP_PER_CPU_AREA, but only declared when the former is set:
mm/percpu.c:3055:12: error: no previous prototype for 'pcpu_embed_first_chunk' [-Werror=missing-prototypes]
There is no real point in hiding declarations, so just remove
the #ifdef here.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add signed intptr_t given that a) it is standard type and b) uintptr_t is
in tree.
Link: https://lkml.kernel.org/r/ed66b9e4-1fb7-45be-9bb9-d4bc291c691f@p183
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "mm/gup: disallow GUP writing to file-backed mappings by
default", v9.
Writing to file-backed mappings which require folio dirty tracking using
GUP is a fundamentally broken operation, as kernel write access to GUP
mappings do not adhere to the semantics expected by a file system.
A GUP caller uses the direct mapping to access the folio, which does not
cause write notify to trigger, nor does it enforce that the caller marks
the folio dirty.
The problem arises when, after an initial write to the folio, writeback
results in the folio being cleaned and then the caller, via the GUP
interface, writes to the folio again.
As a result of the use of this secondary, direct, mapping to the folio no
write notify will occur, and if the caller does mark the folio dirty, this
will be done so unexpectedly.
For example, consider the following scenario:-
1. A folio is written to via GUP which write-faults the memory, notifying
the file system and dirtying the folio.
2. Later, writeback is triggered, resulting in the folio being cleaned and
the PTE being marked read-only.
3. The GUP caller writes to the folio, as it is mapped read/write via the
direct mapping.
4. The GUP caller, now done with the page, unpins it and sets it dirty
(though it does not have to).
This change updates both the PUP FOLL_LONGTERM slow and fast APIs. As
pin_user_pages_fast_only() does not exist, we can rely on a slightly
imperfect whitelisting in the PUP-fast case and fall back to the slow case
should this fail.
This patch (of 3):
vma_wants_writenotify() is specifically intended for setting PTE page
table flags, accounting for existing page table flag state and whether the
underlying filesystem performs dirty tracking for a file-backed mapping.
Everything is predicated firstly on whether the mapping is shared
writable, as this is the only instance where dirty tracking is pertinent -
MAP_PRIVATE mappings will always be CoW'd and unshared, and read-only
file-backed shared mappings cannot be written to, even with FOLL_FORCE.
All other checks are in line with existing logic, though now separated
into checks eplicitily for dirty tracking and those for determining how to
set page table flags.
We make this change so we can perform checks in the GUP logic to determine
which mappings might be problematic when written to.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/0f218370bd49b4e6bbfbb499f7c7b92c26ba1ceb.1683235180.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Mika Penttilä <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Kirill A . Shutemov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|