Age | Commit message (Collapse) | Author | Files | Lines |
|
Arm SCMI spec. v3.2, s4.5.3.4 PERFORMANCE_DOMAIN_ATTRIBUTES
defines a per-domain rate_limit for performance requests:
"""
Rate Limit in microseconds, indicating the minimum time
required between successive requests. A value of 0
indicates that this field is not supported by the
platform. This field does not apply to FastChannels.
""""
The field is first defined in SCMI v1.0.
Add support to fetch this value and advertise it through
a rate_limit_get() callback.
Signed-off-by: Pierre Gondois <[email protected]>
Reviewed-by: Cristian Marussi <[email protected]>
Acked-by: Sudeep Holla <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
|
|
The UEFI loads a lite variant of the ADSP firmware to support charging
use cases. The kernel needs to unload and reload it with the firmware
that has full feature support for audio. This patch arbitarily shutsdown
the lite firmware before loading the full firmware.
Signed-off-by: Sibi Sankar <[email protected]>
Signed-off-by: Abel Vesa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Add support for PIL loading on ADSP and CDSP on X1E80100 SoCs.
Signed-off-by: Sibi Sankar <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Abel Vesa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
For loops with multiple initializers and increments are hard to read
and reason about, simplify this by using the looping index to index
into the hwspinlock array.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
This will unregister the HW spinlock on module exit automatically for us,
currently we manually unregister which can be error-prone if not done in
the right order. This also allows us to remove the remove callback.
Do that here.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
This disables runtime PM on module exit automatically for us, currently
we manually disable runtime PM which can be error-prone if not done
in the right order or missed in some exit path. This also allows us
to simplify the probe exit path and remove callbacks. Do that here.
While here, as we can now return right after registering our hwspinlock,
simply return directly and remove the extra debug message.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
We do not use the OF node anymore, nor does it matter how
we got to probe, so remove the check for of_node.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Use the device lifecycle managed allocation function. This helps prevent
mistakes like freeing out of order in cleanup functions and forgetting to
free on error paths.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Use the device lifecycle managed allocation function. This helps prevent
mistakes like freeing out of order in cleanup functions and forgetting to
free on error paths.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Use the device lifecycle managed allocation function. This helps prevent
mistakes like freeing out of order in cleanup functions and forgetting to
free on error paths.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Use the device lifecycle managed allocation function. This helps prevent
mistakes like freeing out of order in cleanup functions and forgetting to
free on error paths.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Use the device lifecycle managed allocation function. This helps prevent
mistakes like freeing out of order in cleanup functions and forgetting to
free on error paths.
Signed-off-by: Andrew Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
Correct indentation of several struct adsp_data instances to always use
a single TAB character instead of two.
Signed-off-by: Dmitry Baryshkov <[email protected]>
Reviewed-by: Konrad Dybcio <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
The enabling/disabling of EEE in the MAC should happen as a result of
auto negotiation. So move the enable/disable into
fec_enet_adjust_link() which gets called by phylib when there is a
change in link status.
fec_enet_set_eee() now just stores away the LPI timer value.
Everything else is passed to phylib, so it can correctly setup the
PHY.
fec_enet_get_eee() relies on phylib doing most of the work,
the MAC driver just adds the LPI timer value.
Call phy_support_eee() if the quirk is present to indicate the MAC
actually supports EEE.
Signed-off-by: Andrew Lunn <[email protected]>
Tested-by: Oleksij Rempel <[email protected]> (On iMX8MP debix)
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Wei Fang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
FEC is about to get its EEE code re-written. To allow this, move
fec_enet_eee_mode_set() before fec_enet_adjust_link() which will
need to call it.
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Wei Fang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In order for EEE to operate, both the MAC and the PHY need to support
it, similar to how pause works. With some exception - a number of PHYs
have SmartEEE or AutoGrEEEn support in order to provide some EEE-like
power savings with non-EEE capable MACs.
Copy the pause concept and add the call phy_support_eee() which the MAC
makes after connecting the PHY to indicate it supports EEE. phylib will
then advertise EEE when auto-neg is performed.
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The MAC driver changes its EEE hardware configuration in its
adjust_link callback. This is called when auto-neg
completes. Disabling EEE via eee_enabled false will trigger an
autoneg, and as a result the adjust_link callback will be called with
phydev->enable_tx_lpi set to false. Similarly, eee_enabled set to true
and with a change of advertised link modes will result in a new
autoneg, and a call the adjust_link call.
If set_eee is called with only a change to tx_lpi_enabled which does
not trigger an auto-neg, it is necessary to call the adjust_link
callback so that the MAC is reconfigured to take this change into
account.
When setting phydev->enable_tx_lpi, take both eee_enabled and
tx_lpi_enabled into account, so the MAC drivers just needs to act on
phydev->enable_tx_lpi and not the whole EEE configuration.
The same check should be done for tx_lpi_timer too.
Signed-off-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Have phylib keep track of the EEE configuration. This simplifies the
MAC drivers, in that they don't need to store it.
Future patches to phylib will also make use of this information to
further simplify the MAC drivers.
Reviewed-by: Russell King (Oracle) <[email protected]>
Signed-off-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
MAC drivers which support EEE need to know the results of the EEE
auto-neg in order to program the hardware to perform EEE or not. The
oddly named phy_init_eee() can be used to determine this, it returns 0
if EEE should be used, or a negative error code,
e.g. -EOPPROTONOTSUPPORT if the PHY does not support EEE or negotiate
resulted in it not being used.
However, many MAC drivers get this wrong. Add phydev->enable_tx_lpi
which indicates the result of the autoneg for EEE, including if EEE is
administratively disabled with ethtool. The MAC driver can then access
this in the same way as link speed and duplex in the adjust link
callback. If enable_tx_lpi is true, the MAC should send low power
indications and does not need to consider anything else with respect
to EEE.
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: Oleksij Rempel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This bug was noticed while re-implementing parts of the kernel
driver in userspace using spidev. The goal was to enable some
of the errata workarounds that Microchip describes in their
errata sheet [1].
Both the errata sheet and the regular datasheet of e.g. the KSZ8795
imply that you need to do this for indirect register accesses:
- write a 16-bit value to a control register pair (this value
consists of the indirect register table, and the offset inside
the table)
- either read or write an 8-bit value from the data storage
register (indicated by REG_IND_BYTE in the kernel)
The current implementation has the order swapped. It can be
proven, by reading back some indirect register with known content
(the EEE register modified in ksz8_handle_global_errata() is one of
these), that this implementation does not work.
Private discussion with Oleksij Rempel of Pengutronix has revealed
that the workaround was apparantly never tested on actual hardware.
[1] https://ww1.microchip.com/downloads/aemDocuments/documents/OTH/ProductDocuments/Errata/KSZ87xx-Errata-DS80000687C.pdf
Signed-off-by: Tobias Jakobi (Compleo) <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Fixes: 7b6e6235b664 ("net: dsa: microchip: ksz8795: handle eee specif erratum")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Older versions of GCC really want to know the full definition
of the type involved in rcu_assign_pointer().
struct dpll_pin is defined in a local header, net/core can't
reach it. Move all the netdev <> dpll code into dpll, where
the type is known. Otherwise we'd need multiple function calls
to jump between the compilation units.
This is the same problem the commit under fixes was trying to address,
but with rcu_assign_pointer() not rcu_dereference().
Some of the exports are not needed, networking core can't
be a module, we only need exports for the helpers used by
drivers.
Reported-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type")
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is
configured") moved the callback to dev_get_tstats64() to net core, so,
unless the driver is doing some custom stats collection, it does not
need to set .ndo_get_stats64.
Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it
doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64
function pointer.
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.
With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.
Remove the allocation in the tun/tap driver and leverage the network
core allocation instead.
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
changed the driver from reporting everything as supported before a device
was bonded into having the driver report that no XDP feature is supported
until a real device is bonded as it seems to be more truthful given
eventually real underlying devices decide what XDP features are supported.
The change however did not take into account when all slave devices get
removed from the bond device. In this case after 9b0ed890ac2a, the driver
keeps reporting a feature mask of 0x77, that is, NETDEV_XDP_ACT_MASK &
~NETDEV_XDP_ACT_XSK_ZEROCOPY whereas it should have reported a feature
mask of 0.
Fix it by resetting XDP feature flags in the same way as if no XDP program
is attached to the bond device. This was uncovered by the XDP bond selftest
which let BPF CI fail. After adjusting the starting masks on the latter
to 0 instead of NETDEV_XDP_ACT_MASK the test passes again together with
this fix.
Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Magnus Karlsson <[email protected]>
Cc: Prashant Batra <[email protected]>
Cc: Toke Høiland-Jørgensen <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This property is documented to have a special format which exposes all
available behaviours and the currently active one at the same time. For
this special format some helpers are provided.
When the charge_behaviour property was added in
1b0b6cc8030d ("power: supply: add charge_behaviour attributes"), it did
not update the default logic in in power_supply_sysfs.c to use the
format helpers. Thus by default only the currently active behaviour
is printed. This fixes the default logic to follow the documented
format.
There is currently only one in-tree drivers exposing charge behaviours -
thinkpad_acpi, which is not affected by the change, as it directly uses
the helpers and does not use the power_supply_sysfs.c logic.
Signed-off-by: Thomas Weißschuh <[email protected]>
Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-3-8ebb0a7c2409@weissschuh.net
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
By moving the conditional into the default-branch of the switch new
additions to the switch won't have to bypass the conditional.
This makes it easier to implement those special cases like the upcoming
change to the formatting of "charge_behaviour".
Suggested-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Thomas Weißschuh <[email protected]>
Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-2-8ebb0a7c2409@weissschuh.net
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
The charge_behaviours property is meant as a control-knob that can be
changed by the user.
Page 23 of [0] which documents the flag CHG_INH as follows:
CHG_INH : Charge Inhibit When the current is more than or equal to charge
threshold current,
charge inhibit temperature (upper/lower limit) :1
charge permission temperature or the current is
less than charge threshold current :0
So this is pure read-only information which is better represented as
POWER_SUPPLY_STATUS_NOT_CHARGING.
[0] https://product.minebeamitsumi.com/en/product/category/ics/battery/fuel_gauge/parts/download/__icsFiles/afieldfile/2023/07/12/1_download_01_12.pdf
Signed-off-by: Thomas Weißschuh <[email protected]>
Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-1-8ebb0a7c2409@weissschuh.net
Fixes: e39257cde7e8 ("power: supply: mm8013: Add more properties")
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
As reported by the kernel test robot, 'power_supply_attr_group' is defined
but not used when CONFIG_SYSFS is not set. Sebastian suggested that the
correct fix implemented by this patch, instead of my attempt in commit
ea4367c40c79 ("power: supply: core: move power_supply_attr_group into #ifdef
block"), is to define power_supply_attr_groups in power_supply_sysfs.c and
expose it in the power_supply.h header. For the case where CONFIG_SYSFS=n,
define it as NULL.
Suggested-by: Sebastian Reichel <[email protected]>
Fixes: ea4367c40c79 ("power: supply: core: move power_supply_attr_group into #ifdef block")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Ricardo B. Marliere <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
Technically the sysfs attributes should be initialized
before the class is registered, since that will use them.
As a nice side effect this nicely simplifies the code,
since it allows dropping the helper variable.
Reviewed-by: Ricardo B. Marliere <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
Introduce power_supply_for_each_device(), which is a wrapper
for class_for_each_device() using the power_supply_class and
going through all devices.
This allows making the power_supply_class itself a local
variable, so that drivers cannot mess with it and simplifies
the code slightly.
Reviewed-by: Ricardo B. Marliere <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sebastian Reichel <[email protected]>
|
|
The pinctrl_add_gpio_range() function is deprecated add a comment so
people don't accidentally use it in new code.
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
EPROBE_DEFER error returns are not really critical, since they cancel
the probe process, but the kernel will return later and retry.
However, depending on the probe order, this might issue quite some
verbatim and scary, though pointless messages:
[ 2.388731] 300b000.pinctrl: pin-224 (5000000.serial) status -517
[ 2.397321] 300b000.pinctrl: could not request pin 224 (PH0) from group PH0 on device 300b000.pinctrl
Replace dev_err() with dev_err_probe(), which not only drops the
priority of the message from error to debug, but also puts some text
into debugfs' devices_deferred file, for later reference.
Signed-off-by: Andre Przywara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
The Awinic AW9523(B) is a multi-function I2C gpio expander in a
TQFN-24L package, featuring PWM (max 37mA per pin, or total max
power 3.2Watts) for LED driving capability.
It has two ports with 8 pins per port (for a total of 16 pins),
configurable as either PWM with 1/256 stepping or GPIO input/output,
1.8V logic input; each GPIO can be configured as input or output
independently from each other.
This IC also has an internal interrupt controller, which is capable
of generating an interrupt for each GPIO, depending on the
configuration, and will raise an interrupt on the INTN pin to
advertise this to an external interrupt controller.
Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
Signed-off-by: David Bauer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The deferred_reset logic was added to vfio migration drivers to prevent
a circular locking dependency with respect to mm_lock and state mutex.
This is mainly because of the copy_to/from_user() functions(which takes
mm_lock) invoked under state mutex. But for HiSilicon driver, the only
place where we now hold the state mutex for copy_to_user is during the
PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
updated the data and perform copy_to_user without state mutex. By this,
we can get rid of the deferred_reset logic.
Link: https://lore.kernel.org/kvm/[email protected]/
Signed-off-by: Shameer Kolothum <[email protected]>
Reviewed-by: Brett Creeley <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
pci_dev_resource_resize_attr(n) macro is invoked for six resources,
creating a large footprint function for each resource.
Rework the macro to only create a function that calls a helper function so
the compiler can decide if it warrants to inline the function or not.
With x86_64 defconfig, this saves roughly 2.5kB:
$ scripts/bloat-o-meter drivers/pci/pci-sysfs.o{.old,.new}
add/remove: 1/0 grow/shrink: 0/6 up/down: 512/-2934 (-2422)
Function old new delta
__resource_resize_store - 512 +512
resource5_resize_store 503 14 -489
resource4_resize_store 503 14 -489
resource3_resize_store 503 14 -489
resource2_resize_store 503 14 -489
resource1_resize_store 503 14 -489
resource0_resize_store 500 11 -489
Total: Before=13399, After=10977, chg -18.08%
(The compiler seemingly chose to still inline __resource_resize_show()
which is fine, those functions are not very complex/large.)
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
Commit d9c8bea179a6 ("PCI: Remove unused IORESOURCE_ROM_COPY and
IORESOURCE_ROM_BIOS_COPY") removed pci_cleanup_rom(), but retained
its declaration in pci.h.
Remove it.
Link: https://lore.kernel.org/r/fc30de5276e21d5a3ebcb7e58a8b43e399f7e6e6.1698668982.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
|
|
It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for
space-constrained devices such as routers, such a configuration may
actually make sense.
However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled,
unnecessarily increasing the kernel's size.
To rectify that:
* Move pci_mmap_fits() to mmap.c. It is not only needed by
pci-sysfs.c, but also proc.c.
* Move pci_dev_type to probe.c and make it private. It references
pci_dev_attr_groups in pci-sysfs.c. Make that public instead for
consistency with pci_dev_groups, pcibus_groups and pci_bus_groups,
which are likewise public and referenced by struct definitions in
pci-driver.c and probe.c.
* Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and
pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty
static inlines for pci_{create,remove}_legacy_files() and
pci_{create,remove}_sysfs_dev_files().
Result:
vmlinux size is reduced by 122996 bytes in my arm 32-bit test build.
Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
|
|
This is the second half of fixes for dmraid. The first half is available
at [1].
This set contains fixes:
- reshape can start unexpected, cause data corruption, patch 1,5,6;
- deadlocks that reshape concurrent with IO, patch 8;
- a lockdep warning, patch 9;
For all the dmraid related tests in lvm2 suite, there is no new
regressions compared against 6.6 kernels (which is good baseline before
recent regressions).
[1] https://lore.kernel.org/all/CAPhsuW7u1UKHCDOBDhD7DzOVtkGemDz_QnJ4DUq_kSN-Q3G66Q@mail.gmail.com/
* dmraid-fix-6.9:
dm-raid: fix lockdep waring in "pers->hot_add_disk"
dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape
dm-raid: add a new helper prepare_suspend() in md_personality
md/dm-raid: don't call md_reap_sync_thread() directly
dm-raid: really frozen sync_thread during suspend
md: add a new helper reshape_interrupted()
md: export helper md_is_rdwr()
md: export helpers to stop sync_thread
md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume
|
|
The lockdep assert is added by commit a448af25becf ("md/raid10: remove
rcu protection to access rdev from conf") in print_conf(). And I didn't
notice that dm-raid is calling "pers->hot_add_disk" without holding
'reconfig_mutex'.
"pers->hot_add_disk" read and write many fields that is protected by
'reconfig_mutex', and raid_resume() already grab the lock in other
contex. Hence fix this problem by protecting "pers->host_add_disk"
with the lock.
Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume")
Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf")
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
with reshape
For raid456, if reshape is still in progress, then IO across reshape
position will wait for reshape to make progress. However, for dm-raid,
in following cases reshape will never make progress hence IO will hang:
1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;
After commit c467e97f079f ("md/raid6: use valid sector values to determine
if an I/O should wait on the reshape") fix the problem that IO across
reshape position doesn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh start to hang:
[root@fedora ~]# cat /proc/979/stack
[<0>] wait_woken+0x7d/0x90
[<0>] raid5_make_request+0x929/0x1d70 [raid456]
[<0>] md_handle_request+0xc2/0x3b0 [md_mod]
[<0>] raid_map+0x2c/0x50 [dm_raid]
[<0>] __map_bio+0x251/0x380 [dm_mod]
[<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
[<0>] __submit_bio+0xc2/0x1c0
[<0>] submit_bio_noacct_nocheck+0x17f/0x450
[<0>] submit_bio_noacct+0x2bc/0x780
[<0>] submit_bio+0x70/0xc0
[<0>] mpage_readahead+0x169/0x1f0
[<0>] blkdev_readahead+0x18/0x30
[<0>] read_pages+0x7c/0x3b0
[<0>] page_cache_ra_unbounded+0x1ab/0x280
[<0>] force_page_cache_ra+0x9e/0x130
[<0>] page_cache_sync_ra+0x3b/0x110
[<0>] filemap_get_pages+0x143/0xa30
[<0>] filemap_read+0xdc/0x4b0
[<0>] blkdev_read_iter+0x75/0x200
[<0>] vfs_read+0x272/0x460
[<0>] ksys_read+0x7a/0x170
[<0>] __x64_sys_read+0x1c/0x30
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74
This is because reshape can't make progress.
For md/raid, the problem doesn't exist because register new sync_thread
doesn't rely on the IO to be done any more:
1) If array is read-only, it can switch to read-write by ioctl/sysfs;
2) md/raid never set MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
'reconfig_mutex', hence it can be cleared and reshape can continue by
sysfs api 'sync_action'.
However, I'm not sure yet how to avoid the problem in dm-raid yet. This
patch on the one hand make sure raid_message() can't change
sync_thread() through raid_message() after presuspend(), on the other
hand detect the above 3 cases before wait for IO do be done in
dm_suspend(), and let dm-raid requeue those IO.
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
There are no functional changes for now, prepare to fix a deadlock for
dm-raid456.
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Currently md_reap_sync_thread() is called from raid_message() directly
without holding 'reconfig_mutex', this is definitely unsafe because
md_reap_sync_thread() can change many fields that is protected by
'reconfig_mutex'.
However, hold 'reconfig_mutex' here is still problematic because this
will cause deadlock, for example, commit 130443d60b1b ("md: refactor
idle/frozen_sync_thread() to fix deadlock").
Fix this problem by using stop_sync_thread() to unregister sync_thread,
like md/raid did.
Fixes: be83651f0050 ("DM RAID: Add message/status support for changing sync action")
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
1) commit f52f5c71f3d4 ("md: fix stopping sync thread") remove
MD_RECOVERY_FROZEN from __md_stop_writes() and doesn't realize that
dm-raid relies on __md_stop_writes() to frozen sync_thread
indirectly. Fix this problem by adding MD_RECOVERY_FROZEN in
md_stop_writes(), and since stop_sync_thread() is only used for
dm-raid in this case, also move stop_sync_thread() to
md_stop_writes().
2) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen,
it only prevent new sync_thread to start, and it can't stop the
running sync thread; In order to frozen sync_thread, after seting the
flag, stop_sync_thread() should be used.
3) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, use
it as condition for md_stop_writes() in raid_postsuspend() doesn't
look correct. Consider that reentrant stop_sync_thread() do nothing,
always call md_stop_writes() in raid_postsuspend().
4) raid_message can set/clear the flag MD_RECOVERY_FROZEN at anytime,
and if MD_RECOVERY_FROZEN is cleared while the array is suspended,
new sync_thread can start unexpected. Fix this by disallow
raid_message() to change sync_thread status during suspend.
Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the
test shell/lvconvert-raid-reshape.sh start to hang in stop_sync_thread(),
and with previous fixes, the test won't hang there anymore, however, the
test will still fail and complain that ext4 is corrupted. And with this
patch, the test won't hang due to stop_sync_thread() or fail due to ext4
is corrupted anymore. However, there is still a deadlock related to
dm-raid456 that will be fixed in following patches.
Reported-by: Mikulas Patocka <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Fixes: 1af2048a3e87 ("dm raid: fix deadlock caused by premature md_stop_writes()")
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The helper will be used for dm-raid456 later to detect the case that
reshape can't make progress.
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
There are no functional changes for now, prepare to fix a deadlock for
dm-raid456.
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add new helpers:
void md_idle_sync_thread(struct mddev *mddev);
void md_frozen_sync_thread(struct mddev *mddev);
void md_unfrozen_sync_thread(struct mddev *mddev);
The helpers will be used in dm-raid in later patches to fix regressions
and prevent calling md_reap_sync_thread() directly.
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
After commit 9dbd1aa3a81c ("dm raid: add reshaping support to the
target") raid_ctr() will set MD_RECOVERY_FROZEN before md_run() and
expect to keep array frozen until resume. However, md_run() will clear
the flag by setting mddev->recovery to 0.
Before commit 1baae052cccd ("md: Don't ignore suspended array in
md_check_recovery()"), dm-raid actually relied on suspending to prevent
starting new sync_thread.
Fix this problem by keeping 'MD_RECOVERY_FROZEN' for dm-raid in
md_run().
Fixes: 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery()")
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target")
Cc: [email protected] # v6.7+
Signed-off-by: Yu Kuai <[email protected]>
Signed-off-by: Xiao Ni <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d.
The original set [1][2] was expected to undo a suboptimal fix in [2], and
replace it with a better fix [1]. However, as reported by Dan Moulding [2]
causes an issue with raid5 with journal device.
Revert [2] for now to close the issue. We will follow up on another issue
reported by Juxiao Bi, as [2] is expected to fix it. We believe this is a
good trade-off, because the latter issue happens less freqently.
In the meanwhile, we will NOT revert [1], as it contains the right logic.
[1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
[2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
Reported-by: Dan Moulding <[email protected]>
Closes: https://lore.kernel.org/linux-raid/[email protected]/
Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
Cc: [email protected] # v5.19+
Cc: Junxiao Bi <[email protected]>
Cc: Yu Kuai <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Reviewed-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
- Fix P2SB regression causing ACPI errors and high CPU load
- Fix error return path in amd_pmf_init_smart_pc()
* tag 'platform-drivers-x86-v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/amd/pmf: Fix missing error code in amd_pmf_init_smart_pc()
platform/x86: p2sb: On Goldmont only cache P2SB and SPI devfn BAR
|
|
Exynos850 has the same version of USI SPI (v2.1) as GS101.
Drop the fifo_lvl_mask and rx_lvl_offset and switch to the new port
config data.
Backward compatibility with DT is not broken because when alises are
set:
- the SPI core will set the bus number according to the alias ID
- the FIFO depth is always the same size for exynos850 (64 bytes) no
matter the alias ID number.
Advantages of the change:
- drop dependency on the OF alias ID.
- FIFO depth is inferred from the compatible. Exynos850 integrates 3 SPI
IPs, all with 64 bytes FIFO depths.
- use full mask for SPI_STATUS.{RX, TX}_FIFO_LVL fields. Using partial
masks is misleading and can hide problems of the driver logic.
Just compiled tested.
Signed-off-by: Tudor Ambarus <[email protected]>
Link: https://msgid.link/r/[email protected]
Tested-by: Sam Protsenko <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
|