|
The comparison of the unsigned int hw_type to less than zero is always
false because it is unsigned. Fix this by using an int for the
assignment and the less than zero check.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 9d2df9a0ad80 ("ipmi: kcs_bmc_aspeed: Implement KCS SerIRQ configuration")
Signed-off-by: Colin Ian King <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
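The pattern behind this fix can be shown outside the kernel. What follows is a minimal userspace sketch, not the driver code: `parse_hw_type()` and `configure_hw_type()` are hypothetical names, and only the idea of checking a signed temporary comes from the commit message.

```c
#include <errno.h>

/* Hypothetical lookup: returns a non-negative type, or -EINVAL on failure. */
static int parse_hw_type(int raw)
{
	return (raw >= 0 && raw <= 2) ? raw : -EINVAL;
}

/*
 * Keep the result in a signed int until the error check is done. Had it
 * been stored straight into an unsigned int, -EINVAL would wrap to a large
 * positive value and a "< 0" test could never be true.
 */
static int configure_hw_type(int raw, unsigned int *hw_type)
{
	int rc = parse_hw_type(raw);

	if (rc < 0)
		return rc;

	*hw_type = (unsigned int)rc;
	return 0;
}
```

On failure the caller's `hw_type` is left untouched, so the error cannot be mistaken for a valid type.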
|
|
Some Aspeed KCS devices can derive the status register address from the
address of the data register. As such, the address of the status
register can be implicit in the configuration if desired. On the other
hand, sometimes address schemes might be requested that are incompatible
with the default addressing scheme. Allow these requests where possible
if the devicetree specifies the status register address.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Chia-Wei Wang <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Input Buffer Full Interrupt Enable (IBFIE) is typoed as IBFIF for some
registers in the datasheet. Fix the driver to use the sensible acronym.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Apply the SerIRQ ID and level/sense behaviours from the devicetree if
provided.
Signed-off-by: Andrew Jeffery <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Allocating IO and IRQ resources to LPC devices is in theory an operation
for the host, however ASPEED don't appear to expose this capability
outside the BMC (e.g. SuperIO). Instead, we are left with BMC-internal
registers for managing these resources, so introduce a devicetree
property for KCS devices to describe SerIRQ properties.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Given the deprecated binding, improve the ability to detect issues in
the platform devicetrees. Further, a subsequent patch will introduce a
new interrupts property for specifying SerIRQ behaviour, so convert
before we do any further additions.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
kcs_bmc_serio acts as a bridge between the KCS drivers in the IPMI
subsystem and the existing userspace interfaces available through the
serio subsystem. This is useful when userspace would like to make use of
the BMC KCS devices for purposes that aren't IPMI.
Signed-off-by: Andrew Jeffery <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
This way devices don't get delivered IRQs when no-one is interested.
Signed-off-by: Andrew Jeffery <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Add a mechanism for controlling whether the client associated with a
KCS device will receive Input Buffer Full (IBF) and Output Buffer Empty
(OBE) events. This enables an abstract implementation of poll() for KCS
devices.
A wart in the implementation is that the ASPEED KCS devices don't
support an OBE interrupt for the BMC. Instead we pretend it has one by
polling the status register waiting for the Output Buffer Full (OBF) bit
to clear, and generating an event when OBE is observed.
Cc: CS20 KWLiu <[email protected]>
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
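The emulated OBE event described above can be modelled as one polling step. This is a sketch under assumptions: `struct fake_kcs` and `fake_kcs_poll_obe()` are hypothetical, and the real driver reads the status register through its own accessors and reschedules itself rather than returning a flag. OBF as bit 0 of the KCS status register follows the IPMI KCS interface definition.

```c
#include <stdbool.h>
#include <stdint.h>

#define KCS_STR_OBF (1u << 0)	/* Output Buffer Full, bit 0 of the status register */

/* Hypothetical device model; field names are illustrative only. */
struct fake_kcs {
	uint8_t str;			/* last value read from the status register */
	bool obe_enabled;		/* client asked for OBE events */
	unsigned int obe_events;	/* events delivered so far */
};

/*
 * One polling step. Returns true when polling should continue and false
 * when it can stop (event delivered, or nobody is listening).
 */
static bool fake_kcs_poll_obe(struct fake_kcs *kcs)
{
	if (!kcs->obe_enabled)
		return false;

	if (kcs->str & KCS_STR_OBF)
		return true;	/* host has not consumed the byte yet */

	kcs->obe_events++;	/* OBF cleared: report "output buffer empty" */
	return false;
}
```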
|
|
Now that we have untangled the data-structures, split the userspace
interface out into its own module. Userspace interfaces and drivers are
registered to the KCS BMC core to support arbitrary binding of either.
Signed-off-by: Andrew Jeffery <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Move all client-private data out of `struct kcs_bmc` into the KCS client
implementation.
With this change the KCS BMC core code now only concerns itself with
abstract `struct kcs_bmc` and `struct kcs_bmc_client` types, achieving
expected separation of concerns. Further, the change clears the path for
implementation of alternative userspace interfaces.
The chardev data-structures are rearranged in the same manner applied to
the KCS device driver data-structures in an earlier patch - `struct
kcs_bmc_client` is embedded in the client's private data and we exploit
container_of() to translate as required.
Finally, now that it is free of client data, `struct kcs_bmc` is renamed
to `struct kcs_bmc_device` to contrast `struct kcs_bmc_client`.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Strengthen the distinction between code that abstracts the
implementation of the KCS behaviours (device drivers) and code that
exploits KCS behaviours (clients). Neither needs to know about the APIs
required by the other, so provide separate headers.
Signed-off-by: Andrew Jeffery <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Make the KCS device drivers responsible for allocating their own memory.
Until now the private data for the device driver was allocated internal
to the private data for the chardev interface. This coupling required
the slightly awkward API of passing through the struct size for the
driver private data to the chardev constructor, and then retrieving a
pointer to the driver private data from the allocated chardev memory.
In addition to being awkward, the arrangement prevents the
implementation of alternative userspace interfaces as the device driver
private data is not independent.
Peel a layer off the onion and turn the data-structures inside out by
exploiting container_of() and embedding `struct kcs_device` in the
driver private data.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
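The "inside out" arrangement can be sketched in userspace. The type names below (`struct core_device`, `struct drv_priv`) are hypothetical stand-ins, and the macro is a simplified equivalent of the kernel's container_of() without its type checking.

```c
#include <stddef.h>

/* Simplified userspace equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the core's abstract device type. */
struct core_device {
	int channel;
};

/* Hypothetical driver-private data that embeds the core type. */
struct drv_priv {
	int irq;
	struct core_device dev;
};

/* The core hands back a struct core_device *; recover the driver's data. */
static struct drv_priv *to_drv_priv(struct core_device *dev)
{
	return container_of(dev, struct drv_priv, dev);
}
```

Because the driver owns the allocation, the core never needs to know the size or layout of the private data.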
|
|
Take steps towards defining a coherent API to separate the KCS device
drivers from the userspace interface. Decreasing the coupling will
improve the separation of concerns and enable the introduction of
alternative userspace interfaces.
For now, simply split the chardev logic out to a separate file. The code
continues to build into the same module.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Rename the functions in preparation for separating the IPMI chardev out
from the KCS BMC core.
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Enable more efficient implementation of read-modify-write sequences.
Both device drivers for the KCS BMC stack use regmaps. The new callback
allows us to exploit regmap_update_bits().
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
Unpack and remove the aspeed_kcs_probe_of_v[12]() functions to aid
rearranging how the private device-driver memory is allocated.
Signed-off-by: Andrew Jeffery <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Zev Weiss <[email protected]>
Signed-off-by: Corey Minyard <[email protected]>
|
|
This reverts commit 4cbbe34807938e6e494e535a68d5ff64edac3f20.
Reason for revert: a side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs to fail to enter gfxoff in certain use cases.
Signed-off-by: Yifan Zhang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
doorbell."
This reverts commit 1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3.
Reason for revert: a side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs to fail to enter gfxoff in certain use cases.
Signed-off-by: Yifan Zhang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Once drm_framebuffer_init has returned 0, the framebuffer is hooked up
to the reference counting machinery and can no longer be destroyed with
a simple kfree. Therefore, it must be called last.
If drm_framebuffer_init returns 0 but its caller then returns non-0,
there will likely be memory corruption fireworks down the road.
The following led me to this fix:
[ 12.891228] kernel BUG at lib/list_debug.c:25!
[...]
[ 12.891263] RIP: 0010:__list_add_valid+0x4b/0x70
[...]
[ 12.891324] Call Trace:
[ 12.891330] drm_framebuffer_init+0xb5/0x100 [drm]
[ 12.891378] amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu]
[ 12.891592] ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu]
[ 12.891794] amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu]
[ 12.891995] drm_internal_framebuffer_create+0x378/0x3f0 [drm]
[ 12.892036] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[ 12.892075] drm_mode_addfb2+0x34/0xd0 [drm]
[ 12.892115] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[ 12.892153] drm_ioctl_kernel+0xe2/0x150 [drm]
[ 12.892193] drm_ioctl+0x3da/0x460 [drm]
[ 12.892232] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[ 12.892274] amdgpu_drm_ioctl+0x43/0x80 [amdgpu]
[ 12.892475] __se_sys_ioctl+0x72/0xc0
[ 12.892483] do_syscall_64+0x33/0x40
[ 12.892491] entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: f258907fdd835e "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Signed-off-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
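The ordering rule (validate first, publish last) can be sketched without DRM. All names here are hypothetical; `fake_fb_init()` merely stands in for a step after which the object is visible elsewhere and must no longer be freed directly.

```c
#include <errno.h>
#include <stdlib.h>

struct fake_fb {
	int registered;
};

static int fake_registered_count;	/* stands in for the shared bookkeeping */

/* Any check that can fail belongs before publication. */
static int fake_fb_verify(size_t bo_size, size_t fb_size)
{
	return bo_size >= fb_size ? 0 : -EINVAL;
}

/* After this returns 0 the object is considered published. */
static int fake_fb_init(struct fake_fb *fb)
{
	fb->registered = 1;
	fake_registered_count++;
	return 0;
}

static struct fake_fb *fake_fb_create(size_t bo_size, size_t fb_size, int *err)
{
	struct fake_fb *fb = calloc(1, sizeof(*fb));

	if (!fb) {
		*err = -ENOMEM;
		return NULL;
	}

	*err = fake_fb_verify(bo_size, fb_size);
	if (*err) {
		free(fb);	/* safe: nothing else knows about fb yet */
		return NULL;
	}

	*err = fake_fb_init(fb);	/* last step, nothing may fail afterwards */
	if (*err) {
		free(fb);	/* in this sketch a failed init has not published fb */
		return NULL;
	}

	return fb;
}
```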
|
|
It's not sufficient to skip reading when the pos is beyond the EOF.
There may be data at the head of the page that we need to fill in
before the write.
Add a new helper function that corrects and clarifies the logic of
when we can skip reads, and have it only zero out the part of the page
that won't have data copied in for the write.
Finally, don't set the page Uptodate after zeroing. It's not up to date
since the write data won't have been copied in yet.
[DH made the following changes:
- Prefixed the new function with "netfs_".
- Don't call zero_user_segments() for a full-page write.
- Altered the beyond-last-page check to avoid a DIV instruction and got
rid of then-redundant zero-length file check.
]
Fixes: e1b1240c1ff5f ("netfs: Add write_begin helper")
Reported-by: Andrew W Elble <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]/
Link: https://lore.kernel.org/r/162367683365.460125.4467036947364047314.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162391826758.1173366.11794946719301590013.stgit@warthog.procyon.org.uk/ # v2
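The skip-read decision described above can be approximated as a pure function. This is a sketch modelled on the commit text, not the actual helper: `sketch_can_skip_read()` is a hypothetical name, a fixed 4096-byte page is assumed, and the zeroing of the uncopied parts of the page is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for this sketch */

static bool sketch_can_skip_read(long long pos, size_t len, long long i_size)
{
	long long offset = pos % SKETCH_PAGE_SIZE;	/* offset of the write in its page */

	/* Full-page write: every byte will be overwritten. */
	if (offset == 0 && len >= (size_t)SKETCH_PAGE_SIZE)
		return true;

	/* The whole page starts at or beyond EOF: there is no data to read. */
	if (pos - offset >= i_size)
		return true;

	/* Write starts at the head of the page and runs to EOF or beyond. */
	if (offset == 0 && pos + (long long)len >= i_size)
		return true;

	return false;
}
```

With a 100-byte file, a write at pos 200 is beyond EOF but its page still holds the first 100 bytes, so the function returns false and the read is not skipped.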
|
|
Fix afs_write_end() to correctly handle a short copy into the intended
write region of the page. Two things are necessary:
(1) If the page is not up to date, then we should just return 0
(ie. indicating a zero-length copy). The loop in
generic_perform_write() will go around again, possibly breaking up the
iterator into discrete chunks[1].
This is analogous to commit b9de313cf05fe08fa59efaf19756ec5283af672a
for ceph.
(2) The page should not have been set uptodate if it wasn't completely set
up by netfs_write_begin() (this will be fixed in the next patch), so
we need to set uptodate here in such a case.
Also remove the assertion that was checking that the page was set uptodate
since it's now set uptodate if it wasn't already a few lines above. The
assertion was from when uptodate was set elsewhere.
Changes:
v3: Remove the handling of len exceeding the end of the page.
Fixes: 3003bbd0697b ("afs: Use the netfs_write_begin() helper")
Reported-by: Jeff Layton <[email protected]>
Signed-off-by: David Howells <[email protected]>
Acked-by: Jeff Layton <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
cc: Al Viro <[email protected]>
cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]/ [1]
Link: https://lore.kernel.org/r/162367682522.460125.5652091227576721609.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162391825688.1173366.3437507255136307904.stgit@warthog.procyon.org.uk/ # v2
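Points (1) and (2) reduce to a small decision, sketched here with hypothetical names (`struct fake_page`, `fake_write_end()`); the real function also deals with dirtying, i_size and unlocking, which are omitted.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_page {
	bool uptodate;
};

/* Returns the number of bytes the caller may treat as written. */
static size_t fake_write_end(struct fake_page *page, size_t len, size_t copied)
{
	if (!page->uptodate) {
		/* (1) Short copy into a page with unknown contents: redo it. */
		if (copied < len)
			return 0;
		/* (2) The write filled the intended region, so mark it now. */
		page->uptodate = true;
	}
	return copied;
}
```

A short copy into a page that is already up to date is accepted as is, since the rest of the page holds valid data.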
|
|
A disabled/masked interrupt marked as wakeup source must be re-enabled
and unmasked in order to be able to wake up the host. That can be done
by flagging the irqchip with IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND.
Note: It 'sometimes' works without that change, but only thanks to the
lazy generic interrupt disabling (keeping interrupt unmasked).
Reported-by: Michal Koziel <[email protected]>
Signed-off-by: Loic Poulain <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
on SA8155p-adp board" from Bhupesh Sharma <[email protected]>:
Changes since v2:
-----------------
- v2 series can be found here: https://lore.kernel.org/linux-arm-msm/[email protected]/T/#m8303d27d561b30133992da88198abb78ea833e21
- Addressed review comments from Bjorn and Mark.
- As per suggestion from Bjorn, separated the patches in different
patchsets (specific to each subsystem) to ease review and patch application.
Changes since v1:
-----------------
- v1 series can be found here: https://lore.kernel.org/linux-arm-msm/[email protected]/T/#mc524fe82798d4c4fb75dd0333318955e0406ad18
- Addressed review comments from Bjorn and Vinod received on the v1
series.
This series adds the regulator support code for SA8155p-adp board
which is based on Qualcomm snapdragon sa8155p SoC which in turn is
similar to the sm8150 SoC.
This board supports a new PMIC PMM8155AU.
While at it, also make some cosmetic changes to the regulator driver
and dt-bindings to make sure the compatibles are alphabetical and also
fix issues with extra comma(s) at the end of terminator line(s).
Cc: Mark Brown <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Bhupesh Sharma (5):
dt-bindings: regulator: qcom,rpmh-regulator: Arrange compatibles
alphabetically
dt-bindings: regulator: qcom,rpmh-regulator: Add compatible for
SA8155p-adp board pmic
regulator: qcom-rpmh: Cleanup terminator line commas
regulator: qcom-rpmh: Add terminator at the end of pm7325x_vreg_data[]
array
regulator: qcom-rpmh: Add new regulator found on SA8155p adp board
.../regulator/qcom,rpmh-regulator.yaml | 17 ++---
drivers/regulator/qcom-rpmh-regulator.c | 62 +++++++++++++++----
2 files changed, 59 insertions(+), 20 deletions(-)
--
2.31.1
|
|
<[email protected]>:
Extend regulator notification support
This series extends the regulator notification and error flag support.
Initial discussion on the topic can be found here:
https://lore.kernel.org/lkml/[email protected]/
In a nutshell - the series adds:
1. WARNING level events/error flags. (Patch 3)
Current regulator 'ERROR' event notifications for over/under
voltage, over current and over temperature are used to indicate
condition where monitored entity is so badly "off" that it actually
indicates a hardware error which can not be recovered. The most
typical handling for that is believed to be a (graceful)
system-shutdown. Here we add set of 'WARNING' level flags to allow
sending notifications to consumers before things are 'that badly off'
so that consumer drivers can implement recovery-actions.
2. Device-tree properties for specifying limit values. (Patches 1, 5)
Add limits for above mentioned 'ERROR' and 'WARNING' levels (which
send notifications to consumers) and also for a 'PROTECTION' level
(which will be used to immediately shut-down the regulator(s) W/O
informing consumer drivers. Typically implemented by hardware).
Property parsing is implemented in regulator core which then calls
callback operations for limit setting from the IC drivers. A
warning is emitted if protection is requested by device tree but the
underlying IC does not support configuring requested protection.
3. Helpers which can be registered by IC. (Patch 4)
Target is to avoid implementing IRQ handling and IRQ storm protection
in each IC driver. (Many of the ICs implementing these IRQs do not allow
masking or acking the IRQ but keep the IRQ asserted for the whole
duration of problem keeping the processor in IRQ handling loop).
4. Emergency poweroff function (refactored out of the thermal_core to
kernel/reboot.c) which is called if IC fires error IRQs but IC reading
fails and given retry-count is exceeded. (Patches 2, 4)
Please note that the mutex in the emergency shutdown was replaced by a
simple atomic in order to allow call from any context.
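A run-once guard that needs no sleeping lock can be sketched with C11 atomics. This is only an illustration of the idea: the kernel uses its own atomic API, and `fake_emergency_poweroff()` and `poweroff_runs` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag poweroff_started = ATOMIC_FLAG_INIT;
static int poweroff_runs;	/* counts how often the body actually ran */

/*
 * Only the first caller proceeds. test-and-set never blocks, so the
 * guard itself places no restriction on the calling context.
 */
static bool fake_emergency_poweroff(void)
{
	if (atomic_flag_test_and_set(&poweroff_started))
		return false;	/* somebody else already started the shutdown */

	poweroff_runs++;	/* the real shutdown sequence would start here */
	return true;
}
```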
The helper was attempted to be done so it could be used to implement
roughly same logic as is used in qcom-labibb regulator. This means
amongst other things a safety shut-down if IC registers are not readable.
Using these shut-down retry counters are optional. The idea is that the
helper could be also used by simpler ICs which do not provide status
register(s) which can be used to check if error is still active.
ICs which do not have such status register can simply omit the 'renable'
callback (and retry-counts etc) - and helper assumes the situation is Ok
and re-enables IRQ after given time period. If problem persists the
handler is run again and another notification is sent - but at least the
delay allows processor to avoid IRQ loop.
Patch 7 takes this notification support in use at BD9576MUF.
Patch 8 is related to MFD change which is not really related to the RFC
here. It was added to this series in order to avoid potential conflicts.
Patch 9 adds a maintainers entry.
Changelog v10-RESEND:
- rebased on v5.13-rc4
Changelog v10:
- rebased on v5.13-rc2
- Move rdev_*() print macros to the internal.h and use rdev_dbg()
from irq_helpers.c
- Export rdev_get_name() and move it from coupler.h to driver.h for
others to use. (It was already in coupler.h but not exported -
usage was limited and coupler.h does not sound like optimal place
as rdev_name is not only used by coupled regulators)
- Send all regulator notifications from irq_helpers.c at one OR'd
event for the sake of simplicity. For BD9576 this does not matter
as it has own IRQ for each event case. Header defining events says
they may be OR'd.
- Change WARN() at protection shutdown to pr_emerg as suggested by
Petr.
Changelog v9:
- rebased on v5.13-rc1
- Update thermal documentation
- Fix regulator notification event number
Changelog v8:
- split shutdown API adding and thermal core taking it in use to
own patches.
- replace the spinlock with atomic when ensuring the emergency
shutdown is only called once.
Changelog v7:
general:
- rebased on v5.12-rc7
- new patch for refactoring the hw-failure reboot logic out of
thermal_core.c for others to use.
notification helpers:
- fix regulator error_flags query
- grammar/typos
- do not BUG() but attempt to shut-down the system
- use BITS_PER_TYPE()
Changelog v6:
Add MAINTAINERS entry
Changes to IRQ notifiers
- move devm functions to drivers/regulator/devres.c
- drop irq validity check
- use devm_add_action_or_reset()
- fix styling issues
- fix kerneldocs
Changelog v5:
- Fix the badly formatted pr_emerg() call.
Changelog v4:
- rebased on v5.12-rc6
- dropped RFC
- fix external FET DT-binding.
- improve prints for cases when expecting HW failure.
- styling and typos
Changelog v3:
Regulator core:
- Fix dangling pointer access at regulator_irq_helper()
stpmic1_regulator:
- fix function prototype (compile error)
bd9576-regulator:
- Update over current limits to what was given in new data-sheet
(REV00K)
- Allow over-current monitoring without external FET. Set limits to
values given in data-sheet (REV00K).
Changelog v2:
Generic:
- rebase on v5.12-rc2 + BD9576 series
- Split devm variant of delayed wq to own series
Regulator framework:
- Provide non devm variant of IRQ notification helpers
- shorten dt-property names as suggested by Rob
- unconditionally call map_event in IRQ handling and require it to be
populated
BD9576 regulators:
- change the FET resistance property to micro-ohms
- fix voltage computation in OC limit setting
|
|
ARM64_SWAPPER_USES_SECTION_MAPS implies that PMD level huge page mappings
are used for swapper, idmap and vmemmap. Let's make the PMD explicit, removing
any possible confusion with generic memory sections, and also make the name a
bit more generic, as it's applicable to idmap and vmemmap mappings as well.
Hence rename it to ARM64_KERNEL_USES_PMD_MAPS instead.
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Anshuman Khandual <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Calculate the max VMCS index for vmcs12 by walking the array to find the
actual max index. Hardcoding the index is prone to bitrot, and the
calculation is only done on KVM bringup (albeit on every CPU, but there
aren't _that_ many null entries in the array).
Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
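Walking a sparse table to find the highest populated slot looks roughly like the sketch below. It is simplified on purpose: `fake_max_populated_index()` is a hypothetical name, a zero entry is assumed to mean "unused", and the plain slot number is used, whereas the real code derives an index from the field encoding.

```c
#include <stddef.h>

/* Return the largest slot number whose entry is non-zero (0 if none). */
static unsigned int fake_max_populated_index(const unsigned short *offsets,
					     size_t n)
{
	unsigned int max_idx = 0;

	for (size_t i = 0; i < n; i++) {
		if (!offsets[i])
			continue;	/* unused slot in the sparse table */
		if (i > max_idx)
			max_idx = (unsigned int)i;
	}

	return max_idx;
}
```

Adding a new entry to the table then updates the result automatically, which is the point of not hardcoding it.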
|
|
As part of smaller maxphyaddr emulation, kvm needs to intercept
present page faults to see if it needs to add the RSVD flag (bit 3) to
the error code. However, there is no need to intercept page faults
that already have the RSVD flag set. When setting up the page fault
intercept, add the RSVD flag into the #PF error code mask field (but
not the #PF error code match field) to skip the intercept when the
RSVD flag is already set.
Signed-off-by: Jim Mattson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
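The mask/match filtering can be modelled as a predicate. The sketch assumes the usual VMX rule that a #PF is intercepted when the result of `(error_code & mask) == match` equals the #PF bit of the exception bitmap; `sketch_pf_intercepted()` is a hypothetical name, and the bit positions are the architectural present (bit 0) and RSVD (bit 3) flags.

```c
#include <stdbool.h>
#include <stdint.h>

#define PFERR_PRESENT_MASK	(1u << 0)
#define PFERR_RSVD_MASK		(1u << 3)

static bool sketch_pf_intercepted(uint32_t error_code, uint32_t mask,
				  uint32_t match, bool pf_bit_in_bitmap)
{
	return ((error_code & mask) == match) == pf_bit_in_bitmap;
}
```

With mask = PRESENT | RSVD and match = PRESENT, a present fault is intercepted only while RSVD is still clear; once RSVD is set the masked value no longer equals the match and the fault is delivered without an exit.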
|
|
Pull ARM fix from Russell King:
- fix gcc 10 compiler regression with cpu_init()
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9081/1: fix gcc-10 thumb2-kernel regression
|
|
On HP Pavilion Gaming Laptop 15-cx0xxx, the ECDT EC and DSDT EC share
the same port addresses but different GPEs. And the DSDT GPE is the
right one to use.
The current code duplicates DSDT EC with ECDT EC if the port addresses
are the same, and uses ECDT GPE as a result, which breaks this machine.
Introduce a new quirk for the HP laptop to trust the DSDT GPE,
and avoid duplicating even if the port addresses are the same.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209989
Reported-and-tested-by: Shao Fu, Chen <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Notice that the table field of struct acpi_table_events_work is never
read and its event field is always equal to ACPI_TABLE_EVENT_LOAD, so
both of them are redundant.
Accordingly, drop struct acpi_table_events_work and use struct
work_struct directly instead of it, simplify acpi_scan_table_handler()
and rename it to acpi_scan_table_notify().
Moreover, make acpi_bus_table_handler() check the event code against
ACPI_TABLE_EVENT_LOAD before calling acpi_scan_table_notify(), so it
is not necessary to do that check in the latter.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
|
|
While checking the master status of the DRM file in
drm_is_current_master(), the device's master mutex should be
held. Without the mutex, the pointer fpriv->master may be freed
concurrently by another process calling drm_setmaster_ioctl(). This
could lead to use-after-free errors when the pointer is subsequently
dereferenced in drm_lease_owner().
The callers of drm_is_current_master() from drm_auth.c hold the
device's master mutex, but external callers do not. Hence, we implement
drm_is_current_master_locked() to be used within drm_auth.c, and
modify drm_is_current_master() to grab the device's master mutex
before checking the master status.
Reported-by: Daniel Vetter <[email protected]>
Signed-off-by: Desmond Cheong Zhi Xi <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Because the __x86_indirect_alt* symbols are just that, objtool will
try and validate them as regular symbols, instead of the alternative
replacements that they are.
This goes sideways for FRAME_POINTER=y builds; which generate a fair
amount of warnings.
Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Split up the #VC handler code into a from-user and a from-kernel part.
This allows clean and correct state tracking, as the #VC handler needs
to enter NMI-state when raised from kernel mode and plain IRQ state when
raised from user-mode.
Fixes: 62441a1fb532 ("x86/sev-es: Correctly track IRQ states in runtime #VC handler")
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The #VC handler only cares about IRQs being disabled while the GHCB is
active, as it must not be interrupted by something which could cause
another #VC while it holds the GHCB (NMI is the exception for which the
backup GHCB exists).
Make sure nothing interrupts the code path while the GHCB is active
by making sure that callers of __sev_{get,put}_ghcb() have disabled
interrupts upfront.
[ bp: Massage commit message. ]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Function wait_current_trans_commit_start is now fairly trivial so it can
be inlined in its only caller.
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
There's only one caller left, btrfs_ioctl_start_sync, that passes 0, so we
can remove the switch in btrfs_commit_transaction_async.
A cleanup 9babda9f33fd ("btrfs: Remove async_transid from
btrfs_mksubvol/create_subvol/create_snapshot") removed calls that passed
1, so this is a followup.
As this removes last call of wait_current_trans_commit_start_and_unblock,
remove the function as well.
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
clang warns:
fs/btrfs/delayed-inode.c:684:6: warning: variable 'total_data_size' set
but not used [-Wunused-but-set-variable]
int total_data_size = 0, total_size = 0;
^
1 warning generated.
This variable's value has been unused since commit fc0d82e103c7 ("btrfs:
sink total_data parameter in setup_items_for_insert"). Eliminate it.
Link: https://github.com/ClangBuiltLinux/linux/issues/1391
Reviewed-by: Nikolay Borisov <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
By way of inverting the list_empty conditional the insert label can be
eliminated, making the function's flow entirely linear.
Signed-off-by: Nikolay Borisov <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
[BUG]
There is a very rare ASSERT() triggering during full fstests run for
subpage rw support.
No other reproducer so far.
The ASSERT() gets triggered for metadata read in
btrfs_page_set_uptodate() inside end_page_read().
[CAUSE]
There is still a small race window for metadata only, the race could
happen like this:
T1 | T2
------------------------------------+-----------------------------
end_bio_extent_readpage() |
|- btrfs_validate_metadata_buffer() |
| |- free_extent_buffer() |
| Still have 2 refs |
|- end_page_read() |
|- if (unlikely(PagePrivate()) |
| The page still has Private |
| | free_extent_buffer()
| | | Only one ref 1, will be
| | | released
| | |- detach_extent_buffer_page()
| | |- btrfs_detach_subpage()
|- btrfs_set_page_uptodate() |
The page no longer has Private|
>>> ASSERT() triggered <<< |
This race window is super small, thus pretty hard to hit, even with so
many runs of fstests.
But the race window is still there, we have to go another way to solve
it other than relying on random PagePrivate() check.
Data path is not affected, as it will lock the page before reading,
while unlocking the page after the last read has finished, thus no race
window.
[FIX]
This patch will fix the bug by repurposing btrfs_subpage::readers.
Now btrfs_subpage::readers will be a member shared by both metadata and
data.
For metadata path, we don't do the page unlock as metadata only relies
on extent locking.
At the same time, teach page_range_has_eb() to take
btrfs_subpage::readers into consideration.
So that even if the last eb of a page gets freed, page::private won't be
detached as long as there still are pending end_page_read() calls.
By this we eliminate the race window. This will slightly increase the
metadata memory usage, as the page may not be released as frequently as
usual. But it should not be a big deal.
The code got introduced in ("btrfs: submit read time repair only for
each corrupted sector"), but the fix is in a separate patch to keep the
problem description and the crash is rare so it should not hurt
bisectability.
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
[BUG]
With current btrfs subpage rw support, the following script can lead to
fs hang:
$ mkfs.btrfs -f -s 4k $dev
$ mount $dev -o nospace_cache $mnt
$ fsstress -w -n 100 -p 1 -s 1608140256 -v -d $mnt
The fs will hang at btrfs_start_ordered_extent().
[CAUSE]
In the above test case, btrfs_invalidatepage() will be called with the
following parameters:
offset = 0 length = 53248 page dirty = 1 subpage dirty bitmap = 0x2000
Since @offset is 0, btrfs_invalidatepage() will try to invalidate the full
page, and finally call clear_page_extent_mapped() which will detach
subpage structure from the page.
And since the page no longer has subpage structure, the subpage dirty
bitmap will be cleared, preventing the dirty range from being written
back, thus no way to wake up the ordered extent.
[FIX]
Follow other filesystems and only invalidate the page if the range
covers the full page.
There are cases like truncate_setsize() which can call
btrfs_invalidatepage() with offset == 0 and length != 0 for the last
page of an inode.
Although the old code will still try to invalidate the full page, we are
still safe to just wait for ordered extent to finish.
So it shouldn't cause extra problems.
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
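The condition in [FIX] amounts to a one-line predicate, shown here as a sketch; `sketch_covers_full_page()` is a hypothetical name and the page size is passed in so the 64K case from [CAUSE] can be tried.

```c
#include <stdbool.h>
#include <stddef.h>

/* True only when the invalidated range spans the entire page. */
static bool sketch_covers_full_page(size_t offset, size_t length,
				    size_t page_size)
{
	return offset == 0 && length == page_size;
}
```

For the parameters quoted above (offset 0, length 53248) on a 64K page the predicate is false, so the subpage structure and its dirty bitmap stay attached.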
|
|
[BUG]
With current subpage RW support, the following script can hang the fs
with 64K page size.
# mkfs.btrfs -f -s 4k $dev
# mount $dev -o nospace_cache $mnt
# fsstress -w -n 50 -p 1 -s 1607749395 -d $mnt
The kernel will do an infinite loop in btrfs_punch_hole_lock_range().
[CAUSE]
In btrfs_punch_hole_lock_range() we:
- Truncate page cache range
- Lock extent io tree
- Wait for any ordered extents in the range.
We only exit the loop once all the following conditions are met:
- No ordered extent in the lock range
- No page is in the lock range
The latter condition has a pitfall: it only works for the sector size ==
PAGE_SIZE case.
It can't handle the following subpage case:
0       32K     64K     96K     128K
|       |///////||//////|       ||
        lockstart=32K
        lockend=96K - 1
In this case, although the range crosses 2 pages,
truncate_pagecache_range() will invalidate no page at all, but only zero
the [32K, 96K) range of the two pages.
Thus filemap_range_has_page(32K, 96K-1) will always return true, thus we
will never meet the loop exit condition.
[FIX]
Fix the problem by doing page alignment for the lock range.
Function filemap_range_has_page() already handles the lend < lstart
case, so we only need to round up @lockstart and round down @lockend for
truncate_pagecache_range().
This modification should not change anything for the sector size ==
PAGE_SIZE case, as in that case our range is already page aligned.
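The page alignment described above can be sketched as follows. This is a simplified model under stated assumptions: struct page_range, page_align_lock_range() and page_range_is_empty() are hypothetical names, the page size is a parameter instead of PAGE_SIZE, and signed offsets are used so that an aligned end of -1 compares as less than the start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: byte range with an inclusive end. */
struct page_range {
    int64_t start;
    int64_t end;
};

/*
 * Shrink [lockstart, lockend] to the pages it fully covers:
 * round the start up and the exclusive end down to a page boundary.
 */
static struct page_range page_align_lock_range(int64_t lockstart,
                                               int64_t lockend,
                                               int64_t page_size)
{
    struct page_range r;

    /* page_size must be a power of two for the mask arithmetic. */
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
    assert(lockstart >= 0 && lockend >= lockstart);

    r.start = (lockstart + page_size - 1) & ~(page_size - 1);
    r.end = ((lockend + 1) & ~(page_size - 1)) - 1;
    return r;
}

/* An aligned range with end < start contains no full page. */
static bool page_range_is_empty(struct page_range r)
{
    return r.end < r.start;
}
```

For the case in the diagram (lockstart = 32K, lockend = 96K - 1, 64K pages) the aligned range is [64K, 64K - 1], which is empty, so a page-presence check over it no longer blocks the loop exit.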
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
The modifications are:
- Page copy destination
For subpage case, one page can contain multiple sectors, thus we can
no longer expect the memcpy_to_page()/btrfs_decompress() to copy
data into page offset 0.
The correct offset is offset_in_page(file_offset) now, which should
handle both regular sectorsize and subpage cases well.
- Page status update
Now we need to use subpage helper to handle the page status update.
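The copy destination offset from the first item can be illustrated with a small model. offset_in_page_sketch() is a hypothetical stand-in that takes the page size as a parameter; the kernel's offset_in_page() macro uses the compile-time PAGE_SIZE instead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of offset_in_page(): the byte offset of
 * file_offset inside the page that contains it.
 */
static uint32_t offset_in_page_sketch(uint64_t file_offset,
                                      uint32_t page_size)
{
    /* page_size must be a power of two for the mask to be valid. */
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);

    return (uint32_t)(file_offset & (page_size - 1));
}
```

For a sector at file offset 68K, the destination offset is 0 with 4K pages (sectorsize == PAGE_SIZE) but 4K with 64K pages, which is why copying to page offset 0 is no longer correct for subpage.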
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Only set_page_dirty() and SetPageUptodate() are not subpage compatible.
Convert them to subpage helpers, so that __extent_writepage_io() can
submit page content correctly.
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
btrfs_truncate_block() itself is already mostly subpage compatible, the
only missing part is the page dirtying code.
Currently if we have a sector that needs to be truncated, we set the
sector aligned range delalloc, then set the full page dirty.
The problem is, current subpage code requires the subpage dirty bit to
be set, or __extent_writepage_io() won't submit a bio, so the ordered
extent never finishes.
So this patch makes btrfs_truncate_block() call the
btrfs_page_set_dirty() helper instead of set_page_dirty() to fix the
problem.
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
__extent_writepage_io() function originally just iterates through all
the extent maps of a page, and submits any regular extents.
This is fine for sectorsize == PAGE_SIZE case, as if a page is dirty, we
need to submit the only sector contained in the page.
But for subpage case, one dirty page can contain several clean sectors
with at least one dirty sector.
If __extent_writepage_io() still submits all regular extent maps, it can
submit data which is already written to disk.
And since such already written data won't have corresponding ordered
extents, it will trigger a BUG_ON() in btrfs_csum_one_bio().
Change the behavior of __extent_writepage_io() by finding the first
dirty byte in the page, and only submit the dirty range rather than the
full extent.
While here, also modify the following calls to be subpage
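Finding the first dirty range can be sketched with a per-page sector bitmap. This is an illustration only: struct dirty_range and find_first_dirty_range() are hypothetical names, and the 16-bit bitmap assumes 16 sectors per page (for example 64K pages with 4K sectors).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: dirty byte range inside one page. */
struct dirty_range {
    uint32_t start; /* byte offset of the first dirty sector */
    uint32_t len;   /* length of the contiguous dirty run */
};

/*
 * Find the first run of set bits in dirty_bitmap (bit N is sector N)
 * and convert it to a byte range. Returns false if nothing is dirty.
 */
static bool find_first_dirty_range(uint16_t dirty_bitmap,
                                   uint32_t sectorsize,
                                   struct dirty_range *out)
{
    unsigned int first = 0;
    unsigned int last;

    assert(sectorsize > 0);

    /* Skip clean sectors at the start of the page. */
    while (first < 16 && !(dirty_bitmap & (1u << first)))
        first++;
    if (first == 16)
        return false;

    /* Extend the run while sectors stay dirty. */
    last = first;
    while (last < 16 && (dirty_bitmap & (1u << last)))
        last++;

    out->start = first * sectorsize;
    out->len = (last - first) * sectorsize;
    return true;
}
```

With a dirty bitmap of 0x2000 and 4K sectors, only the range starting at byte 53248 with length 4096 would be submitted, and the clean sectors in the same page are left alone.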
compatible:
- SetPageError()
- end_page_writeback()
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Function btrfs_set_range_writeback() currently just sets the page
writeback unconditionally.
Change it to call the subpage helper so that we can handle both cases
well.
Since the subpage helpers need btrfs_fs_info, also change the parameter
to accept btrfs_inode.
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
__process_pages_contig()
In cow_file_range(), after we have succeeded creating an inline extent,
we unlock the page with extent_clear_unlock_delalloc() by passing
locked_page == NULL.
For the sectorsize == PAGE_SIZE case, this just makes the page lock and
unlock pairing harder to follow.
But for the incoming subpage case, it can be a big problem.
For the incoming subpage case, page locking has two entry points:
- __process_pages_contig()
In that case, we know exactly the range we want to lock (which only
requires sector alignment).
To handle the subpage requirement, we introduce btrfs_subpage::writers
to page::private, and will update it in __process_pages_contig().
- Other direct lock_page()/unlock_page() call sites
Those won't touch btrfs_subpage::writers at all.
This means a page locked by __process_pages_contig() can only be unlocked
by __process_pages_contig().
Thankfully we already have the existing infrastructure in the form of
@locked_page in various call sites.
Unfortunately, extent_clear_unlock_delalloc() in cow_file_range() after
creating an inline extent is the exception.
It intentionally calls extent_clear_unlock_delalloc() with locked_page ==
NULL, to also unlock the current page (and clear its dirty/writeback bits).
To co-operate with incoming subpage modifications, and make the page
lock/unlock pair easier to understand, this patch will still call
extent_clear_unlock_delalloc() with locked_page, and only unlock the
page in __extent_writepage().
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
When __process_pages_contig() gets called for
extent_clear_unlock_delalloc(), if we hit the locked page, only Private2
bit is updated, but dirty/writeback/error bits are all skipped.
There are several call sites that call extent_clear_unlock_delalloc()
with locked_page and PAGE_CLEAR_DIRTY/PAGE_SET_WRITEBACK/PAGE_END_WRITEBACK:
- cow_file_range()
- run_delalloc_nocow()
- cow_file_range_async()
All of them are in error handling branches.
For those call sites, since we skip the locked page for
dirty/error/writeback bit update, the locked page will still have its
subpage dirty bit remaining.
Normally it's the call sites which locked the page to handle the locked
page, but it won't hurt if we also do the update.
Especially since there are already other call sites doing the same thing
by manually passing NULL as locked_page.
Tested-by: Ritesh Harjani <[email protected]> # [ppc64]
Tested-by: Anand Jain <[email protected]> # [aarch64]
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|