Age | Commit message (Collapse) | Author | Files | Lines |
|
Other parts of the kernel may use LPS0 constraints information to make
decisions on what power state to put a device into.
Signed-off-by: Mario Limonciello <[email protected]>
[ rjw: Rewrite kerneldoc, rearrange if () statement, edit subject and
changelog ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
We have one existing and one coming user of this macro.
Introduce a helper.
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
While parsing the constraints show all the entries for the table
to aid with debugging other problems later.
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
The constraints table should be resetting the `list` object
after running through all of `info_obj` iterations.
This adjusts whitespace as well as less code will now be included
with each loop. This fixes a functional problem is fixed where a
badly formed package in the inner loop may have incorrect data.
Fixes: 146f1ed852a8 ("ACPI: PM: s2idle: Add AMD support to handle _DSM")
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
If a badly constructed firmware includes multiple `ACPI_TYPE_PACKAGE`
objects while evaluating the AMD LPS0 _DSM, there will be a memory
leak. Explicitly guard against this.
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
When code uses a pre-increment it makes the reader question "why".
In the constraint fetching code there is no reason for the variables
to be pre-incremented so adjust to post-increment.
No intended functional changes.
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
The `#ifdef` for acpi_register_lps0_dev() currently is guarded against
`CONFIG_X86`, but actually the functions contained in the block are
specifically sleep related functions.
Adjust the guard to also check for `CONFIG_SUSPEND`.
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
The Amlogic-C3 SoCs support 12 GPIO IRQ lines compared with previous
serial chips and have something different, details are as below.
IRQ Number:
- 54 1 pins on bank TESTN
- 53:40 14 pins on bank X
- 39:33 7 pins on bank D
- 32:27 6 pins on bank A
- 26:22 5 pins on bank E
- 21:15 7 pins on bank C
- 14:0 15 pins on bank B
Signed-off-by: Huqiang Qin <[email protected]>
Acked-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Update dt-binding document for GPIO interrupt controller of Amlogic-C3 SoCs
Signed-off-by: Huqiang Qin <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Acked-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yangtao Li <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yangtao Li <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
After commit 16988c742968 ("of/address: introduce of_address_count() helper"),
Use of_address_count() to instead of open-coding it, it's no functional change.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
platform_get_irq()
It is not possible for platform_get_irq() to return 0. Use the
return value from platform_get_irq().
Signed-off-by: Ruan Jinjie <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
It is not possible for platform_get_irq() to return 0. Use the
return value from platform_get_irq().
Signed-off-by: Ruan Jinjie <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
i8259_of_init() is only used as an initcall and does not need to be global,
so mark it static to avoid:
drivers/irqchip/irq-i8259.c:343:12: warning: no previous prototype for 'i8259_of_init' [-Wmissing-prototypes]
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This function is only used locally and should be static to avoid a warning:
drivers/irqchip/irq-mips-gic.c:560:6: error: no previous prototype for 'gic_irq_domain_free' [-Werror=missing-prototypes]
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The declaration for this function is not included, which leads to a harmless warning:
drivers/irqchip/irq-xtensa-pic.c:91:12: error: no previous prototype for 'xtensa_pic_init_legacy' [-Werror=missing-prototypes]
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Max Filippov <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Return value of function eiointc_index is int, however it is converted
into uint32_t and then compared smaller than zero, this will cause logic
problem.
Fixes: dd281e1a1a93 ("irqchip: Add Loongson Extended I/O interrupt controller support")
Signed-off-by: Bibo Mao <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
During stress test with attaching and detaching VF from KVM and
simultaneously changing VFs spoofcheck and trust there was a
NULL pointer dereference in ice_reset_vf that VF's VSI is null.
More than one instance of ice_reset_vf() can be running at a given
time. When we rebuild the VSI in ice_reset_vf, another reset can be
triaged from ice_service_task. In this case we can access the currently
uninitialized VSI and cause panic. The window for this racing condition
has been around for a long time but it's much worse after commit
227bf4500aaa ("ice: move VSI delete outside deconfig") because
the reset runs faster. ice_reset_vf() using vf->cfg_lock and when
we move this lock before accessing to the VF VSI, we can fix
BUG for all cases.
Panic occurs sometimes in ice_vsi_is_rx_queue_active() and sometimes
in ice_vsi_stop_all_rx_rings()
With our reproducer, we can hit BUG:
~8h before commit 227bf4500aaa ("ice: move VSI delete outside deconfig").
~20m after commit 227bf4500aaa ("ice: move VSI delete outside deconfig").
After this fix we are not able to reproduce it after ~48h
There was commit cf90b74341ee ("ice: Fix call trace with null VSI during
VF reset") which also tried to fix this issue, but it was only
partially resolved and the bug still exists.
[ 6420.658415] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 6420.665382] #PF: supervisor read access in kernel mode
[ 6420.670521] #PF: error_code(0x0000) - not-present page
[ 6420.675659] PGD 0
[ 6420.677679] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 6420.682038] CPU: 53 PID: 326472 Comm: kworker/53:0 Kdump: loaded Not tainted 5.14.0-317.el9.x86_64 #1
[ 6420.691250] Hardware name: Dell Inc. PowerEdge R750/04V528, BIOS 1.6.5 04/15/2022
[ 6420.698729] Workqueue: ice ice_service_task [ice]
[ 6420.703462] RIP: 0010:ice_vsi_is_rx_queue_active+0x2d/0x60 [ice]
[ 6420.705860] ice 0000:ca:00.0: VF 0 is now untrusted
[ 6420.709494] Code: 00 00 66 83 bf 76 04 00 00 00 48 8b 77 10 74 3e 31 c0 eb 0f 0f b7 97 76 04 00 00 48 83 c0 01 39 c2 7e 2b 48 8b 97 68 04 00 00 <0f> b7 0c 42 48 8b 96 20 13 00 00 48 8d 94 8a 00 00 12 00 8b 12 83
[ 6420.714426] ice 0000:ca:00.0 ens7f0: Setting MAC 22:22:22:22:22:00 on VF 0. VF driver will be reinitialized
[ 6420.733120] RSP: 0018:ff778d2ff383fdd8 EFLAGS: 00010246
[ 6420.733123] RAX: 0000000000000000 RBX: ff2acf1916294000 RCX: 0000000000000000
[ 6420.733125] RDX: 0000000000000000 RSI: ff2acf1f2c6401a0 RDI: ff2acf1a27301828
[ 6420.762346] RBP: ff2acf1a27301828 R08: 0000000000000010 R09: 0000000000001000
[ 6420.769476] R10: ff2acf1916286000 R11: 00000000019eba3f R12: ff2acf19066460d0
[ 6420.776611] R13: ff2acf1f2c6401a0 R14: ff2acf1f2c6401a0 R15: 00000000ffffffff
[ 6420.783742] FS: 0000000000000000(0000) GS:ff2acf28ffa80000(0000) knlGS:0000000000000000
[ 6420.791829] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6420.797575] CR2: 0000000000000000 CR3: 00000016ad410003 CR4: 0000000000773ee0
[ 6420.804708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6420.811034] vfio-pci 0000:ca:01.0: enabling device (0000 -> 0002)
[ 6420.811840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 6420.811841] PKRU: 55555554
[ 6420.811842] Call Trace:
[ 6420.811843] <TASK>
[ 6420.811844] ice_reset_vf+0x9a/0x450 [ice]
[ 6420.811876] ice_process_vflr_event+0x8f/0xc0 [ice]
[ 6420.841343] ice_service_task+0x23b/0x600 [ice]
[ 6420.845884] ? __schedule+0x212/0x550
[ 6420.849550] process_one_work+0x1e2/0x3b0
[ 6420.853563] ? rescuer_thread+0x390/0x390
[ 6420.857577] worker_thread+0x50/0x3a0
[ 6420.861242] ? rescuer_thread+0x390/0x390
[ 6420.865253] kthread+0xdd/0x100
[ 6420.868400] ? kthread_complete_and_exit+0x20/0x20
[ 6420.873194] ret_from_fork+0x1f/0x30
[ 6420.876774] </TASK>
[ 6420.878967] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iavf vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables bridge stp llc sctp ip6_udp_tunnel udp_tunnel nfp tls nfnetlink bluetooth mlx4_en mlx4_core rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma kvm_intel i40e kvm iTCO_wdt dcdbas ib_uverbs irqbypass iTCO_vendor_support mgag200 mei_me ib_core dell_smbios isst_if_mmio isst_if_mbox_pci rapl i2c_algo_bit drm_shmem_helper intel_cstate drm_kms_helper syscopyarea sysfillrect isst_if_common sysimgblt intel_uncore fb_sys_fops dell_wmi_descriptor wmi_bmof intel_vsec mei i2c_i801 acpi_ipmi ipmi_si i2c_smbus ipmi_devintf intel_pch_thermal acpi_power_meter pcspk
r
Fixes: efe41860008e ("ice: Fix memory corruption in VF driver")
Fixes: f23df5220d2b ("ice: Fix spurious interrupt during removal of trusted VF")
Signed-off-by: Petr Oros <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
This reverts commit 7255355a0636b4eff08d5e8139c77d98f151c4fc.
After this commit we are not able to attach VF to VM:
virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52
error: Failed to attach interface
error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable
ice_check_vf_ready_for_cfg() already contain waiting for reset.
New condition in ice_check_vf_ready_for_reset() causing only problems.
Fixes: 7255355a0636 ("ice: Fix ice VF reset during iavf initialization")
Signed-off-by: Petr Oros <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver is misconfiguring the hardware for some values of MTU such that
it could use multiple descriptors to receive a packet when it could have
simply used one.
Change the driver to use a round-up instead of the result of a shift, as
the shift can truncate the lower bits of the size, and result in the
problem noted above. It also aligns this driver with similar code in i40e.
The insidiousness of this problem is that everything works with the wrong
size, it's just not working as well as it could, as some MTU sizes end up
using two or more descriptors, and there is no way to tell that is
happening without looking at ice_trace or a bus analyzer.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Recent rework moved block device closing out of sb->put_super() and into
sb->kill_sb() to avoid deadlocks as s_umount is held in put_super() and
blkdev_put() can end up taking s_umount again.
That means we need to move the removal of the superblock from @fs_supers
out of generic_shutdown_super() and into deactivate_locked_super() to
ensure that concurrent mounters don't fail to open block devices that
are still in use because blkdev_put() in sb->kill_sb() hasn't been
called yet.
We can now do this as we can make iterators through @fs_super and
@super_blocks wait without holding s_umount. Concurrent mounts will wait
until a dying superblock is fully dead so until sb->kill_sb() has been
called and SB_DEAD been set. Concurrent iterators can already discard
any SB_DYING superblock.
Reviewed-by: Jan Kara <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
Recent patches experiment with making it possible to allocate a new
superblock before opening the relevant block device. Naturally this has
intricate side-effects that we get to learn about while developing this.
Superblock allocators such as sget{_fc}() return with s_umount of the
new superblock held and lock ordering currently requires that block
level locks such as bdev_lock and open_mutex rank above s_umount.
Before aca740cecbe5 ("fs: open block device after superblock creation")
ordering was guaranteed to be correct as block devices were opened prior
to superblock allocation and thus s_umount wasn't held. But now s_umount
must be dropped before opening block devices to avoid locking
violations.
This has consequences. The main one being that iterators over
@super_blocks and @fs_supers that grab a temporary reference to the
superblock can now also grab s_umount before the caller has managed to
open block devices and called fill_super(). So whereas before such
iterators or concurrent mounts would have simply slept on s_umount until
SB_BORN was set or the superblock was discard due to initalization
failure they can now needlessly spin through sget{_fc}().
If the caller is sleeping on bdev_lock or open_mutex one caller waiting
on SB_BORN will always spin somewhere and potentially this can go on for
quite a while.
It should be possible to drop s_umount while allowing iterators to wait
on a nascent superblock to either be born or discarded. This patch
implements a wait_var_event() mechanism allowing iterators to sleep
until they are woken when the superblock is born or discarded.
This also allows us to avoid relooping through @fs_supers and
@super_blocks if a superblock isn't yet born or dying.
Link: aca740cecbe5 ("fs: open block device after superblock creation")
Reviewed-by: Jan Kara <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
efi_queue_work() is a macro that implements the non-trivial manipulation
of the EFI runtime workqueue and completion data structure, most of
which is generic, and could be shared between all the users of the
macro. So move it out of the macro and into a new helper function.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The current code that marshalls the EFI runtime call arguments to hand
them off to a async helper does so in a type unsafe and slightly messy
manner - everything is cast to void* except for some integral types that
are passed by reference and dereferenced on the receiver end.
Let's clean this up a bit, and record the arguments of each runtime
service invocation exactly as they are issued, in a manner that permits
the compiler to check the types of the arguments at both ends.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Only the arch_efi_call_virt() macro that some architectures override
needs to be a macro, given that it is variadic and encapsulates calls
via function pointers that have different prototypes.
The associated setup and teardown code are not special in this regard,
and don't need to be instantiated at each call site. So turn them into
ordinary C functions and move them out of line.
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Only the arch_efi_call_virt() macro that some architectures override
needs to be a macro, given that it is variadic and encapsulates calls
via function pointers that have different prototypes.
The associated setup and teardown code are not special in this regard,
and don't need to be instantiated at each call site. So turn them into
ordinary C functions and move them out of line.
Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Use helpers instead of the open coded dance to silence lockdep warnings.
Suggested-by: Jan Kara <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
Use helpers instead of the open coded dance to silence lockdep warnings.
Suggested-by: Jan Kara <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
Use helpers instead of the open coded dance to silence lockdep warnings.
Suggested-by: Jan Kara <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
Use helpers instead of the open coded dance to silence lockdep warnings.
Suggested-by: Jan Kara <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
aio, io_uring, cachefiles and overlayfs, all open code an ugly variant
of file_{start,end}_write() to silence lockdep warnings.
Create helpers for this lockdep dance so we can use the helpers in all
the callers.
Suggested-by: Jan Kara <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
and use sb_end_write() instead of open coded version.
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
This helper does not take a kiocb as input and we want to create a
common helper by that name that takes a kiocb as input.
Signed-off-by: Amir Goldstein <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
Convert buf->page to a folio once instead of five times. There's only
one uptodate bit per folio, not per page, so we lose nothing here.
Signed-off-by: "Matthew Wilcox (Oracle)" <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
Remove a number of implicit calls to compound_head() and various calls
to compatibility functions. This is not sufficient to enable support
for large folios; generic_perform_write() must be converted first.
Signed-off-by: "Matthew Wilcox (Oracle)" <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
There is race issue when concurrently splice_read main trace_pipe and
per_cpu trace_pipes which will result in data read out being different
from what actually writen.
As suggested by Steven:
> I believe we should add a ref count to trace_pipe and the per_cpu
> trace_pipes, where if they are opened, nothing else can read it.
>
> Opening trace_pipe locks all per_cpu ref counts, if any of them are
> open, then the trace_pipe open will fail (and releases any ref counts
> it had taken).
>
> Opening a per_cpu trace_pipe will up the ref count for just that
> CPU buffer. This will allow multiple tasks to read different per_cpu
> trace_pipe files, but will prevent the main trace_pipe file from
> being opened.
But because we only need to know whether per_cpu trace_pipe is open or
not, using a cpumask instead of using ref count may be easier.
After this patch, users will find that:
- Main trace_pipe can be opened by only one user, and if it is
opened, all per_cpu trace_pipes cannot be opened;
- Per_cpu trace_pipes can be opened by multiple users, but each per_cpu
trace_pipe can only be opened by one user. And if one of them is
opened, main trace_pipe cannot be opened.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Suggested-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Zheng Yejian <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Encountered on an ARM Mali-T760 MP4, attempting to read the nvmem
variable can also return EOPNOTSUPP instead of ENOENT when speed
binning is unsupported.
Cc: <[email protected]>
Fixes: 7d690f936e9b ("drm/panfrost: Add basic support for speed binning")
Signed-off-by: David Michael <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
While originally it was fine to format strings using "%pOF" while
holding devtree_lock, this now causes a deadlock. Lockdep reports:
of_get_parent from of_fwnode_get_parent+0x18/0x24
^^^^^^^^^^^^^
of_fwnode_get_parent from fwnode_count_parents+0xc/0x28
fwnode_count_parents from fwnode_full_name_string+0x18/0xac
fwnode_full_name_string from device_node_string+0x1a0/0x404
device_node_string from pointer+0x3c0/0x534
pointer from vsnprintf+0x248/0x36c
vsnprintf from vprintk_store+0x130/0x3b4
Fix this by moving the printing in __of_changeset_entry_apply() outside
the lock. As the only difference in the multiple prints is the action
name, use the existing "action_names" to refactor the prints into a
single print.
Fixes: a92eb7621b9fb2c2 ("lib/vsprintf: Make use of fwnode API to obtain node names and separators")
Cc: [email protected]
Reported-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
|
|
Commit 12e17243d8a1 ("of: base: improve error msg in
of_phandle_iterator_next()") added printing of the phandle value on
error, but failed to update the unittest.
Fixes: 12e17243d8a1 ("of: base: improve error msg in of_phandle_iterator_next()")
Cc: [email protected]
Reviewed-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
|
|
Add parameter descriptions to struct kunit_attr header for the
parameters attr_default and print.
Fixes: 39e92cb1e4a1 ("kunit: Add test attributes API structure")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Rae Moar <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Use colon to separate parameter name from their specific meaning.
silence the warning:
drivers/xen/grant-table.c:1051: warning: Function parameter or member 'nr_pages' not described in 'gnttab_free_pages'
Reported-by: Abaci Robot <[email protected]>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6030
Signed-off-by: Yang Li <[email protected]>
Acked-by: Juergen Gross <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
|
|
This reproduces the bug fixed by "btrfs: fix incorrect splitting in
btrfs_drop_extent_map_range", we were improperly calculating the range
for the split extent. Add a test that exercises this scenario and
validates that we get the correct resulting extent_maps in our tree.
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This helper is different from the normal add_extent_mapping in that it
will stuff an em into a gap that exists between overlapping em's in the
tree. It appeared there was a bug so I wrote a self test to validate it
did the correct thing when it worked with two side by side ems.
Thankfully it is correct, but more testing is better.
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
While investigating weird problems with the extent_map I wrote a self
test testing the various edge cases of btrfs_drop_extent_map_range.
This can split in different ways and behaves different in each case, so
test the various edge cases to make sure everything is functioning
properly.
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
scrub_stripe_read_repair_worker()
Currently the scrub_stripe_read_repair_worker() only does reads to
rebuild the corrupted sectors, it doesn't do any writeback.
The design is mostly to put writeback into a more ordered manner, to
co-operate with dev-replace with zoned mode, which requires every write
to be submitted in their bytenr order.
However the writeback for repaired sectors into the original mirror
doesn't need such strong sync requirement, as it can only happen for
non-zoned devices.
This patch would move the writeback for repaired sectors into
scrub_stripe_read_repair_worker(), which removes two calls sites for
repaired sectors writeback. (one from flush_scrub_stripes(), one from
scrub_raid56_parity_stripe())
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
The workqueue fs_info->scrub_worker would go ordered workqueue if it's a
device replace operation.
However the scrub is relying on multiple workers to do data csum
verification, and we always submit several read requests in a row.
Thus there is no need to use ordered workqueue just for dev-replace.
We have extra synchronization (the main thread will always
submit-and-wait for dev-replace writes) to handle it for zoned devices.
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
[REGRESSION]
There are several regression reports about the scrub performance with
v6.4 kernel.
On a PCIe 3.0 device, the old v6.3 kernel can go 3GB/s scrub speed, but
v6.4 can only go 1GB/s, an obvious 66% performance drop.
[CAUSE]
Iostat shows a very different behavior between v6.3 and v6.4 kernel:
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 9731.00 3425544.00 17237.00 63.92 2.18 352.02 21.18 100.00
nvme0n1p3 15578.00 993616.00 5.00 0.03 0.09 63.78 1.32 100.00
The upper one is v6.3 while the lower one is v6.4.
There are several obvious differences:
- Very few read merges
This turns out to be a behavior change that we no longer do bio
plug/unplug.
- Very low aqu-sz
This is due to the submit-and-wait behavior of flush_scrub_stripes(),
and extra extent/csum tree search.
Both behaviors are not that obvious on SATA SSDs, as SATA SSDs have NCQ
to merge the reads, while SATA SSDs can not handle high queue depth well
either.
[FIX]
For now this patch focuses on the read speed fix. Dev-replace replace
speed needs more work.
For the read part, we go two directions to fix the problems:
- Re-introduce blk plug/unplug to merge read requests
This is pretty simple, and the behavior is pretty easy to observe.
This would enlarge the average read request size to 512K.
- Introduce multi-group reads and no longer wait for each group
Instead of the old behavior, which submits 8 stripes and waits for
them, here we would enlarge the total number of stripes to 16 * 8.
Which is 8M per device, the same limit as the old scrub in-flight
bios size limit.
Now every time we fill a group (8 stripes), we submit them and
continue to next stripes.
Only when the full 16 * 8 stripes are all filled, we submit the
remaining ones (the last group), and wait for all groups to finish.
Then submit the repair writes and dev-replace writes.
This should enlarge the queue depth.
This would greatly improve the merge rate (thus read block size) and
queue depth:
Before (with regression, and cached extent/csum path):
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 20666.00 1318240.00 10.00 0.05 0.08 63.79 1.63 100.00
After (with all patches applied):
nvme0n1p3 5165.00 2278304.00 30557.00 85.54 0.55 441.10 2.81 100.00
i.e. 1287 to 2224 MB/s.
CC: [email protected] # 6.4+
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
One of the bottleneck of the new scrub code is the extra csum tree
search.
The old code would only do the csum tree search for each scrub bio,
which can be as large as 512KiB, thus they can afford to allocate a new
path each time.
But the new scrub code is doing csum tree search for each stripe, which
is only 64KiB, this means we'd better re-use the same csum path during
each search.
This patch would introduce a per-sctx path for csum tree search, as we
don't need to re-allocate the path every time we need to do a csum tree
search.
With this change we can further improve the queue depth and improve the
scrub read performance:
Before (with regression and cached extent tree path):
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
nvme0n1p3 15875.00 1013328.00 12.00 0.08 0.08 63.83 1.35 100.00
After (with both cached extent/csum tree path):
nvme0n1p3 17759.00 1133280.00 10.00 0.06 0.08 63.81 1.50 100.00
Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure")
CC: [email protected] # 6.4+
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|