Age | Commit message (Collapse) | Author | Files | Lines |
|
Implement the RSS functionality and add the corresponding callbacks in
XGMAC core.
Changes from v1:
- Do not use magic constants (Jakub)
- Use ethtool_rxfh_indir_default() (Jakub)
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When GRO decides not to coalesce a packet, in napi_frags_finish(), instead
of passing it to the stack immediately, place it on a list in the napi
struct. Then, at flush time (napi_complete_done(), napi_poll(), or
napi_busy_loop()), call netif_receive_skb_list_internal() on the list.
We'd like to do that in napi_gro_flush(), but it's not called if
!napi->gro_bitmask, so we have to do it in the callers instead. (There are
a handful of drivers that call napi_gro_flush() themselves, but it's not
clear why, or whether this will affect them.)
Because a full 64 packets is an inefficiently large batch, also consume the
list whenever it exceeds gro_normal_batch, a new net/core sysctl that
defaults to 8.
Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Supported ports in ethtool <eth1> are displayed based on media type.
For media type fibre and twinaxial, port type is "FIBRE". Media type
Base-T is "TP" and media KR is "Backplane".
V1->V2:
Corrected the subject.
Signed-off-by: Rahul Verma <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
device_node_to_regmap() is exactly like syscon_node_to_regmap(), but it
does not check that the node is compatible with "syscon", and won't
attach the first clock it finds to the regmap.
The rationale behind this, is that one device node with a standard
compatible string "foo,bar" can be covered by multiple drivers sharing a
regmap, or by a single driver doing all the job without a regmap, but
these are implementation details which shouldn't reflect on the
devicetree.
Signed-off-by: Paul Cercueil <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Lee Jones <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Michael Turquette <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Mathieu Malaterre <[email protected]>
|
|
Convert pci_resource_to_user() to a weak function so the existing
architecture-specific implementations will automatically override the
generic one. This allows us to remove HAVE_ARCH_PCI_RESOURCE_TO_USER
definitions and avoid the conditional compilation for this single function.
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Denis Efremov <[email protected]>
[bhelgaas: squash into one commit]
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Paul Burton <[email protected]> # MIPS
|
|
The TLS progress params context WQE should not include an
Eth segment, drop it.
In addition, align the tls_progress_params layout with the
HW specification document:
- fix the tisn field name.
- remove the valid bit.
Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Fix the used constants for TLS TIS opmods, per the HW specification.
Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
According to Section 3.5 of the "Intel Low Power S0 Idle" document [1],
Function 5 of the LPS0 _DSM is expected to be invoked when the system
configuration matches the criteria for entering the target low-power
state of the platform. In particular, this means that all devices
should be suspended and in low-power states already when that function
is invoked.
This is not the case currently, however, because Function 5 of the
LPS0 _DSM is invoked by it before the "noirq" phase of device suspend,
which means that some devices may not have been put into low-power
states yet at that point. That is a consequence of the previous
design of the suspend-to-idle flow that allowed the "noirq" phase of
device suspend and the "noirq" phase of device resume to be carried
out for multiple times while "suspended" (if any spurious wakeup
events were detected) and the point of the LPS0 _DSM Function 5
invocation was chosen so as to call it (and LPS0 _DSM Function 6
analogously) once per suspend-resume cycle (regardless of how many
times the "noirq" phases of device suspend and resume were carried
out while "suspended").
Now that the suspend-to-idle flow has been redesigned to carry out
the "noirq" phases of device suspend and resume once in each cycle,
the code can be reordered to follow the specification that it is
based on more closely.
For this purpose, add ->prepare_late and ->restore_early platform
callbacks for suspend-to-idle, to be executed, respectively, after
the "noirq" phase of suspending devices and before the "noirq"
phase of resuming them and make ACPI use them for the invocation
of LPS0 _DSM functions as appropriate.
While at it, move the LPS0 entry requirements check to be made
before invoking Functions 3 and 5 of the LPS0 _DSM (also once
per cycle) as follows from the specification [1].
Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf # [1]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Tested-by: Kai-Heng Feng <[email protected]>
|
|
System firmware advertises the address of the 'Runtime
Configuration Interface table version 2 (RCI2)' via
an EFI Configuration Table entry. This code retrieves the RCI2
table from the address and exports it to sysfs as a binary
attribute 'rci2' under /sys/firmware/efi/tables directory.
The approach adopted is similar to the attribute 'DMI' under
/sys/firmware/dmi/tables.
RCI2 table contains BIOS HII in XML format and is used to populate
BIOS setup page in Dell EMC OpenManage Server Administrator tool.
The BIOS setup page contains BIOS tokens which can be configured.
Signed-off-by: Narendra K <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The SAL systab is an Itanium specific EFI configuration table, so
move its handling into arch/ia64 where it belongs.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The SGI UV UEFI machines are tightly coupled to the x86 architecture
so there is no need to keep any awareness of its existence in the
generic EFI layer, especially since we already have the infrastructure
to handle arch-specific configuration tables, and were even already
using it to some extent.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The function efi_is_table_address() and the associated array of table
pointers is specific to x86. Since we will be adding some more x86
specific tables, let's move this code out of the generic code first.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The patch moving bits into mutex.c was a little too much; by also
moving struct mutex_waiter a few less common CONFIGs would no longer
build.
Fixes: 5f35d5a66b3e ("locking/mutex: Make __mutex_owner static to mutex.c")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
|
|
The UWB and wusbcore code is long obsolete, so let us just move the code
out of the real part of the kernel and into the drivers/staging/
location with plans to remove it entirely in a few releases.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The following functions:
- clk_bulk_enable()
- clk_bulk_prepare()
- clk_bulk_disable()
- clk_bulk_unprepare()
already expect const clk_bulk_data * as a second parameter, however
their no-op version have mismatching prototypes that don't. Fix that.
While at it, constify the second argument of clk_bulk_prepare_enable()
and clk_bulk_disable_unprepare(), since the functions they are
comprised of already accept const clk_bulk_data *.
Signed-off-by: Andrey Smirnov <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Chris Healy <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
|
|
Reference counters are preferred to use refcount_t instead of
atomic_t.
This is because the implementation of refcount_t can prevent
overflows and detect possible use-after-free.
So convert atomic_t ref counters to refcount_t.
Signed-off-by: Chuhong Yuan <[email protected]>
Acked-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
All users pass PAGE_SIZE here, and if we wanted to support single entries
for huge pages we should really just add a HMM_FAULT_HUGEPAGE flag instead
that uses the huge page size instead of having the caller calculate that
size once, just for the hmm code to verify it.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The start, end and page_shift values are all saved in the range structure,
so we might as well use that for argument passing.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Booting a large arm64 server (HiSi D05) leads to the following
shouting at boot time:
[ 20.722132] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[ 20.730851] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[ 20.739560] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[ 20.748267] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[ 20.756975] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[ 20.765683] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[ 20.774391] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
and many more... Evidently, we expect something a bit more informative
than ____ptrval____, and certainly we want all of our domains, not just
the first one.
For that, turn the %p used to generate the fwnode name into something
that won't be repainted (%pa). Given that we've now fixed all users to
pass a pointer to a PA, it will actually do the right thing.
Acked-by: Thomas Gleixner <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
The function override_function_with_return() is defined separately for
each architecture and every architecture's definition is almost same
with each other. E.g. x86 and powerpc both define function in its own
asm/error-injection.h header and override_function_with_return() has
the same definition, the only difference is that x86 defines an extra
function just_return_func() but it is specific for x86 and is only used
by x86's override_function_with_return(), so don't need to export this
function.
This patch consolidates override_function_with_return() definition into
asm-generic/error-injection.h header, thus all architectures can use the
common definition. As result, the architecture specific headers are
removed; the include/linux/error-injection.h header also changes to
include asm-generic/error-injection.h header rather than architecture
header, furthermore, it includes linux/compiler.h for successful
compilation.
Reviewed-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Now that the driver core supports dev_groups for individual drivers,
expose that pointer to struct usb_device_driver to make it easier for USB
drivers to also use it.
Yes, users of usb_device_driver are much rare, but there are instances
already that use custom sysfs files, so adding this support will make
things easier for those drivers. usbip is one example, hubs might be
another one.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Now that the driver core supports dev_groups for individual drivers,
expose that pointer to struct usb_driver to make it easier for USB
drivers to also use it.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
MT7629 is an ARM platform SoC which has the same PCIe IP as MT7622.
The HW default value of its PCI host controller Device ID is invalid,
fix it to match the hardware implementation.
Signed-off-by: Jianjun Wang <[email protected]>
[[email protected]: commit log/minor spelling update]
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: Andrew Murray <[email protected]>
Acked-by: Ryder Lee <[email protected]>
|
|
Just minor overlapping changes in the conflicts here.
Signed-off-by: David S. Miller <[email protected]>
|
|
Pull networking fixes from David Miller:
"Yeah I should have sent a pull request last week, so there is a lot
more here than usual:
1) Fix memory leak in ebtables compat code, from Wenwen Wang.
2) Several kTLS bug fixes from Jakub Kicinski (circular close on
disconnect etc.)
3) Force slave speed check on link state recovery in bonding 802.3ad
mode, from Thomas Falcon.
4) Clear RX descriptor bits before assigning buffers to them in
stmmac, from Jose Abreu.
5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF
loops, from Nishka Dasgupta.
6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean.
7) Need to hold sock across skb->destructor invocation, from Cong
Wang.
8) IP header length needs to be validated in ipip tunnel xmit, from
Haishuang Yan.
9) Use after free in ip6 tunnel driver, also from Haishuang Yan.
10) Do not use MSI interrupts on r8169 chips before RTL8168d, from
Heiner Kallweit.
11) Upon bridge device init failure, we need to delete the local fdb.
From Nikolay Aleksandrov.
12) Handle erros from of_get_mac_address() properly in stmmac, from
Martin Blumenstingl.
13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef
Kadlecsik.
14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with
some devices, so revert. From Johannes Berg.
15) Fix deadlock in rxrpc, from David Howells.
16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue
Haibing.
17) Fix mvpp2 crash on module removal, from Matteo Croce.
18) Fix race in genphy_update_link, from Heiner Kallweit.
19) bpf_xdp_adjust_head() stopped working with generic XDP when we
fixes generic XDP to support stacked devices properly, fix from
Jesper Dangaard Brouer.
20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from
David Ahern.
21) Several memory leaks in new sja1105 driver, from Vladimir Oltean"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits)
net: dsa: sja1105: Fix memory leak on meta state machine error path
net: dsa: sja1105: Fix memory leak on meta state machine normal path
net: dsa: sja1105: Really fix panic on unregistering PTP clock
net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled
net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus()
net: sched: sample: allow accessing psample_group with rtnl
net: sched: police: allow accessing police->params with rtnl
net: hisilicon: Fix dma_map_single failed on arm64
net: hisilicon: fix hip04-xmit never return TX_BUSY
net: hisilicon: make hip04_tx_reclaim non-reentrant
tc-testing: updated vlan action tests with batch create/delete
net sched: update vlan action for batched events operations
net: stmmac: tc: Do not return a fragment entry
net: stmmac: Fix issues when number of Queues >= 4
net: stmmac: xgmac: Fix XGMAC selftests
be2net: disable bh with spin_lock in be_process_mcc
net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs
net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
...
|
|
Fixes: c8424e776b09 ("MODSIGN: Export module signature definitions")
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
|
|
Now that blk_rq_map_kern can map both kmem and vmem, move internal
metadata mapping down to the lower level driver.
Reviewed-by: Javier González <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Hans Holmberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Move the redundant sync handling interface and wait for a completion in
the lightnvm core instead.
Reviewed-by: Javier González <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Hans Holmberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
__mutex_owner() should only be used by the mutex api's.
So, to put this restiction let's move the __mutex_owner()
function definition from linux/mutex.h to mutex.c file.
There exist functions that uses __mutex_owner() like
mutex_is_locked() and mutex_trylock_recursive(), So
to keep legacy thing intact move them as well and
export them.
Move mutex_waiter structure also to keep it private to the
file.
Signed-off-by: Mukesh Ojha <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Currently rwsems is the only locking primitive that lacks this
debug feature. Add it under CONFIG_DEBUG_RWSEMS and do the magic
checking in the locking fastpath (trylock) operation such that
we cover all cases. The unlocking part is pretty straightforward.
Signed-off-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Waiman Long <[email protected]>
Cc: [email protected]
Cc: Davidlohr Bueso <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core into usb-next
dev_groups added to struct driver
Persistent tag for others to pull this branch from
This is the first patch in a longer series that adds the ability for the
driver core to create and remove a list of attribute groups
automatically when the device is bound/unbound from a specific driver.
See:
https://lore.kernel.org/r/[email protected]
for details on this patch, and examples of how to use it in other
drivers.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
IMA will need to verify a PKCS#7 signature which has already been parsed.
For this reason, factor out the code which does that from
verify_pkcs7_signature() into a new function which takes a struct
pkcs7_message instead of a data buffer.
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Reviewed-by: Mimi Zohar <[email protected]>
Cc: David Howells <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
|
|
IMA will use the module_signature format for append signatures, so export
the relevant definitions and factor out the code which verifies that the
appended signature trailer is valid.
Also, create a CONFIG_MODULE_SIG_FORMAT option so that IMA can select it
and be able to use mod_check_sig() without having to depend on either
CONFIG_MODULE_SIG or CONFIG_MODULES.
s390 duplicated the definition of struct module_signature so now they can
use the new <linux/module_signature.h> header instead.
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Acked-by: Jessica Yu <[email protected]>
Reviewed-by: Philipp Rudo <[email protected]>
Cc: Heiko Carstens <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
|
|
Add new attribute named "serial_number" as a standard interface for
user space to acquire the serial number of the device.
For ST-Ericsson SoCs this is exposed by the cryptically named "soc_id"
attribute, but this provides a human readable standardized name for this
property.
Tested-by: Vinod Koul <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Vaishali Thakkar <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Reviewed-by: Vinod Koul <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
|
|
There was no users left - so drop the code to support EARLY_EVENT_BLANK.
This patch removes the support in backlight,
and drop the notifier in fbmem.
That EARLY_EVENT_BLANK is not used can be verified that no driver set any of:
lcd_ops.early_set_power()
lcd_ops.r_early_set_power()
Noticed while browsing backlight code for other reasons.
v2:
- Fix changelog to say "EARLY_EVENT_BLANK" (Daniel)
Signed-off-by: Sam Ravnborg <[email protected]>
Cc: Lee Jones <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: Jingoo Han <[email protected]>
Cc: Bartlomiej Zolnierkiewicz <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: "Michał Mirosław" <[email protected]>
Cc: Peter Rosin <[email protected]>
Cc: Gerd Hoffmann <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Daniel Vetter <[email protected]>
Acked-by: Daniel Thompson <[email protected]>
Acked-by: Bartlomiej Zolnierkiewicz <[email protected]>
Acked-by: Lee Jones <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Not even sure why it was there since the beginning. Just use IRQ
number in the sensor_data struct.
Signed-off-by: Denis Ciocca <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Also, when doing this, change kvm_arch_create_vcpu_debugfs() to return
void instead of an integer, as we should not care at all about if this
function actually does anything or not.
Cc: Paolo Bonzini <[email protected]>
Cc: "Radim Krčmář" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
There is no need for this function as all arches have to implement
kvm_arch_create_vcpu_debugfs() no matter what. A #define symbol
let us actually simplify the code.
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
After commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts), a
five years old bug is exposed. Running ebizzy benchmark in three 80 vCPUs VMs
on one 80 pCPUs Skylake server, a lot of rcu_sched stall warning splatting
in the VMs after stress testing:
INFO: rcu_sched detected stalls on CPUs/tasks: { 4 41 57 62 77} (detected by 15, t=60004 jiffies, g=899, c=898, q=15073)
Call Trace:
flush_tlb_mm_range+0x68/0x140
tlb_flush_mmu.part.75+0x37/0xe0
tlb_finish_mmu+0x55/0x60
zap_page_range+0x142/0x190
SyS_madvise+0x3cd/0x9c0
system_call_fastpath+0x1c/0x21
swait_active() sustains to be true before finish_swait() is called in
kvm_vcpu_block(), voluntarily preempted vCPUs are taken into account
by kvm_vcpu_on_spin() loop greatly increases the probability condition
kvm_arch_vcpu_runnable(vcpu) is checked and can be true, when APICv
is enabled the yield-candidate vCPU's VMCS RVI field leaks(by
vmx_sync_pir_to_irr()) into spinning-on-a-taken-lock vCPU's current
VMCS.
This patch fixes it by checking conservatively a subset of events.
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Fixes: 98f4a1467 (KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop)
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Add a header include guard just in case.
Signed-off-by: Masahiro Yamada <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
Align in-parameter names for the declarations of pm_genpd_add|
remove_subdomain() and of_genpd_add_subdomain() according to their
implementations, as to improve consistency.
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Complete the abstraction of the ww_mutex inside the reservation object.
This allows us to add more handling and debugging to the reservation
object in the future.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/320761/
|
|
SCSI maintains its own driver private data hooked off of each SCSI
request, and the pridate data won't be freed after scsi_queue_rq()
returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. An upper layer driver
(e.g. dm-rq) may need to retry these SCSI requests, before SCSI has
fully dispatched them, due to a lower level SCSI driver's resource
limitation identified in scsi_queue_rq(). Currently SCSI's per-request
private data is leaked when the upper layer driver (dm-rq) frees and
then retries these requests in response to BLK_STS_RESOURCE or
BLK_STS_DEV_RESOURCE returns from scsi_queue_rq().
This usecase is so specialized that it doesn't warrant training an
existing blk-mq interface (e.g. blk_mq_free_request) to allow SCSI to
account for freeing its driver private data -- doing so would add an
extra branch for handling a special case that all other consumers of
SCSI (and blk-mq) won't ever need to worry about.
So the most pragmatic way forward is to delegate freeing SCSI driver
private data to the upper layer driver (dm-rq). Do so by adding
new .cleanup_rq callback and calling a new blk_mq_cleanup_rq() method
from dm-rq. A following commit will implement the .cleanup_rq() hook
in scsi_mq_ops.
Cc: Ewan D. Milne <[email protected]>
Cc: Bart Van Assche <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: [email protected]
Cc: <[email protected]>
Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch introduces a new request operation REQ_OP_ZONE_RESET_ALL.
This is useful for the applications like mkfs where it needs to reset
all the zones present on the underlying block device. As part for this
patch we also introduce new QUEUE_FLAG_ZONE_RESETALL which indicates the
queue zone reset all capability and corresponding helper macro.
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
See also commit 8f4236d9008b ("block: remove QUEUE_FLAG_BYPASS and ->bypass") # v5.0.
Cc: Christoph Hellwig <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Make it clear to the compiler and also to humans that the functions
that query request queue properties do not modify any member of the
request_queue data structure.
Reviewed-by: Johannes Thumshirn <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
blk_mq_tagset_wait_completed_request() has been applied for waiting
for completed request's fn, so not necessary to use
blk_mq_complete_request_sync() any more.
Cc: Max Gurtovoy <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
blk-mq may schedule to call queue's complete function on remote CPU via
IPI, but doesn't provide any way to synchronize the request's complete
fn. The current queue freeze interface can't provide the synchonization
because aborted requests stay at blk-mq queues during EH.
In some driver's EH(such as NVMe), hardware queue's resource may be freed &
re-allocated. If the completed request's complete fn is run finally after the
hardware queue's resource is released, kernel crash will be triggered.
Prepare for fixing this kind of issue by introducing
blk_mq_tagset_wait_completed_request().
Cc: Max Gurtovoy <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
NVMe needs this function to decide if one request to be aborted has
been completed in normal IO path already.
So introduce it.
Cc: Max Gurtovoy <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|