Age | Commit message (Collapse) | Author | Files | Lines |
|
hmm_range_fault() calls find_vma() and walk_page_range() in a loop. This
is unnecessary duplication since walk_page_range() calls find_vma() in a
loop already.
Simplify hmm_range_fault() by defining a walk_test() callback function to
filter unhandled vmas.
This also fixes a bug where hmm_range_fault() was not checking start >=
vma->vm_start before checking vma->vm_flags so hmm_range_fault() could
return an error based on the wrong vma for the requested range.
It also fixes a bug when the vma has no read access and the caller did not
request a fault, there shouldn't be any error return code.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ralph Campbell <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
gntdev simply wants to monitor a specific VMA for any notifier events,
this can be done straightforwardly using mmu_interval_notifier_insert()
over the VMA's VA range.
The notifier should be attached until the original VMA is destroyed.
It is unclear if any of this is even sane, but at least a lot of duplicate
code is removed.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The only two users of this are now converted to use mmu_interval_notifier,
delete all the code and update hmm.rst.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jérôme Glisse <[email protected]>
Tested-by: Ralph Campbell <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Convert the collision-retry lock around hmm_range_fault to use the one now
provided by the mmu_interval notifier.
Although this driver does not seem to use the collision retry lock that
hmm provides correctly, it can still be converted over to use the
mmu_interval_notifier api instead of hmm_mirror without too much trouble.
This also deletes another place where a driver is associating additional
data (struct amdgpu_mn) with a mmu_struct.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Tested-by: Philip Yang <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.
For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.
Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Philip Yang <[email protected]>
Tested-by: Philip Yang <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
find_vma() must be called under the mmap_sem, reorganize this code to
do the vma check after entering the lock.
Further, fix the unlocked use of struct task_struct's mm, instead use
the mm from hmm_mirror which has an active mm_grab. Also the mm_grab
must be converted to a mm_get before acquiring mmap_sem or calling
find_vma().
Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers")
Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads")
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Tested-by: Philip Yang <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Remove the hmm_mirror object and use the mmu_interval_notifier API instead
for the range, and use the normal mmu_notifier API for the general
invalidation callback.
While here re-organize the pagefault path so the locking pattern is clear.
nouveau is the only driver that uses a temporary range object and instead
forwards nearly every invalidation range directly to the HW. While this is
not how the mmu_interval_notifier was intended to be used, the overheads on
the pagefaulting path are similar to the existing hmm_mirror version.
Particularly since the interval tree will be small.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ralph Campbell <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
There is no reason to get the invalidate_range_start() callback via an
indirection through hmm_mirror, just register a normal notifier directly.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ralph Campbell <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The new API is an exact match for the needs of radeon.
For some reason radeon tries to remove overlapping ranges from the
interval tree, but interval trees (and mmu_interval_notifier_insert())
support overlapping ranges directly. Simply delete all this code.
Since this driver is missing a invalidate_range_end callback, but
still calls get_user_pages(), it cannot be correct against all races.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
This converts one of the two users of mmu_notifiers to use the new API.
The conversion is fairly straightforward, however the existing use of
notifiers here seems to be racey.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Replace the internal interval tree based mmu notifier with the new common
mmu_interval_notifier_insert() API. This removes a lot of code and fixes a
deadlock that can be triggered in ODP:
zap_page_range()
mmu_notifier_invalidate_range_start()
[..]
ib_umem_notifier_invalidate_range_start()
down_read(&per_mm->umem_rwsem)
unmap_single_vma()
[..]
__split_huge_page_pmd()
mmu_notifier_invalidate_range_start()
[..]
ib_umem_notifier_invalidate_range_start()
down_read(&per_mm->umem_rwsem) // DEADLOCK
mmu_notifier_invalidate_range_end()
up_read(&per_mm->umem_rwsem)
mmu_notifier_invalidate_range_end()
up_read(&per_mm->umem_rwsem)
The umem_rwsem is held across the range_start/end as the ODP algorithm for
invalidate_range_end cannot tolerate changes to the interval
tree. However, due to the nested invalidation regions the second
down_read() can deadlock if there are competing writers. The new core code
provides an alternative scheme to solve this problem.
Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count")
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Artemy Kovalyov <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Only the function calls are stubbed out with static inlines that always
fail. This is the standard way to write a header for an optional component
and makes it easier for drivers that only optionally need HMM_MIRROR.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jérôme Glisse <[email protected]>
Tested-by: Ralph Campbell <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
hmm_mirror's handling of ranges does not use a sequence count which
results in this bug:
CPU0 CPU1
hmm_range_wait_until_valid(range)
valid == true
hmm_range_fault(range)
hmm_invalidate_range_start()
range->valid = false
hmm_invalidate_range_end()
range->valid = true
hmm_range_valid(range)
valid == true
Where the hmm_range_valid() should not have succeeded.
Adding the required sequence count would make it nearly identical to the
new mmu_interval_notifier. Instead replace the hmm_mirror stuff with
mmu_interval_notifier.
Co-existence of the two APIs is the first step.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jérôme Glisse <[email protected]>
Tested-by: Philip Yang <[email protected]>
Tested-by: Ralph Campbell <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Of the 13 users of mmu_notifiers, 8 of them use only
invalidate_range_start/end() and immediately intersect the
mmu_notifier_range with some kind of internal list of VAs. 4 use an
interval tree (i915_gem, radeon_mn, umem_odp, hfi1). 4 use a linked list
of some kind (scif_dma, vhost, gntdev, hmm)
And the remaining 5 either don't use invalidate_range_start() or do some
special thing with it.
It turns out that building a correct scheme with an interval tree is
pretty complicated, particularly if the use case is synchronizing against
another thread doing get_user_pages(). Many of these implementations have
various subtle and difficult to fix races.
This approach puts the interval tree as common code at the top of the mmu
notifier call tree and implements a shareable locking scheme.
It includes:
- An interval tree tracking VA ranges, with per-range callbacks
- A read/write locking scheme for the interval tree that avoids
sleeping in the notifier path (for OOM killer)
- A sequence counter based collision-retry locking scheme to tell
device page fault that a VA range is being concurrently invalidated.
This is based on various ideas:
- hmm accumulates invalidated VA ranges and releases them when all
invalidates are done, via active_invalidate_ranges count.
This approach avoids having to intersect the interval tree twice (as
umem_odp does) at the potential cost of a longer device page fault.
- kvm/umem_odp use a sequence counter to drive the collision retry,
via invalidate_seq
- a deferred work todo list on unlock scheme like RTNL, via deferred_list.
This makes adding/removing interval tree members more deterministic
- seqlock, except this version makes the seqlock idea multi-holder on the
write side by protecting it with active_invalidate_ranges and a spinlock
To minimize MM overhead when only the interval tree is being used, the
entire SRCU and hlist overheads are dropped using some simple
branches. Similarly the interval tree overhead is dropped when in hlist
mode.
The overhead from the mandatory spinlock is broadly the same as most of
existing users which already had a lock (or two) of some sort on the
invalidation path.
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Christian König <[email protected]>
Tested-by: Philip Yang <[email protected]>
Tested-by: Ralph Campbell <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
PROM only enables ethernet PHY on first Origin 200 module, so we must
do it ourselves for the second module.
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Lee Jones <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Srinivas Kandagatla <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Generation of fake subdevice ID had vendor and device ID swapped.
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Lee Jones <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Srinivas Kandagatla <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Pull last minute virtio bugfixes from Michael Tsirkin:
"Minor bugfixes all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: fix shrinker count
virtio_balloon: fix shrinker scan number of pages
virtio_console: allocate inbufs in add_port() only if it is needed
virtio_ring: fix return code on DMA mapping fails
|
|
rhashtable_lookup_fast() internally calls rcu_read_lock() then,
calls rhashtable_lookup(). So if rcu_read_lock() is already held,
rhashtable_lookup() is enough.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.5
Last set of patches for v5.5. Major features here 802.11ax support for
qtnfmac and airtime fairness support to mt76. And naturally smaller
fixes and improvements all over.
Major changes:
qtnfmac
* add 802.11ax support in AP mode
* enable offload bridging support
iwlwifi
* support TX/RX antennas reporting
mt76
* mt7615 smart carrier sense support
* aggregation statistics via debugfs
* airtime fairness (ATF) support
* mt76x0 OF mac address support
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Robert Schwebel says:
====================
here is v2 of the series converting the NFC documentation from txt to
rst. Thanks to Jonathan and Dave for the input.
Changes since (implicit) v1:
* replace code-block by more compact :: syntax
* really add the rst file to the index
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Now that the sphinx syntax has been fixed, change the document from txt
to rst and add it to the index.
Signed-off-by: Robert Schwebel <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Silence this warning:
Documentation/networking/nfc.rst:113: WARNING: Definition list ends without
a blank line; unexpected unindent.
Signed-off-by: Robert Schwebel <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix this warning:
Documentation/networking/nfc.rst:87: WARNING: Bullet list ends without
a blank line; unexpected unindent.
Signed-off-by: Robert Schwebel <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Change the block diagram to match the sphinx syntax. This will make it
possible to switch this file to rst in the future.
Signed-off-by: Robert Schwebel <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The headlines in this file do are not in the standard kernel docu-
mentation headline format. Change it, so this file can be switched to
rst in the future.
Signed-off-by: Robert Schwebel <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When a phydev is created, the speed and duplex are set to zero and
-1 respectively, rather than using the predefined SPEED_UNKNOWN and
DUPLEX_UNKNOWN constants.
There is a window at initialisation time where we may report link
down using the 0/-1 values. Tidy this up and use the predefined
constants, so debug doesn't complain with:
"Unsupported (update phy-core.c)/Unsupported (update phy-core.c)"
when the speed and duplex settings are printed.
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There are no users of phy_ethtool_sset() in the kernel anymore, and
as of commit 3c1bcc8614db ("net: ethernet: Convert phydev advertize
and supported from u32 to link mode"), the implementation is slightly
buggy - it doesn't correctly check the masked advertising mask as it
used to.
Remove it, and update the phy documentation to refer to its replacement
function.
Signed-off-by: Russell King <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This commit reverts commit 91e6015b082b ("bpf: Emit audit messages
upon successful prog load and unload") and its follow up commit
7599a896f2e4 ("audit: Move audit_log_task declaration under
CONFIG_AUDITSYSCALL") as requested by Paul Moore. The change needs
close review on linux-audit, tests etc.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit 37e4c997dadf ("KVM: VMX: validate individual bits of guest
MSR_IA32_FEATURE_CONTROL") broke the KVM_SET_MSRS ABI by instituting
new constraints on the data values that kvm would accept for the guest
MSR, IA32_FEATURE_CONTROL. Perhaps these constraints should have been
opt-in via a new KVM capability, but they were applied
indiscriminately, breaking at least one existing hypervisor.
Relax the constraints to allow either or both of
FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX and
FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX to be set when nVMX is
enabled. This change is sufficient to fix the aforementioned breakage.
Fixes: 37e4c997dadf ("KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROL")
Signed-off-by: Jim Mattson <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Acquire kvm->srcu for the duration of ->set_nested_state() to fix a bug
where nVMX derefences ->memslots without holding ->srcu or ->slots_lock.
The other half of nested migration, ->get_nested_state(), does not need
to acquire ->srcu as it is a purely a dump of internal KVM (and CPU)
state to userspace.
Detected as an RCU lockdep splat that is 100% reproducible by running
KVM's state_test selftest with CONFIG_PROVE_LOCKING=y. Note that the
failing function, kvm_is_visible_gfn(), is only checking the validity of
a gfn, it's not actually accessing guest memory (which is more or less
unsupported during vmx_set_nested_state() due to incorrect MMU state),
i.e. vmx_set_nested_state() itself isn't fundamentally broken. In any
case, setting nested state isn't a fast path so there's no reason to go
out of our way to avoid taking ->srcu.
=============================
WARNING: suspicious RCU usage
5.4.0-rc7+ #94 Not tainted
-----------------------------
include/linux/kvm_host.h:626 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by evmcs_test/10939:
#0: ffff88826ffcb800 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x630 [kvm]
stack backtrace:
CPU: 1 PID: 10939 Comm: evmcs_test Not tainted 5.4.0-rc7+ #94
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack+0x68/0x9b
kvm_is_visible_gfn+0x179/0x180 [kvm]
mmu_check_root+0x11/0x30 [kvm]
fast_cr3_switch+0x40/0x120 [kvm]
kvm_mmu_new_cr3+0x34/0x60 [kvm]
nested_vmx_load_cr3+0xbd/0x1f0 [kvm_intel]
nested_vmx_enter_non_root_mode+0xab8/0x1d60 [kvm_intel]
vmx_set_nested_state+0x256/0x340 [kvm_intel]
kvm_arch_vcpu_ioctl+0x491/0x11a0 [kvm]
kvm_vcpu_ioctl+0xde/0x630 [kvm]
do_vfs_ioctl+0xa2/0x6c0
ksys_ioctl+0x66/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x54/0x200
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f59a2b95f47
Fixes: 8fcc4b5923af5 ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Fold shared_msr_update() into its sole user to eliminate its pointless
bounds check, its godawful printk, its misleading comment (it's called
under a global lock), and its woefully inaccurate name.
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The jump label out_free_1 and out_free_2 deal with
the same stuff, so git rid of one and rename the
label out_free_0a to retain the label name order.
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
A recent change inadvertently exported a static function, which results
in modpost throwing a warning. Fix it.
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Sean Christopherson <[email protected]>
Cc: [email protected]
Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf report:
Jin Yao:
- Allow entering the annotation view (symbol source/assembly +
overhead/cycles/etc column) from the 'perf report --total-cycles'
interface.
E.g.:
# perf record --all-cpus --branch-any --all-kernel
^C[ perf record: Woken up 5 times to write data ]
#
# perf evlist -v
cycles: size: 120, { sample_period, sample_freq }: 4000,
sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK,
read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1,
precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1,
bpf_event: 1, branch_sample_type: ANY
#
# perf report --total-cycles
#
# Samples: 78762 of event 'cycles'
Sampled Sampled Avg Avg
Cycles% Cycles Cycles% Cycles [Program Block Range] Shared Object
1.72% 95.8K 0.00% 254 [msr.h:105 -> msr.h:166] [kernel.vmlinux]
1.56% 107.6K 0.00% 618 [compiler.h:199 -> common.c:301] [kernel.vmlinux]
0.83% 46.3K 0.00% 409 [entry_64.S:153 -> entry_64.S:175] [kernel.vmlinux]
0.83% 46.1K 0.00% 83 [jump_label.h:41 -> tsc.c:230] [kernel.vmlinux]
0.64% 36.9K 0.01% 1.4K [hda_intel.c:904 -> hda_intel.c:916] [snd_hda_intel]
0.57% 30.2K 0.00% 282 [file.c:710 -> file.c:730] [kernel.vmlinux]
0.48% 25.8K 0.00% 82 [spinlock.c:158 -> spinlock.c:160] [kernel.vmlinux]
0.45% 23.7K 0.00% 369 [tick-broadcast.c:585 -> tick-broadcast.c:586] [kernel.vmlinux]
0.44% 24.4K 0.00% 73 [msr.h:236 -> tsc.c:1088] [kernel.vmlinux]
0.43% 22.7K 0.00% 144 [cpuidle.c:229 -> cpuidle.c:232] [kernel.vmlinux]
Then press 'A' or Enter on one of those lines, just like with 'perf top', say
the top one: [msr.h:105 -> msr.h:166], then this shows up:
Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762
native_write_msr /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period]
Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%)
│
│ Disassembly of section .text:
│
│ ffffffff8106c480 <native_write_msr>:
│ __wrmsr():
│ return EAX_EDX_VAL(val, low, high);
│ }
│
│ static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high)
│ {
│ asm volatile("1: wrmsr\n"
49.16 │0.02 mov %edi,%ecx
│0.02 mov %esi,%eax
│0.02 wrmsr
│ arch_static_branch():
│ #include <linux/stringify.h>
│ #include <linux/types.h>
│
│ static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
│ {
│ asm_volatile_goto("1:"
0.79 │0.02 nop
│ native_write_msr():
│ {
│ __wrmsr(msr, low, high);
│
│ if (msr_tracepoint_active(__tracepoint_write_msr))
│ do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
│ }
50.05 │0.02 254 ← retq
│ do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
│ shl $0x20,%rdx
│ mov %esi,%esi
│ or %rdx,%rsi
│ xor %edx,%edx
│ → jmpq do_trace_write_msr
We need to improve this to show the source code line numbers in the
annotation view, so one can go from that program block to the annotation view
and see those source code line numbers straight away.
auxtrace/Intel PT:
Adrian Hunter:
- Add support for AUX area sampling, requires new functionality that
will land in 5.5, its already in tip.
This includes kernel capability querying so that it fails gracefully
with older kernels, duimping aux area samples in 'perf report -D' and
'perf script'.
perf.data:
Alexey Budankov:
- Fix decompression of PERF_RECORD_COMPRESSED records.
core:
Arnaldo Carvalho de Melo:
- Use the 'dcacheline' cmp routine to find the right DSOs taking into
account the 'maj', 'min', 'ino' and 'ino_generation', that got moved
from 'struct map' to 'struct dso', where it belongs.
This further reduces the size of 'struct map', there is still more
work to do to maybe get it to max one cacheline.
libtraceevent:
Hewenliang:
- Fix memory leakage in copy_filter_type().
Sudip Mukherjee:
- Fix header installation.
perf parse:
Ian Rogers :
- Fix potential memory leak when handling tracepoint errors, found using
LLVM's libFuzzer.
perf probe:
Colin Ian King:
- Fix spelling mistake "addrees" -> "address".
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Part of the documentation is taken from the README of the userspace
utils (https://github.com/vitorafsr/i8kutils). The license is GPL-2+
and the author Massimo Dal Zotto is already credited as author of
the module. Therefore there should be no copyright problem.
I also added a paragraph with specific information on the experimental
support for automatic BIOS fan control.
Signed-off-by: Giovanni Mascellani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[groeck: Fixed some of the documentation warnings]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
This patch exports standard hwmon pwmX_enable sysfs attribute for
enabling or disabling automatic fan control by BIOS. Standard value
"1" is for disabling automatic BIOS fan control and value "2" for
enabling.
By default BIOS auto mode is enabled by laptop firmware.
When BIOS auto mode is enabled, custom fan speed value (set via hwmon
pwmX sysfs attribute) is overwritten by SMM in few seconds and
therefore any custom settings are without effect. So this is reason
why implementing option for disabling BIOS auto mode is needed.
So finally this patch allows kernel to set and control fan speed on
laptops, but it can be dangerous (like setting speed of other fans).
The SMM commands to enable or disable automatic fan control are not
documented and are not the same on all Dell laptops. Therefore a
whitelist is used to send the correct codes only on laptopts for which
they are known.
This patch was originally developed by Pali Rohár; later Giovanni
Mascellani implemented the whitelist.
Signed-off-by: Giovanni Mascellani <[email protected]>
Co-Developed-by: Pali Rohár <[email protected]>
Signed-off-by: Pali Rohár <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[groeck: Fixed checkpatch warnings]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Conflicts:
arch/riscv/boot/Makefile
arch/riscv/include/asm/sbi.h
|
|
|
|
|
|
|
|
|
|
Edward Cree says:
====================
A series of changes to how we check filters for expiry, manage how much
of that work to do & when, etc.
Prompted by some pathological behaviour under heavy load, which was
Reported-by: David Ahern <[email protected]>
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If there's no traffic on a channel, its ARFS expiry work will never get
scheduled by efx_poll() as that isn't being run.
So make efx_filter_rfs_expire() reschedule itself to run after 30 seconds.
Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Report the number of successful and failed insertions, and also the
current count of filters, to aid in tuning e.g. rps_flow_cnt.
Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In high connection count usage, the NIC's filter table may be filled with
sufficiently many ARFS filters that further insertions fail. As this
does not represent a correctness issue, do not log the resulting MCDI
errors. Add a debug-level message under the (by default disabled)
rx_status category instead; and take the opportunity to do a little extra
expiry work.
Since there are now multiple workitems able to call __efx_filter_rfs_expire
on a given channel, it is possible for them to race and thus pass quotas
which, combined, exceed rfs_filter_count. Thus, don't WARN_ON if we loop
all the way around the table with quota left over.
Signed-off-by: Edward Cree <[email protected]>
Tested-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The old rfs_filters_added method for determining the quota could potentially
allow the NIC to become filled with old filters, which never get tested for
expiry. Instead, explicitly make expiry check work depend on the number of
filters installed, and don't count checking slots without filters in as
doing work. This guarantees that each filter will be checked for expiry at
least once every thirty seconds (assuming the channel to which it belongs is
NAPI polling actively) regardless of fill level.
Signed-off-by: Edward Cree <[email protected]>
Tested-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
This series contains updates to the ice driver only.
Bruce updates the driver to store the number of functions the device has
so that it won't have to compute it when setting safe mode capabilities.
Adds a check to adjust the reporting of capabilities for devices with
more than 4 ports, which differ for devices with less than 4 ports.
Brett adds a helper function to determine if the VF is allowed to do
VLAN operations based on the host's VF configuration. Also adds a new
function that initializes VLAN stripping (enabled/disabled) for the VF
based on the device supported capabilities. Adds a check if the vector
index is valid with the respect to the number of transmit and receive
queues configured when we set coalesce settings for DCB. Adds a check
if the promisc_mask contains ICE_PROMISC_VLAN_RX or ICE_PROMISC_VLAN_TX
so that VLAN 0 promiscuous rules to be removed. Add a helper macro for
a commonly used de-reference of a pointer to &pf->dev->pdev.
Jesse fixes an issue where if an invalid virtchnl request from the VF,
the driver would return uninitialized data to the VF from the PF stack,
so ensure the stack variable is initialized earlier. Add helpers to the
virtchnl interface make the reporting of strings consistent and help
reduce stack space. Implements VF statistics gathering via the kernel
ndo_get_vf_stats().
Akeem ensures we disable the state flag for each VF when its resources
are returned to the device.
Tony does additional cleanup in the driver to ensure the when we
allocate and free memory within the same function, we should not be
using devm_* variants; use regular alloc and free functions.
Henry implements code to query and set the number of channels on the
primary VSI for a PF via ethtool.
Jake cleans up needless NULL checks in ice_sched_cleanup_all().
Kevin updates the firmware API version to align with current NVM images.
v2: Added "Fixes:" tag to patch 5 commit description and added the use
of netif_is_rxfh_configured() in patch 13 to see if RSS has been
configured by the user, if so do not overwrite that configuration.
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fix from Dmitry Torokhov:
"Just a single revert as RMI mode should not have been enabled for this
model [yet?]"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"
|
|
Cc: Eric Dumazet <[email protected]>
Signed-off-by: Maciej Żenczykowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Since MIPS architecture has a sparse syscall array, select the
HAVE_SPARSE_SYSCALL_NR to save space.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Hassan Naveed <[email protected]>
Reviewed-by: Paul Burton <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|