aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-10-02Merge tag 'hwmon-for-v5.15-rc4' of ↵Linus Torvalds11-125/+101
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fixed various potential NULL pointer accesses in w8379* drivers - Improved error handling, fault reporting, and fixed rounding in thmp421 driver - Fixed error handling in ltc2947 driver - Added missing attribute to pmbus/mp2975 driver - Fixed attribute values in pbus/ibm-cffps, occ, and mlxreg-fan drivers - Removed unused residual code from k10temp driver * tag 'hwmon-for-v5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field hwmon: (pmbus/mp2975) Add missed POUT attribute for page 1 mp2975 controller hwmon: (pmbus/ibm-cffps) max_power_out swap changes hwmon: (occ) Fix P10 VRM temp sensors hwmon: (ltc2947) Properly handle errors when looking for the external clock hwmon: (tmp421) fix rounding for negative values hwmon: (tmp421) report /PVLD condition as fault hwmon: (tmp421) handle I2C errors hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs hwmon: (k10temp) Remove residues of current and voltage
2021-10-02Merge tag '5.15-rc3-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds12-342/+294
Pull ksmbd server fixes from Steve French: "Eleven fixes for the ksmbd kernel server, mostly security related: - an important fix for disabling weak NTLMv1 authentication - seven security (improved buffer overflow checks) fixes - fix for wrong infolevel struct used in some getattr/setattr paths - two small documentation fixes" * tag '5.15-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: missing check for NULL in convert_to_nt_pathname() ksmbd: fix transform header validation ksmbd: add buffer validation for SMB2_CREATE_CONTEXT ksmbd: add validation in smb2 negotiate ksmbd: add request buffer validation in smb2_set_info ksmbd: use correct basic info level in set_file_basic_info() ksmbd: remove NTLMv1 authentication ksmbd: fix documentation for 2 functions MAINTAINERS: rename cifs_common to smbfs_common in cifs and ksmbd entry ksmbd: fix invalid request buffer access in compound ksmbd: remove RFC1002 check in smb2 request
2021-10-02Merge tag 'scsi-fixes' of ↵Linus Torvalds5-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Five fairly minor fixes and spelling updates, all in drivers. Even though the ufs fix is in tracing, it's a potentially exploitable use beyond end of array bug" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: csiostor: Add module softdep on cxgb4 scsi: qla2xxx: Fix excessive messages during device logout scsi: virtio_scsi: Fix spelling mistake "Unsupport" -> "Unsupported" scsi: ses: Fix unsigned comparison with less than zero scsi: ufs: Fix illegal offset in UPIU event trace
2021-10-02Merge tag 'block-5.15-2021-10-01' of git://git.kernel.dk/linux-blockLinus Torvalds5-27/+31
Pull block fixes from Jens Axboe: "A few block fixes for this release: - Revert a BFQ commit that causes breakage for people. Unfortunately it was auto-selected for stable as well, so now 5.14.7 suffers from it too. Hopefully stable will pick up this revert quickly too, so we can remove the issue on that end as well. - Add a quirk for Apple NVMe controllers, which due to their non-compliance broke due to the introduction of command sequences (Keith) - Use shifts in nbd, fixing a __divdi3 issue (Nick)" * tag 'block-5.15-2021-10-01' of git://git.kernel.dk/linux-block: nbd: use shifts rather than multiplies Revert "block, bfq: honor already-setup queue merges" nvme: add command id quirk for apple controllers
2021-10-02Merge tag 'io_uring-5.15-2021-10-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-19/+3
Pull io_uring fixes from Jens Axboe: "Two fixes in here: - The signal issue that was discussed start of this week (me). - Kill dead fasync support in io_uring. Looks like it was broken since io_uring was initially merged, and given that nobody has ever complained about it, let's just kill it (Pavel)" * tag 'io_uring-5.15-2021-10-01' of git://git.kernel.dk/linux-block: io_uring: kill fasync io-wq: exclusively gate signal based exit on get_signal() return
2021-10-02Merge tag 'libnvdimm-fixes-5.15-rc4' of ↵Linus Torvalds2-4/+13
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A fix for a regression added this cycle in the pmem driver, and for a long standing bug for failed NUMA node lookups on ARM64. This has appeared in -next for several days with no reported issues. Summary: - Fix a regression that caused the sysfs ABI for pmem block devices to not be registered. This fails the nvdimm unit tests and dax xfstests. - Fix numa node lookups for dax-kmem memory (device-dax memory assigned to the page allocator) on ARM64" * tag 'libnvdimm-fixes-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/pmem: fix creating the dax group ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect
2021-10-02cachefiles: Fix oops in trace_cachefiles_mark_buried due to NULL objectDave Wysochanski1-1/+1
In cachefiles_mark_object_buried, the dentry in question may not have an owner, and thus our cachefiles_object pointer may be NULL when calling the tracepoint, in which case we will also not have a valid debug_id to print in the tracepoint. Check for NULL object in the tracepoint and if so, just set debug_id to MAX_UINT as was done in 2908f5e101e3 ("fscache: Add a cookie debug ID and use that in traces"). This fixes the following oops: FS-Cache: Cache "mycache" added (type cachefiles) CacheFiles: File cache on vdc registered ... Workqueue: fscache_object fscache_object_work_func [fscache] RIP: 0010:trace_event_raw_event_cachefiles_mark_buried+0x4e/0xa0 [cachefiles] .... Call Trace: cachefiles_mark_object_buried+0xa5/0xb0 [cachefiles] cachefiles_bury_object+0x270/0x430 [cachefiles] cachefiles_walk_to_object+0x195/0x9c0 [cachefiles] cachefiles_lookup_object+0x5a/0xc0 [cachefiles] fscache_look_up_object+0xd7/0x160 [fscache] fscache_object_work_func+0xb2/0x340 [fscache] process_one_work+0x1f1/0x390 worker_thread+0x53/0x3e0 kthread+0x127/0x150 Fixes: 2908f5e101e3 ("fscache: Add a cookie debug ID and use that in traces") Signed-off-by: Dave Wysochanski <[email protected]> Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2021-10-02drm/i915: fix blank screen booting crashesHugh Dickins1-2/+3
5.15-rc1 crashes with blank screen when booting up on two ThinkPads using i915. Bisections converge convincingly, but arrive at different and suprising "culprits", none of them the actual culprit. netconsole (with init_netconsole() hacked to call i915_init() when logging has started, instead of by module_init()) tells the story: kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245! with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify(). I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that function needs to be 4-byte aligned. Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Hugh Dickins <[email protected]> Tested-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-10-02hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary ↵Nadezda Lutovinova1-15/+11
structure field If driver read tmp value sufficient for (tmp & 0x08) && (!(tmp & 0x80)) && ((tmp & 0x7) == ((tmp >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients(). The patch fixes possible NULL pointer dereference by removing lm75[]. Found by Linux Driver Verification project (linuxtesting.org). Cc: [email protected] Signed-off-by: Nadezda Lutovinova <[email protected]> Link: https://lore.kernel.org/r/[email protected] [groeck: Dropped unnecessary continuation lines, fixed multi-line alignments] Signed-off-by: Guenter Roeck <[email protected]>
2021-10-02hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary ↵Nadezda Lutovinova1-17/+11
structure field If driver read val value sufficient for (val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients(). The patch fixes possible NULL pointer dereference by removing lm75[]. Found by Linux Driver Verification project (linuxtesting.org). Cc: [email protected] Signed-off-by: Nadezda Lutovinova <[email protected]> Link: https://lore.kernel.org/r/[email protected] [groeck: Dropped unnecessary continuation lines, fixed multipline alignment] Signed-off-by: Guenter Roeck <[email protected]>
2021-10-02hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary ↵Nadezda Lutovinova1-18/+11
structure field If driver read val value sufficient for (val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients(). The patch fixes possible NULL pointer dereference by removing lm75[]. Found by Linux Driver Verification project (linuxtesting.org). Cc: [email protected] Signed-off-by: Nadezda Lutovinova <[email protected]> Link: https://lore.kernel.org/r/[email protected] [groeck: Dropped unnecessary continuation lines, fixed multi-line alignment] Signed-off-by: Guenter Roeck <[email protected]>
2021-10-02hwmon: (pmbus/mp2975) Add missed POUT attribute for page 1 mp2975 controllerVadim Pasternak1-1/+1
Add missed attribute for reading POUT from page 1. It is supported by device, but has been missed in initial commit. Fixes: 2c6fcbb21149 ("hwmon: (pmbus) Add support for MPS Multi-phase mp2975 controller") Signed-off-by: Vadim Pasternak <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]>
2021-10-02hwmon: (pmbus/ibm-cffps) max_power_out swap changesBrandon Wyman1-2/+8
The bytes for max_power_out from the ibm-cffps devices differ in byte order for some power supplies. The Witherspoon power supply returns the bytes in MSB/LSB order. The Rainier power supply returns the bytes in LSB/MSB order. The Witherspoon power supply uses version cffps1. The Rainier power supply should use version cffps2. If version is cffps1, swap the bytes before output to max_power_out. Tested: Witherspoon before: 3148. Witherspoon after: 3148. Rainier before: 53255. Rainier after: 2000. Signed-off-by: Brandon Wyman <[email protected]> Reviewed-by: Eddie James <[email protected]> Link: https://lore.kernel.org/r/[email protected] [groeck: Replaced yoda programming] Signed-off-by: Guenter Roeck <[email protected]>
2021-10-02hwmon: (occ) Fix P10 VRM temp sensorsEddie James1-12/+5
The P10 (temp sensor version 0x10) doesn't do the same VRM status reporting that was used on P9. It just reports the temperature, so drop the check for VRM fru type in the sysfs show function, and don't set the name to "alarm". Fixes: db4919ec86 ("hwmon: (occ) Add new temperature sensor type") Signed-off-by: Eddie James <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]>
2021-10-01Merge tag 's390-5.15-4' of ↵Linus Torvalds3-13/+45
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fix from Vasily Gorbik: "One fix for 5.15-rc4: Avoid CIO excessive path-verification requests, which might cause unwanted delays" * tag 's390-5.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: avoid excessive path-verification requests
2021-10-01thermal: Update information in MAINTAINERSRafael J. Wysocki1-3/+5
Because Rui is now going to focus on work that is not related to the maintenance of the thermal subsystem in the kernel, Rafael will start to help Daniel with handling the development process as a new member of the thermal maintainers team. Rui will continue to review patches in that area. The thermal development process flow will change so that the material from the thermal git tree will be merged into the thermal branch of the linux-pm.git tree before going into the mainline. Update the information in MAINTAINERS accordingly. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Zhang Rui <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-10-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds7-32/+81
Pull more kvm fixes from Paolo Bonzini: "Small x86 fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: Ensure all migrations are performed when test is affined KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm x86/kvmclock: Move this_cpu_pvti into kvmclock.h selftests: KVM: Don't clobber XMM register when read KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
2021-10-01Merge tag 'drm-fixes-2021-10-01' of git://anongit.freedesktop.org/drm/drmLinus Torvalds25-77/+81
Pull drm fixes from Daniel Vetter: "Dave is out on a long w/e, should be back next week. Nothing nefarious, just a bunch of driver fixes: amdgpu, i915, tegra, and one exynos driver fix" * tag 'drm-fixes-2021-10-01' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix drm/amdgpu: check tiling flags when creating FB on GFX8- drm/amd/display: Pass PCI deviceid into DC drm/amd/display: initialize backlight_ramping_override to false drm/amdgpu: correct initial cp_hqd_quantum for gfx9 drm/amd/display: Fix Display Flicker on embedded panels drm/amdgpu: fix gart.bo pin_count leak drm/i915: Remove warning from the rps worker drm/i915/request: fix early tracepoints drm/i915/guc, docs: Fix pdfdocs build error by removing nested grid gpu: host1x: Plug potential memory leak gpu/host1x: fence: Make spinlock static drm/tegra: uapi: Fix wrong mapping end address in case of disabled IOMMU drm/tegra: dc: Remove unused variables drm/exynos: Make use of the helper function devm_platform_ioremap_resource() drm/i915/gvt: fix the usage of ww lock in gvt scheduler.
2021-10-01io_uring: kill fasyncPavel Begunkov1-15/+2
We have never supported fasync properly, it would only fire when there is something polling io_uring making it useless. The original support came in through the initial io_uring merge for 5.1. Since it's broken and nobody has reported it, get rid of the fasync bits. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/2f7ca3d344d406d34fa6713824198915c41cea86.1633080236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2021-10-01Merge tag 'iommu-fixes-v5.15-rc3' of ↵Linus Torvalds2-24/+38
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Two fixes for the new Apple DART driver to fix a kernel panic and a stale data usage issue - Intel VT-d fix for how PCI device ids are printed * tag 'iommu-fixes-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/dart: Clear sid2group entry when a group is freed iommu/vt-d: Drop "0x" prefix from PCI bus & device addresses iommu/dart: Remove iommu_flush_ops
2021-10-01Merge tag 'exynos-drm-fixes-for-v5.15-rc4' of ↵Daniel Vetter9-31/+9
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes One cleanup - Use devm_platform_ioremap_resource() helper function instead of old one. Signed-off-by: Daniel Vetter <[email protected]> From: Inki Dae <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-10-01Merge tag 'amd-drm-fixes-5.15-2021-09-29' of ↵Daniel Vetter7-11/+53
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.15-2021-09-29: amdgpu: - gart pin count fix - eDP flicker fix - GFX9 MQD fix - Display fixes - Tiling flags fix for pre-GFX9 - SDMA resume fix for S0ix Signed-off-by: Daniel Vetter <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-10-01Merge tag 'drm-intel-fixes-2021-09-30' of ↵Daniel Vetter5-23/+14
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.15-rc4: - Fix GVT scheduler ww lock usage - Fix pdfdocs documentation build - Fix request early tracepoints - Fix an invalid warning from rps worker Signed-off-by: Daniel Vetter <[email protected]> From: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-10-01sched: Always inline is_percpu_thread()Peter Zijlstra1-1/+1
vmlinux.o: warning: objtool: check_preemption_disabled()+0x81: call to is_percpu_thread() leaves .noinstr.text section Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-10-01sched/fair: Null terminate buffer when updating tunable_scalingMel Gorman1-1/+7
This patch null-terminates the temporary buffer in sched_scaling_write() so kstrtouint() does not return failure and checks the value is valid. Before: $ cat /sys/kernel/debug/sched/tunable_scaling 1 $ echo 0 > /sys/kernel/debug/sched/tunable_scaling -bash: echo: write error: Invalid argument $ cat /sys/kernel/debug/sched/tunable_scaling 1 After: $ cat /sys/kernel/debug/sched/tunable_scaling 1 $ echo 0 > /sys/kernel/debug/sched/tunable_scaling $ cat /sys/kernel/debug/sched/tunable_scaling 0 $ echo 3 > /sys/kernel/debug/sched/tunable_scaling -bash: echo: write error: Invalid argument Fixes: 8a99b6833c88 ("sched: Move SCHED_DEBUG sysctl to debugfs") Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Vincent Guittot <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-01sched/fair: Add ancestors of unthrottled undecayed cfs_rqMichal Koutný1-1/+5
Since commit a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle") we add cfs_rqs with no runnable tasks but not fully decayed into the load (leaf) list. We may ignore adding some ancestors and therefore breaking tmp_alone_branch invariant. This broke LTP test cfs_bandwidth01 and it was partially fixed in commit fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling"). I noticed the named test still fails even with the fix (but with low probability, 1 in ~1000 executions of the test). The reason is when bailing out of unthrottle_cfs_rq early, we may miss adding ancestors of the unthrottled cfs_rq, thus, not joining tmp_alone_branch properly. Fix this by adding ancestors if we notice the unthrottled cfs_rq was added to the load list. Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle") Signed-off-by: Michal Koutný <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Reviewed-by: Odin Ugedal <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-01perf/core: fix userpage->time_enabled of inactive eventsSong Liu2-5/+33
Users of rdpmc rely on the mmapped user page to calculate accurate time_enabled. Currently, userpage->time_enabled is only updated when the event is added to the pmu. As a result, inactive event (due to counter multiplexing) does not have accurate userpage->time_enabled. This can be reproduced with something like: /* open 20 task perf_event "cycles", to create multiplexing */ fd = perf_event_open(); /* open task perf_event "cycles" */ userpage = mmap(fd); /* use mmap and rdmpc */ while (true) { time_enabled_mmap = xxx; /* use logic in perf_event_mmap_page */ time_enabled_read = read(fd).time_enabled; if (time_enabled_mmap > time_enabled_read) BUG(); } Fix this by updating userpage for inactive events in merge_sched_in. Suggested-by: Peter Zijlstra (Intel) <[email protected]> Reported-and-tested-by: Lucian Grijincu <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-10-01perf/x86/intel: Update event constraints for ICXKan Liang1-0/+1
According to the latest event list, the event encoding 0xEF is only available on the first 4 counters. Add it into the event constraints table. Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support") Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-10-01perf/x86: Reset destroy callback on event init failureAnand K Mistry1-0/+1
perf_init_event tries multiple init callbacks and does not reset the event state between tries. When x86_pmu_event_init runs, it unconditionally sets the destroy callback to hw_perf_event_destroy. On the next init attempt after x86_pmu_event_init, in perf_try_init_event, if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy callback will be run. However, if the next init didn't set the destroy callback, hw_perf_event_destroy will be run (since the callback wasn't reset). Looking at other pmu init functions, the common pattern is to only set the destroy callback on a successful init. Resetting the callback on failure tries to replicate that pattern. This was discovered after commit f11dd0d80555 ("perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second) run of the perf tool after a reboot results in 0 samples being generated. The extra run of hw_perf_event_destroy results in active_events having an extra decrement on each perf run. The second run has active_events == 0 and every subsequent run has active_events < 0. When active_events == 0, the NMI handler will early-out and not record any samples. Signed-off-by: Anand K Mistry <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e325e40faf0@changeid
2021-10-01objtool: Teach get_alt_entry() about more relocation typesPeter Zijlstra1-7/+25
Occasionally objtool encounters symbol (as opposed to section) relocations in .altinstructions. Typically they are the alternatives written by elf_add_alternative() as encountered on a noinstr validation run on vmlinux after having already ran objtool on the individual .o files. Basically this is the counterpart of commit 44f6a7c0755d ("objtool: Fix seg fault with Clang non-section symbols"), because when these new assemblers (binutils now also does this) strip the section symbols, elf_add_reloc_to_insn() is forced to emit symbol based relocations. As such, teach get_alt_entry() about different relocation types. Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls") Reported-by: Stephen Rothwell <[email protected]> Reported-by: Borislav Petkov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-01KVM: x86: only allocate gfn_track when necessaryDavid Stevens5-8/+93
Avoid allocating the gfn_track arrays if nothing needs them. If there are no external to KVM users of the API (i.e. no GVT-g), then page tracking is only needed for shadow page tables. This means that when tdp is enabled and there are no external users, then the gfn_track arrays can be lazily allocated when the shadow MMU is actually used. This avoid allocations equal to .05% of guest memory when nested virtualization is not used, if the kernel is compiled without GVT-g. Signed-off-by: David Stevens <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86: add config for non-kvm users of page trackingDavid Stevens2-0/+4
Add a config option that allows kvm to determine whether or not there are any external users of page tracking. Signed-off-by: David Stevens <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01nSVM: Check for reserved encodings of TLB_CONTROL in nested VMCBKrish Sadhukhan1-0/+15
According to section "TLB Flush" in APM vol 2, "Support for TLB_CONTROL commands other than the first two, is optional and is indicated by CPUID Fn8000_000A_EDX[FlushByAsid]. All encodings of TLB_CONTROL not defined in the APM are reserved." Signed-off-by: Krish Sadhukhan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01kvm: use kvfree() in kvm_arch_free_vm()Juergen Gross5-11/+11
By switching from kfree() to kvfree() in kvm_arch_free_vm() Arm64 can use the common variant. This can be accomplished by adding another macro __KVM_HAVE_ARCH_VM_FREE, which will be used only by x86 for now. Further simplification can be achieved by adding __kvm_arch_free_vm() doing the common part. Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86: Expose Predictive Store Forwarding DisableBabu Moger1-1/+9
Predictive Store Forwarding: AMD Zen3 processors feature a new technology called Predictive Store Forwarding (PSF). PSF is a hardware-based micro-architectural optimization designed to improve the performance of code execution by predicting address dependencies between loads and stores. How PSF works: It is very common for a CPU to execute a load instruction to an address that was recently written by a store. Modern CPUs implement a technique known as Store-To-Load-Forwarding (STLF) to improve performance in such cases. With STLF, data from the store is forwarded directly to the load without having to wait for it to be written to memory. In a typical CPU, STLF occurs after the address of both the load and store are calculated and determined to match. PSF expands on this by speculating on the relationship between loads and stores without waiting for the address calculation to complete. With PSF, the CPU learns over time the relationship between loads and stores. If STLF typically occurs between a particular store and load, the CPU will remember this. In typical code, PSF provides a performance benefit by speculating on the load result and allowing later instructions to begin execution sooner than they otherwise would be able to. The details of security analysis of AMD predictive store forwarding is documented here. https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf Predictive Store Forwarding controls: There are two hardware control bits which influence the PSF feature: - MSR 48h bit 2 – Speculative Store Bypass (SSBD) - MSR 48h bit 7 – Predictive Store Forwarding Disable (PSFD) The PSF feature is disabled if either of these bits are set. These bits are controllable on a per-thread basis in an SMT system. By default, both SSBD and PSFD are 0 meaning that the speculation features are enabled. While the SSBD bit disables PSF and speculative store bypass, PSFD only disables PSF. PSFD may be desirable for software which is concerned with the speculative behavior of PSF but desires a smaller performance impact than setting SSBD. Support for PSFD is indicated in CPUID Fn8000_0008 EBX[28]. All processors that support PSF will also support PSFD. Linux kernel does not have the interface to enable/disable PSFD yet. Plan here is to expose the PSFD technology to KVM so that the guest kernel can make use of it if they wish to. Signed-off-by: Babu Moger <[email protected]> Message-Id: <163244601049.30292.5855870305350227855.stgit@bmoger-ubuntu> [Keep feature private to KVM, as requested by Borislav Petkov. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86/mmu: Avoid memslot lookup in make_spte and mmu_try_to_unsync_pagesDavid Matlack8-24/+21
mmu_try_to_unsync_pages checks if page tracking is active for the given gfn, which requires knowing the memslot. We can pass down the memslot via make_spte to avoid this lookup. The memslot is also handy for make_spte's marking of the gfn as dirty: we can test whether dirty page tracking is enabled, and if so ensure that pages are mapped as writable with 4K granularity. Apart from the warning, no functional change is intended. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86/mmu: Avoid memslot lookup in rmap_addDavid Matlack2-23/+16
Avoid the memslot lookup in rmap_add, by passing it down from the fault handling code to mmu_set_spte and then to rmap_add. No functional change intended. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: pass struct kvm_page_fault to mmu_set_sptePaolo Bonzini2-17/+13
mmu_set_spte is called for either PTE prefetching or page faults. The three boolean arguments write_fault, speculative and host_writable are always respectively false/true/true for prefetching and coming from a struct kvm_page_fault for page faults. Let mmu_set_spte distinguish these two situation by accepting a possibly NULL struct kvm_page_fault argument. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: pass kvm_mmu_page struct to make_sptePaolo Bonzini5-16/+18
The level and A/D bit support of the new SPTE can be found in the role, which is stored in the kvm_mmu_page struct. This merges two arguments into one. For the TDP MMU, the kvm_mmu_page was not used (kvm_tdp_mmu_map does not use it if the SPTE is already present) so we fetch it just before calling make_spte. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: set ad_disabled in TDP MMU rolePaolo Bonzini1-0/+1
Prepare for removing the ad_disabled argument of make_spte; instead it can be found in the role of a struct kvm_mmu_page. First of all, the TDP MMU must set the role accurately. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: remove unnecessary argument to mmu_set_sptePaolo Bonzini2-6/+7
The level of the new SPTE can be found in the kvm_mmu_page struct; there is no need to pass it down. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: clean up make_spte return valuePaolo Bonzini5-22/+12
Now that make_spte is called directly by the shadow MMU (rather than wrapped by set_spte), it only has to return one boolean value. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: inline set_spte in FNAME(sync_page)Paolo Bonzini2-30/+12
Since the two callers of set_spte do different things with the results, inlining it actually makes the code simpler to reason about. For example, FNAME(sync_page) already has a struct kvm_mmu_page *, but set_spte had to fish it back out of sptep's private page data. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: inline set_spte in mmu_set_sptePaolo Bonzini1-15/+16
Since the two callers of set_spte do different things with the results, inlining it actually makes the code simpler to reason about. For example, mmu_set_spte looks quite like tdp_mmu_map_handle_target_level, but the similarity is hidden by set_spte. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_trackDavid Matlack3-8/+16
Now that kvm_page_fault has a pointer to the memslot it can be passed down to the page tracking code to avoid a redundant slot lookup. No functional change intended. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86/mmu: Pass the memslot around via struct kvm_page_faultDavid Matlack4-23/+20
The memslot for the faulting gfn is used throughout the page fault handling code, so capture it in kvm_page_fault as soon as we know the gfn and use it in the page fault handling code that has direct access to the kvm_page_fault struct. Replace various tests using is_noslot_pfn with more direct tests on fault->slot being NULL. This, in combination with the subsequent patch, improves "Populate memory time" in dirty_log_perf_test by 5% when using the legacy MMU. There is no discerable improvement to the performance of the TDP MMU. No functional change intended. Suggested-by: Ben Gardon <[email protected]> Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: unify tdp_mmu_map_set_spte_atomic and ↵Paolo Bonzini1-30/+10
tdp_mmu_set_spte_atomic_no_dirty_log tdp_mmu_map_set_spte_atomic is not taking care of dirty logging anymore, the only difference that remains is that it takes a vCPU instead of the struct kvm. Merge the two functions. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: MMU: mark page dirty in make_sptePaolo Bonzini3-23/+4
This simplifies set_spte, which we want to remove, and unifies code between the shadow MMU and the TDP MMU. The warning will be added back later to make_spte as well. There is a small disadvantage in the TDP MMU; it may unnecessarily mark a page as dirty twice if two vCPUs end up mapping the same page twice. However, this is a very small cost for a case that is already rare. Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86/mmu: Fold rmap_recycle into rmap_addDavid Matlack1-26/+14
Consolidate rmap_recycle and rmap_add into a single function since they are only ever called together (and only from one place). This has a nice side effect of eliminating an extra kvm_vcpu_gfn_to_memslot(). In addition it makes mmu_set_spte(), which is a very long function, a little shorter. No functional change intended. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-01KVM: x86/mmu: Verify shadow walk doesn't terminate early in page faultsSean Christopherson2-2/+8
WARN and bail if the shadow walk for faulting in a SPTE terminates early, i.e. doesn't reach the expected level because the walk encountered a terminal SPTE. The shadow walks for page faults are subtle in that they install non-leaf SPTEs (zapping leaf SPTEs if necessary!) in the loop body, and consume the newly created non-leaf SPTE in the loop control, e.g. __shadow_walk_next(). In other words, the walks guarantee that the walk will stop if and only if the target level is reached by installing non-leaf SPTEs to guarantee the walk remains valid. Opportunistically use fault->goal-level instead of it.level in FNAME(fetch) to further clarify that KVM always installs the leaf SPTE at the target level. Reviewed-by: Lai Jiangshan <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Lai Jiangshan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>