aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-11-18KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cacheDavid Woodhouse3-101/+12
In commit 7e2175ebd695 ("KVM: x86: Fix recording of guest steal time / preempted status") I removed the only user of these functions because it was basically impossible to use them safely. There are two stages to the GFN->PFN mapping; first through the KVM memslots to a userspace HVA and then through the page tables to translate that HVA to an underlying PFN. Invalidations of the former were being handled correctly, but no attempt was made to use the MMU notifiers to invalidate the cache when the HVA->GFN mapping changed. As a prelude to reinventing the gfn_to_pfn_cache with more usable semantics, rip it out entirely and untangle the implementation of the unsafe kvm_vcpu_map()/kvm_vcpu_unmap() functions from it. All current users of kvm_vcpu_map() also look broken right now, and will be dealt with separately. They broadly fall into two classes: * Those which map, access the data and immediately unmap. This is mostly gratuitous and could just as well use the existing user HVA, and could probably benefit from a gfn_to_hva_cache as they do so. * Those which keep the mapping around for a longer time, perhaps even using the PFN directly from the guest. These will need to be converted to the new gfn_to_pfn_cache and then kvm_vcpu_map() can be removed too. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: Use a gfn_to_hva_cache for vmptrldDavid Woodhouse2-9/+22
And thus another call to kvm_vcpu_map() can die. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS checkDavid Woodhouse1-11/+15
Kill another mostly gratuitous kvm_vcpu_map() which could just use the userspace HVA for it. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86/xen: Use sizeof_field() instead of open-coding itDavid Woodhouse1-9/+9
Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12David Woodhouse2-9/+20
Using kvm_vcpu_map() for reading from the guest is entirely gratuitous, when all we do is a single memcpy and unmap it again. Fix it up to use kvm_read_guest()... but in fact I couldn't bring myself to do that without also making it use a gfn_to_hva_cache for both that *and* the copy in the other direction. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFODavid Woodhouse1-1/+1
In commit 319afe68567b ("KVM: xen: do not use struct gfn_to_hva_cache") we stopped storing this in-kernel as a GPA, and started storing it as a GFN. Which means we probably should have stopped calling gpa_to_gfn() on it when userspace asks for it back. Cc: [email protected] Fixes: 319afe68567b ("KVM: xen: do not use struct gfn_to_hva_cache") Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86/mmu: include EFER.LMA in extended mmu roleMaxim Levitsky2-0/+2
Incorporate EFER.LMA into kvm_mmu_extended_role, as it used to compute the guest root level and is not reflected in kvm_mmu_page_role.level when TDP is in use. When simply running the guest, it is impossible for EFER.LMA and kvm_mmu.root_level to get out of sync, as the guest cannot transition from PAE paging to 64-bit paging without toggling CR0.PG, i.e. without first bouncing through a different MMU context. And stuffing guest state via KVM_SET_SREGS{,2} also ensures a full MMU context reset. However, if KVM_SET_SREGS{,2} is followed by KVM_SET_NESTED_STATE, e.g. to set guest state when migrating the VM while L2 is active, the vCPU state will reflect L2, not L1. If L1 is using TDP for L2, then root_mmu will have been configured using L2's state, despite not being used for L2. If L2.EFER.LMA != L1.EFER.LMA, and L2 is using PAE paging, then root_mmu will be configured for guest PAE paging, but will match the mmu_role for 64-bit paging and cause KVM to not reconfigure root_mmu on the next nested VM-Exit. Alternatively, the root_mmu's role could be invalidated after a successful KVM_SET_NESTED_STATE that yields vcpu->arch.mmu != vcpu->arch.root_mmu, i.e. that switches the active mmu to guest_mmu, but doing so is unnecessarily tricky, and not even needed if L1 and L2 do have the same role (e.g., they are both 64-bit guests and run with the same CR4). Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested ↵Maxim Levitsky1-5/+17
state load When loading nested state, don't use check vcpu->arch.efer to get the L1 host's 64-bit vs. 32-bit state and don't check it for consistency with respect to VM_EXIT_HOST_ADDR_SPACE_SIZE, as register state in vCPU may be stale when KVM_SET_NESTED_STATE is called---and architecturally does not exist. When restoring L2 state in KVM, the CPU is placed in non-root where nested VMX code has no snapshot of L1 host state: VMX (conditionally) loads host state fields loaded on VM-exit, but they need not correspond to the state before entry. A simple case occurs in KVM itself, where the host RIP field points to vmx_vmexit rather than the instruction following vmlaunch/vmresume. However, for the particular case of L1 being in 32- or 64-bit mode on entry, the exit controls can be treated instead as the source of truth regarding the state of L1 on entry, and can be used to check that vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE matches vmcs12.HOST_EFER if vmcs12.VM_EXIT_LOAD_IA32_EFER is set. The consistency check on CPU EFER vs. vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE, instead, happens only on VM-Enter. That's because, again, there's conceptually no "current" L1 EFER to check on KVM_SET_NESTED_STATE. Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: Fix steal time asm constraintsDavid Woodhouse1-3/+3
In 64-bit mode, x86 instruction encoding allows us to use the low 8 bits of any GPR as an 8-bit operand. In 32-bit mode, however, we can only use the [abcd] registers. For which, GCC has the "q" constraint instead of the less restrictive "r". Also fix st->preempted, which is an input/output operand rather than an input. Fixes: 7e2175ebd695 ("KVM: x86: Fix recording of guest steal time / preempted status") Reported-by: kernel test robot <[email protected]> Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18cpuid: kvm_find_kvm_cpuid_features() should be declared 'static'Paul Durrant1-1/+1
The lack a static declaration currently results in: arch/x86/kvm/cpuid.c:128:26: warning: no previous prototype for function 'kvm_find_kvm_cpuid_features' when compiling with "W=1". Reported-by: kernel test robot <[email protected]> Fixes: 760849b1476c ("KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES") Signed-off-by: Paul Durrant <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18ata: libata-sata: Declare ata_ncq_sdev_attrs staticDamien Le Moal1-1/+1
Since ata_ncq_sdev_attrs is a local struct, declare it static. This avoids a sparse warning at compile time. Signed-off-by: Damien Le Moal <[email protected]>
2021-11-18ata: libahci: Adjust behavior when StorageD3Enable _DSD is setMario Limonciello1-0/+15
The StorageD3Enable _DSD is used for the vendor to indicate that the disk should be opted into or out of a different behavior based upon the platform design. For AMD's Renoir and Green Sardine platforms it's important that any attached SATA storage has transitioned into DevSlp when s2idle is used. If the disk is left in active/partial/slumber, then the system is not able to resume properly. When the StorageD3Enable _DSD is detected, check the system is using s2idle and DevSlp is enabled and if so explicitly wait long enough for the disk to enter DevSlp. Cc: Nehal-bakulchandra Shah <[email protected]> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091 Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2021-11-18ata: ahci: Add Green Sardine vendor ID as board_ahci_mobileMario Limonciello1-0/+1
AMD requires that the SATA controller be configured for devsleep in order for S0i3 entry to work properly. commit b1a9585cc396 ("ata: ahci: Enable DEVSLP by default on x86 with SLP_S0") sets up a kernel policy to enable devsleep on Intel mobile platforms that are using s0ix. Add the PCI ID for the SATA controller in Green Sardine platforms to extend this policy by default for AMD based systems using s0i3 as well. Cc: Nehal-bakulchandra Shah <[email protected]> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091 Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2021-11-18ata: libata: add missing ata_identify_page_supported() callsDamien Le Moal1-1/+5
ata_dev_config_ncq_prio() and ata_dev_config_devslp() both access pages of the IDENTIFY DEVICE data log. Before calling ata_read_log_page(), make sure to check for the existence of the IDENTIFY DEVICE data log and of the log page accessed using ata_identify_page_supported(). This avoids useless error messages from ata_read_log_page() and failures with some LLDD scsi drivers using libsas. Reported-by: Nikolay <[email protected]> Cc: [email protected] # 5.15 Signed-off-by: Damien Le Moal <[email protected]> Tested-by: Matthew Perkowski <[email protected]>
2021-11-17octeontx2-af: debugfs: don't corrupt user memoryDan Carpenter1-7/+10
The user supplies the "count" value to say how big its read buffer is. The rvu_dbg_lmtst_map_table_display() function does not take the "count" into account but instead just copies the whole table, potentially corrupting the user's data. Introduce the "ret" variable to store how many bytes we can copy. Also I changed the type of "off" to size_t to make using min() simpler. Fixes: 0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") Signed-off-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/20211117073454.GD5237@kili Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-17NFC: add NCI_UNREG flag to eliminate the raceLin Ma2-2/+18
There are two sites that calls queue_work() after the destroy_workqueue() and lead to possible UAF. The first site is nci_send_cmd(), which can happen after the nci_close_device as below nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | flush_workqueue | del_timer_sync | nci_unregister_device | nfc_get_device destroy_workqueue | nfc_dev_up nfc_unregister_device | nci_dev_up device_del | nci_open_device | __nci_request | nci_send_cmd | queue_work !!! Another site is nci_cmd_timer, awaked by the nci_cmd_work from the nci_send_cmd. ... | ... nci_unregister_device | queue_work destroy_workqueue | nfc_unregister_device | ... device_del | nci_cmd_work | mod_timer | ... | nci_cmd_timer | queue_work !!! For the above two UAF, the root cause is that the nfc_dev_up can race between the nci_unregister_device routine. Therefore, this patch introduce NCI_UNREG flag to easily eliminate the possible race. In addition, the mutex_lock in nci_close_device can act as a barrier. Signed-off-by: Lin Ma <[email protected]> Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-17NFC: reorder the logic in nfc_{un,}register_deviceLin Ma1-14/+18
There is a potential UAF between the unregistration routine and the NFC netlink operations. The race that cause that UAF can be shown as below: (FREE) | (USE) nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | nci_unregister_device | nfc_get_device nfc_unregister_device | nfc_dev_up rfkill_destory | device_del | rfkill_blocked ... | ... The root cause for this race is concluded below: 1. The rfkill_blocked (USE) in nfc_dev_up is supposed to be placed after the device_is_registered check. 2. Since the netlink operations are possible just after the device_add in nfc_register_device, the nfc_dev_up() can happen anywhere during the rfkill creation process, which leads to data race. This patch reorder these actions to permit 1. Once device_del is finished, the nfc_dev_up cannot dereference the rfkill object. 2. The rfkill_register need to be placed after the device_add of nfc_dev because the parent device need to be created first. So this patch keeps the order but inject device_lock to prevent the data race. Signed-off-by: Lin Ma <[email protected]> Fixes: be055b2f89b5 ("NFC: RFKILL support") Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-17NFC: reorganize the functions in nci_requestLin Ma1-4/+7
There is a possible data race as shown below: thread-A in nci_request() | thread-B in nci_close_device() | mutex_lock(&ndev->req_lock); test_bit(NCI_UP, &ndev->flags); | ... | test_and_clear_bit(NCI_UP, &ndev->flags) mutex_lock(&ndev->req_lock); | | This race will allow __nci_request() to be awaked while the device is getting removed. Similar to commit e2cb6b891ad2 ("bluetooth: eliminate the potential race condition when removing the HCI controller"). this patch alters the function sequence in nci_request() to prevent the data races between the nci_close_device(). Signed-off-by: Lin Ma <[email protected]> Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-17drm/amd/amdgpu: fix potential memleakBernard Zhao1-0/+1
In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed There is a potential memleak if not call kobject_put. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Bernard Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered againshaoyunl1-0/+5
In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch already been called, the start_cpsch will not be called since there is no resume in this case. When reset been triggered again, driver should avoid to do uninitialization again. Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17tipc: check for null after calling kmemdupTadeusz Struk1-0/+4
kmemdup can return a null pointer so need to check for it, otherwise the null key will be dereferenced later in tipc_crypto_key_xmit as can be seen in the trace [1]. Cc: [email protected] Cc: [email protected] # 5.15, 5.14, 5.10 [1] https://syzkaller.appspot.com/bug?id=bca180abb29567b189efdbdb34cbf7ba851c2a58 Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Tadeusz Struk <[email protected]> Acked-by: Ying Xue <[email protected]> Acked-by: Jon Maloy <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-17drm/amd/pm: add GFXCLK/SCLK clocks level print support for APUsPerry Yuan4-2/+74
add support that allow the userspace tool like RGP to get the GFX clock value at runtime, the fix follow the old way to show the min/current/max clocks level for compatible consideration. === Test === $ cat /sys/class/drm/card0/device/pp_dpm_sclk 0: 200Mhz * 1: 1100Mhz 2: 1600Mhz then run stress test on one APU system. $ cat /sys/class/drm/card0/device/pp_dpm_sclk 0: 200Mhz 1: 1040Mhz * 2: 1600Mhz The current GFXCLK value is updated at runtime. BugLink: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5260 Reviewed-by: Huang Ray <[email protected]> Signed-off-by: Perry Yuan <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-11-17drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga ↵hongao1-0/+1
and dvi connectors amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode which assign amdgpu_encoder->native_mode with *preferred_mode result in amdgpu_encoder->native_mode.clock always be 0. That will cause amdgpu_connector_set_property returned early on: if ((rmx_type != DRM_MODE_SCALE_NONE) && (amdgpu_encoder->native_mode.clock == 0)) when we try to set scaling mode Full/Full aspect/Center. Add the missing function to amdgpu_connector_vga_get_mode can fix this. It also works on dvi connectors because amdgpu_connector_dvi_helper_funcs.get_mode use the same method. Signed-off-by: hongao <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-11-17drm/amd/display: Fix OLED brightness control on eDPRoman Li1-1/+2
[Why] After commit ("drm/amdgpu/display: add support for multiple backlights") number of eDPs is defined while registering backlight device. However the panel's extended caps get updated once before register call. That leads to regression with extended caps like oled brightness control. [How] Update connector ext caps after register_backlight_device Fixes: 7fd13baeb7a3a4 ("drm/amdgpu/display: add support for multiple backlights") Link: https://www.reddit.com/r/AMDLaptops/comments/qst0fm/after_updating_to_linux_515_my_brightness/ Signed-off-by: Roman Li <[email protected]> Tested-by: Samuel Čavoj <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Jasdeep Dhillon <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-11-17i40e: Fix display error code in dmesgGrzegorz Szczurek1-3/+2
Fix misleading display error in dmesg if tc filter return fail. Only i40e status error code should be converted to string, not linux error code. Otherwise, we return false information about the error. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Dave Switzer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix creation of first queue by omitting it if is not power of twoJedrzej Jagielski1-40/+19
Reject TCs creation with proper message if the first queue assignment is not equal to the power of two. The first queue number was checked too late in the second queue iteration, if second queue was configured at all. Now if first queue value is not a power of two, then trying to create qdisc will be rejected. Fixes: 8f88b3034db3 ("i40e: Add infrastructure for queue channel support") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix warning message and call stack during rmmod i40e driverKaren Sornek1-32/+21
Restore part of reset functionality used when reset is called from the VF to reset itself. Without this fix warning message is displayed when VF is being removed via sysfs. Fix the crash of the VF during reset by ensuring that the PF receives the reset message successfully. Refactor code to use one function instead of two. Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Karen Sornek <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17Merge tag 'gfs2-v5.16-rc2-fixes' of ↵Linus Torvalds4-19/+18
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: - The current iomap_file_buffered_write behavior of failing the entire write when part of the user buffer cannot be faulted in leads to an endless loop in gfs2. Work around that in gfs2 for now. - Various other bugs all over the place. * tag 'gfs2-v5.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Prevent endless loops in gfs2_file_buffered_write gfs2: Fix "Introduce flag for glock holder auto-demotion" gfs2: Fix length of holes reported at end-of-file gfs2: release iopen glock early in evict gfs2: Fix atomic bug in gfs2_instantiate gfs2: Only dereference i->iov when iter_is_iovec(i)
2021-11-17Merge tag 'mips-fixes_5.16_1' of ↵Linus Torvalds6-1/+16
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - wire futex_waitv syscall - build fixes for lantiq and bcm63xx configs - yamon-dt bugfix * tag 'mips-fixes_5.16_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: lantiq: add support for clk_get_parent() mips: bcm63xx: add support for clk_get_parent() MIPS: generic/yamon-dt: fix uninitialized variable error MIPS: syscalls: Wire up futex_waitv syscall
2021-11-17drm/amd/pm: Remove artificial freq level on Navi1xLijo Lazar1-5/+8
Print Navi1x fine grained clocks in a consistent manner with other SOCs. Don't show aritificial DPM level when the current clock equals min or max. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17drm/amd/pm: avoid duplicate powergate/ungate settingEvan Quan4-1/+23
Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/[email protected]/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Fixes: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Evan Quan <[email protected]> Tested-by: Borislav Petkov <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-11-17drm/amdgpu: add error print when failing to add IP block(v2)Guchun Chen1-0/+36
Driver initialization is driven by IP version from IP discovery table. So add error print when failing to add ip block during driver initialization, this will be more friendly to user to know which IP version is not correct. [ 40.467361] [drm] host supports REQ_INIT_DATA handshake [ 40.474076] [drm] add ip block number 0 <nv_common> [ 40.474090] [drm] add ip block number 1 <gmc_v10_0> [ 40.474101] [drm] add ip block number 2 <psp> [ 40.474103] [drm] add ip block number 3 <navi10_ih> [ 40.474114] [drm] add ip block number 4 <smu> [ 40.474119] [drm] add ip block number 5 <amdgpu_vkms> [ 40.474134] [drm] add ip block number 6 <gfx_v10_0> [ 40.474143] [drm] add ip block number 7 <sdma_v5_2> [ 40.474147] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init [ 40.474545] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device. v2: use dev_err to multi-GPU system Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17drm/amd/pm: Enhanced reporting also for a stuck commandLuben Tuikov1-2/+6
Also print the message index and parameter of the stuck command. Cc: Alex Deucher <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17RDMA/nldev: Check stat attribute before accessing itLeon Romanovsky1-1/+2
The access to non-existent netlink attribute causes to the following kernel panic. Fix it by checking existence before trying to read it. general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 6744 Comm: syz-executor.0 Not tainted 5.15.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:nla_get_u32 include/net/netlink.h:1554 [inline] RIP: 0010:nldev_stat_set_mode_doit drivers/infiniband/core/nldev.c:1909 [inline] RIP: 0010:nldev_stat_set_doit+0x578/0x10d0 drivers/infiniband/core/nldev.c:2040 Code: fa 4c 8b a4 24 f8 02 00 00 48 b8 00 00 00 00 00 fc ff df c7 84 24 80 00 00 00 00 00 00 00 49 8d 7c 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 02 RSP: 0018:ffffc90004acf2e8 EFLAGS: 00010247 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90002b94000 RDX: 0000000000000000 RSI: ffffffff8684c5ff RDI: 0000000000000004 RBP: ffff88807cda4000 R08: 0000000000000000 R09: ffff888023fb8027 R10: ffffffff8684c5d7 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000001 R14: ffff888041024280 R15: ffff888031ade780 FS: 00007eff9dddd700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2ef24000 CR3: 0000000036902000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x86d/0xda0 net/netlink/af_netlink.c:1916 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 822cf785ac6d ("RDMA/nldev: Split nldev_stat_set_mode_doit out of nldev_stat_set_doit") Link: https://lore.kernel.org/r/b21967c366f076ff1988862f9c8a1aa0244c599f.1637151999.git.leonro@nvidia.com Reported-by: [email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-11-17RDMA/mlx4: Do not fail the registration on port statsJack Wang1-3/+15
If the FW doesn't support MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT, mlx4 driver will fail the ib_setup_port_attrs, which is called from ib_register_device()/enable_device_and_get(), in the end leads to device not detected[1][2] To fix it, add a new mlx4_ib_hw_stats_ops1, w/o alloc_hw_port_stats if FW does not support MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT. [1] https://bugzilla.redhat.com/show_bug.cgi?id=2014094 [2] https://lore.kernel.org/linux-rdma/CAMGffEn2wvEnmzc0xe=xYiCLqpphiHDBxCxqAELrBofbUAMQxw@mail.gmail.com Fixes: 4b5f4d3fb408 ("RDMA: Split the alloc_hw_stats() ops to port and device variants") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jack Wang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-11-17Merge tag 'hyperv-fixes-signed-20211117' of ↵Linus Torvalds3-14/+20
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Fix ring size calculation for balloon driver (Boqun Feng) - Fix issues in Hyper-V setup code (Sean Christopherson) * tag 'hyperv-fixes-signed-20211117' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Move required MSRs check to initial platform probing x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails Drivers: hv: balloon: Use VMBUS_RING_SIZE() wrapper for dm_ring_size
2021-11-17Merge tag 'nfsd-5.16-1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-5/+2
Pull nfsd bugfix from Bruce Fields: "This is just one bugfix for a buffer overflow in knfsd's xdr decoding" * tag 'nfsd-5.16-1' of git://linux-nfs.org/~bfields/linux: NFSD: Fix exposure in nfsd4_decode_bitmap()
2021-11-17Revert "ACPI: scan: Release PM resources blocked by unused objects"Rafael J. Wysocki3-32/+0
Revert commit c10383e8ddf4 ("ACPI: scan: Release PM resources blocked by unused objects"), because it causes boot issues to appear on some platforms. Reported-by: Kyle D. Pelton <[email protected]> Reported-by: Saranya Gopal <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2021-11-17i40e: Fix ping is lost after configuring ADq on VFEryk Rybak3-8/+74
Properly reconfigure VF VSIs after VF request ADQ. Created new function to update queue mapping and queue pairs per TC with AQ update VSI. This sets proper RSS size on NIC. VFs num_queue_pairs should not be changed during setup of queue maps. Previously, VF main VSI in ADQ had configured too many queues and had wrong RSS size, which lead to packets not being consumed and drops in connectivity. Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use") Co-developed-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Eryk Rybak <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix changing previously set num_queue_pairs for PFsEryk Rybak1-12/+23
Currently, the i40e_vsi_setup_queue_map is basing the count of queues in TCs on a VSI's alloc_queue_pairs member which is not changed throughout any user's action (for example via ethtool's set_channels callback). This implies that vsi->tc_config.tc_info[n].qcount value that is given to the kernel via netdev_set_tc_queue() that notifies about the count of queues per particular traffic class is constant even if user has changed the total count of queues. This in turn caused the kernel warning after setting the queue count to the lower value than the initial one: $ ethtool -l ens801f0 Channel parameters for ens801f0: Pre-set maximums: RX: 0 TX: 0 Other: 1 Combined: 64 Current hardware settings: RX: 0 TX: 0 Other: 1 Combined: 64 $ ethtool -L ens801f0 combined 40 [dmesg] Number of in use tx queues changed invalidating tc mappings. Priority traffic classification disabled! Reason was that vsi->alloc_queue_pairs stayed at 64 value which was used to set the qcount on TC0 (by default only TC0 exists so all of the existing queues are assigned to TC0). we update the offset/qcount via netdev_set_tc_queue() back to the old value but then the netif_set_real_num_tx_queues() is using the vsi->num_queue_pairs as a value which got set to 40. Fix it by using vsi->req_queue_pairs as a queue count that will be distributed across TCs. Do it only for non-zero values, which implies that user actually requested the new count of queues. For VSIs other than main, stay with the vsi->alloc_queue_pairs as we only allow manipulating the queue count on main VSI. Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use") Co-developed-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Co-developed-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Eryk Rybak <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix NULL ptr dereference on VSI filter syncMichal Maloszewski2-2/+4
Remove the reason of null pointer dereference in sync VSI filters. Added new I40E_VSI_RELEASING flag to signalize deleting and releasing of VSI resources to sync this thread with sync filters subtask. Without this patch it is possible to start update the VSI filter list after VSI is removed, that's causing a kernel oops. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Michal Maloszewski <[email protected]> Reviewed-by: Przemyslaw Patynowski <[email protected]> Reviewed-by: Witold Fijalkowski <[email protected]> Reviewed-by: Jaroslaw Gawin <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix correct max_pkt_size on VF RX queueEryk Rybak1-44/+9
Setting VLAN port increasing RX queue max_pkt_size by 4 bytes to take VLAN tag into account. Trigger the VF reset when setting port VLAN for VF to renegotiate its capabilities and reinitialize. Fixes: ba4e003d29c1 ("i40e: don't hold spinlock while resetting VF") Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Eryk Rybak <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17net: ax88796c: use bit numbers insetad of bit masksŁukasz Stelmach1-3/+3
Change the values of EVENT_* constants from bit masks to bit numbers as accepted by {clear,set,test}_bit() functions. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Łukasz Stelmach <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17net: virtio_net_hdr_to_skb: count transport header in UFOJonathan Davies1-1/+6
virtio_net_hdr_to_skb does not set the skb's gso_size and gso_type correctly for UFO packets received via virtio-net that are a little over the GSO size. This can lead to problems elsewhere in the networking stack, e.g. ovs_vport_send dropping over-sized packets if gso_size is not set. This is due to the comparison if (skb->len - p_off > gso_size) not properly accounting for the transport layer header. p_off includes the size of the transport layer header (thlen), so skb->len - p_off is the size of the TCP/UDP payload. gso_size is read from the virtio-net header. For UFO, fragmentation happens at the IP level so does not need to include the UDP header. Hence the calculation could be comparing a TCP/UDP payload length with an IP payload length, causing legitimate virtio-net packets to have lack gso_type/gso_size information. Example: a UDP packet with payload size 1473 has IP payload size 1481. If the guest used UFO, it is not fragmented and the virtio-net header's flags indicate that it is a GSO frame (VIRTIO_NET_HDR_GSO_UDP), with gso_size = 1480 for an MTU of 1500. skb->len will be 1515 and p_off will be 42, so skb->len - p_off = 1473. Hence the comparison fails, and shinfo->gso_size and gso_type are not set as they should be. Instead, add the UDP header length before comparing to gso_size when using UFO. In this way, it is the size of the IP payload that is compared to gso_size. Fixes: 6dd912f82680 ("net: check untrusted gso_size at kernel entry") Signed-off-by: Jonathan Davies <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17net: dpaa2-eth: fix use-after-free in dpaa2_eth_removePavel Skripkin1-2/+2
Access to netdev after free_netdev() will cause use-after-free bug. Move debug log before free_netdev() call to avoid it. Fixes: 7472dd9f6499 ("staging: fsl-dpaa2/eth: Move print message") Signed-off-by: Pavel Skripkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17net: usb: r8152: Add MAC passthrough support for more Lenovo DocksAaron Ma1-6/+3
Like ThinkaPad Thunderbolt 4 Dock, more Lenovo docks start to use the original Realtek USB ethernet chip ID 0bda:8153. Lenovo Docks always use their own IDs for usb hub, even for older Docks. If parent hub is from Lenovo, then r8152 should try MAC passthrough. Verified on Lenovo TBT3 dock too. Signed-off-by: Aaron Ma <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17drm/i915/guc: fix NULL vs IS_ERR() checkingDan Carpenter1-2/+2
The intel_engine_create_virtual() function does not return NULL. It returns error pointers. Fixes: e5e32171a2cf ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: John Harrison <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20211116114916.GB11936@kili (cherry picked from commit fc12b70d12d07598cde27cc17dbfafc2a2a33ff8) Signed-off-by: Rodrigo Vivi <[email protected]>
2021-11-17drm/i915/dsi/xelpd: Fix the bit mask for wakeup GBVandita Kulkarni2-2/+5
v2: Fix the typo, move out the hardcoding from macro(Jani, Ville) Fixes: f87c46c43175 ("drm/i915/dsi/xelpd: Add WA to program LP to HS wakeup guardband") Signed-off-by: Vandita Kulkarni <[email protected]> Reviewed-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 6f07707fa09e1dc58c431d57c25ef2e68b9bec47) Signed-off-by: Rodrigo Vivi <[email protected]>
2021-11-17Revert "drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping"Vandita Kulkarni1-8/+2
This reverts commit 991d9557b0c4 ("drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping"). The Bspec was updated recently with the pll ungate sequence similar to that of icl dsi enable sequence. Hence reverting. Bspec: 49187 Fixes: 991d9557b0c4 ("drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping") Cc: <[email protected]> # v5.4+ Signed-off-by: Vandita Kulkarni <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 4579509ef181480f4e4510d436c691519167c5c2) Signed-off-by: Rodrigo Vivi <[email protected]>
2021-11-17Documentation/process: fix a cross referenceMauro Carvalho Chehab1-2/+2
The cross-reference for the handbooks section works. However, it is meant to describe the path inside the Kernel's doc where the section is, but there's an space instead of a dash, plus it lacks the .rst at the end, which makes: ./scripts/documentation-file-ref-check to complain. Fixes: 604370e106cc ("Documentation/process: Add maintainer handbooks section") Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>