aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-11-06drm/radeon: fix si_enable_smc_cac() failed issueAlex Deucher1-0/+1
Need to set the dte flag on this asic. Port the fix from amdgpu: 5cb818b861be114 ("drm/amd/amdgpu: fix si_enable_smc_cac() failed issue") Reviewed-by: Yong Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu/renoir: move gfxoff handling into gfx9 moduleAlex Deucher2-5/+6
To properly handle the option parsing ordering. Reviewed-by: Yong Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: fix double reference droppingPan Bian1-4/+2
The reference to object fence is dropped at the end of the loop. However, it is dropped again outside the loop. The reference can be dropped immediately after calling dma_fence_wait() in the loop and thus the dropping operation outside the loop can be removed. Reviewed-by: Christian König <[email protected]> Signed-off-by: Pan Bian <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/powerplay: fix struct init in renoir_print_clk_levelsRaul E Rangel1-1/+3
drivers/gpu/drm/amd/powerplay/renoir_ppt.c:186:2: error: missing braces around initializer [-Werror=missing-braces] SmuMetrics_t metrics = {0}; ^ Fixes: 8b8031703bd7 ("drm/amd/powerplay: implement sysfs for getting dpm clock") Signed-off-by: Raul E Rangel <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: fix potential double drop fence referencePan Bian1-0/+2
The object fence is not set to NULL after its reference is dropped. As a result, its reference may be dropped again if error occurs after that, which may lead to a use after free bug. To avoid the issue, fence is explicitly set to NULL after dropping its reference. Acked-by: Christian König <[email protected]> Signed-off-by: Pan Bian <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: change read of GPU clock counter on Vega10 VFEric Huang1-3/+16
Using unified VBIOS has performance drop in sriov environment. The fix is switching to another register instead. Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9changzhu1-0/+7
It needs to add warning to update firmware in gfx9 in case that firmware is too old to have function to realize dummy read in cp firmware. Signed-off-by: changzhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10changzhu4-6/+64
The GRBM register interface is now capable of bursting 1 cycle per register wr->wr, wr->rd much faster than previous muticycle per transaction done interface. This has caused a problem where status registers requiring HW to update have a 1 cycle delay, due to the register update having to go through GRBM. For cp ucode, it has realized dummy read in cp firmware.It covers the use of WAIT_REG_MEM operation 1 case only.So it needs to call gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to update firmware in case firmware is too old to have function to realize dummy read in cp firmware. For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is moved to gfxhub in gfx10. So it needs to add dummy read in driver between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0. Signed-off-by: changzhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/powerplay: fix deadlock on setting power_dpm_force_performance_levelEvan Quan1-5/+14
smu_enable_umd_pstate() will try to get the smu->mutex which was already hold by its parent API smu_force_performance_level() on the call path. Thus deadlock happens. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: perform p-state switch after the whole hive initializedEvan Quan1-12/+35
P-state switch should be performed after all devices from the hive get initialized. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: fix possible pstate switch race conditionEvan Quan2-2/+35
Added lock protection so that the p-state switch will be guarded to be sequential. Also update the hive pstate only all device from the hive are in the same state. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: register gpu instance before fan boost feature enablmentEvan Quan2-1/+7
Otherwise, the feature enablement will be skipped due to wrong count. Fixes: beff74bc6e0fa91 ("drm/amdgpu: fix a race in GPU reset with IB test (v2)") Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/swSMU: fix smu workload bit map errorKevin Wang2-2/+2
fix workload bit (WORKLOAD_PPLIB_COMPUTE_BIT) map error on vega20 and navi asic. fix commit: drm/amd/powerplay: add function get_workload_type_map for swsmu Signed-off-by: Kevin Wang <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/powerplay: update Arcturus driver-smu interface headerEvan Quan2-2/+2
To fit the latest SMU firmware. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Le Ma <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: disallow direct upload save restore list from gfx driverHawking Zhang1-1/+2
Direct uploading save/restore list via mmio register writes breaks the security policy. Instead, the driver should pass s&r list to psp. For all the ASICs that use rlc v2_1 headers, the driver actually upload s&r list twice, in non-psp ucode front door loading phase and gfx pg initialization phase. The latter is not allowed. VG12 is the only exception where the driver still keeps legacy approach for S&R list uploading. In theory, this can be elimnated if we have valid srcntl ucode for VG12. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Candice Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/sched: Fix passing zero to 'PTR_ERR' warning v2Andrey Grodzovsky1-2/+5
Fix a static code checker warning. v2: Drop PTR_ERR_OR_ZERO. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Emily Deng <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu/discovery: Need to free discovery memoryEmily Deng1-3/+3
When unloading driver, need to free discovery memory. Signed-off-by: Emily Deng <[email protected]> Reviewed-by: Xiaojie Yuan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/powerplay: print the pptable providerXiaojie Yuan1-0/+2
So we know where the tables came from. Signed-off-by: Xiaojie Yuan <[email protected]> Reviewed-by: Kevin Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06Revert "drm/amd/display: setting the DIG_MODE to the correct value."Zhan Liu1-9/+0
This reverts commit 967a3b85bac91c55eff740e61bf270c2732f48b2. Reason for revert: Root cause of this issue is found. The workaround is not needed anymore. Signed-off-by: Zhan Liu <[email protected]> Reviewed-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/display: Add ENGINE_ID_DIGD condition check for Navi14Zhan Liu1-0/+5
[Why] Navi10 has 6 PHY, but Navi14 only has 5 PHY, that is because there is no ENGINE_ID_DIGD in Navi14. Without this patch, many HDMI related issues (e.g. HDMI S3 resume failure, HDMI pink screen on boot) will be observed. [How] If "eng_id" is larger than ENGINE_ID_DIGD, then add "eng_id" by 1. Signed-off-by: Zhan Liu <[email protected]> Reviewed-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: Show resolution correctly in mode validation debug outputNeil Mayhew1-1/+1
Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Neil Mayhew <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu/gpuvm: add some additional comments in amdgpu_vm_update_ptesAlex Deucher1-1/+9
To better clarify what is happening in this function. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: enable VCN DPG on Raven and Raven2Alex Deucher1-2/+6
It's safe to enable dynamic VCN powergating on raven and raven2 for increased power savings. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: add navi14 PCI IDTianci.Yin1-0/+1
Add the navi14 PCI device id. Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/powerplay: support xgmi pstate setting on powerplay routine V2Evan Quan6-4/+44
Add xgmi pstate setting on powerplay routine. V2: split the change of is_support_sw_smu_xgmi into a separate patch Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/powerplay: update is_sw_smu_xgmi checkEvan Quan1-1/+1
Add check for is_sw_smu routine and drop check for amdgpu_dpm which seems non-sense. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: change pstate only after all XGMI device initializedEvan Quan1-3/+12
Pstate settings should be performed after all device of the XGMI setup get initialized. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06HID: wacom: generic: Treat serial number and related fields as unsignedJason Gerecke2-4/+21
The HID descriptors for most Wacom devices oddly declare the serial number and other related fields as signed integers. When these numbers are ingested by the HID subsystem, they are automatically sign-extended into 32-bit integers. We treat the fields as unsigned elsewhere in the kernel and userspace, however, so this sign-extension causes problems. In particular, the sign-extended tool ID sent to userspace as ABS_MISC does not properly match unsigned IDs used by xf86-input-wacom and libwacom. We introduce a function 'wacom_s32tou' that can undo the automatic sign extension performed by 'hid_snto32'. We call this function when processing the serial number and related fields to ensure that we are dealing with and reporting the unsigned form. We opt to use this method rather than adding a descriptor fixup in 'wacom_hid_usage_quirk' since it should be more robust in the face of future devices. Ref: https://github.com/linuxwacom/input-wacom/issues/134 Fixes: f85c9dc678 ("HID: wacom: generic: Support tool ID and additional tool types") CC: <[email protected]> # v4.10+ Signed-off-by: Jason Gerecke <[email protected]> Reviewed-by: Aaron Armstrong Skomra <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2019-11-06drm/amdgpu: add navi14 PCI IDTianci.Yin1-0/+1
Add the navi14 PCI device id. Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06Revert "drm/amd/display: setting the DIG_MODE to the correct value."Zhan Liu1-9/+0
This reverts commit 385857adb8154563840e5b0f200254126618f464. Reason for revert: Root cause of this issue is found. The workaround is not needed anymore. Signed-off-by: Zhan Liu <[email protected]> Reviewed-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amd/display: Add ENGINE_ID_DIGD condition check for Navi14Zhan Liu1-0/+5
[Why] Navi10 has 6 PHY, but Navi14 only has 5 PHY, that is because there is no ENGINE_ID_DIGD in Navi14. Without this patch, many HDMI related issues (e.g. HDMI S3 resume failure, HDMI pink screen on boot) will be observed. [How] If "eng_id" is larger than ENGINE_ID_DIGD, then add "eng_id" by 1. Signed-off-by: Zhan Liu <[email protected]> Reviewed-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu: dont schedule jobs while in resetShirish S1-1/+4
[Why] doing kthread_park()/unpark() from drm_sched_entity_fini while GPU reset is in progress defeats all the purpose of drm_sched_stop->kthread_park. If drm_sched_entity_fini->kthread_unpark() happens AFTER drm_sched_stop->kthread_park nothing prevents from another (third) thread to keep submitting job to HW which will be picked up by the unparked scheduler thread and try to submit to HW but fail because the HW ring is deactivated. [How] grab the reset lock before calling drm_sched_entity_fini() Signed-off-by: Shirish S <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZEAlex Deucher1-0/+9
These were not aligned for optimal performance for GPUVM. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/mst: Fix up u64 divisionSean Paul1-2/+2
Change rem_nsec to u32 since that's what do_div returns, this avoids the u64 divide in the drm_print args. Changes in v2: - Instead of doing do_div in drm_print, make rem_nsec u32 (Ville) Link to v1: https://patchwork.freedesktop.org/patch/msgid/[email protected] Fixes: 12a280c72868 ("drm/dp_mst: Add topology ref history tracking for debugging") Cc: Juston Li <[email protected]> Cc: Imre Deak <[email protected]> Cc: Ville Syrjälä <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sean Paul <[email protected]> Cc: Lyude Paul <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: [email protected] Reviewed-by: Ville Syrjälä <[email protected]> Signed-off-by: Sean Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-11-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds15-79/+188
Merge more fixes from Andrew Morton: "17 fixes" Mostly mm fixes and one ocfs2 locking fix. * emailed patches from Andrew Morton <[email protected]>: mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges mm/memory_hotplug: fix updating the node span scripts/gdb: fix debugging modules compiled with hot/cold partitioning mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly MAINTAINERS: update information for "MEMORY MANAGEMENT" dump_stack: avoid the livelock of the dump_lock zswap: add Vitaly to the maintainers list mm/page_alloc.c: ratelimit allocation failure warnings more aggressively mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo mm, vmstat: hide /proc/pagetypeinfo from normal users mm/mmu_notifiers: use the right return code for WARN_ON ocfs2: protect extent tree in ocfs2_prepare_inode_for_write() mm: thp: handle page cache THP correctly in PageTransCompoundMap mm, meminit: recalculate pcpu batch and high limits after init completes mm/gup_benchmark: fix MAP_HUGETLB case mm: memcontrol: fix NULL-ptr deref in percpu stats flush
2019-11-06arm64: Do not mask out PTE_RDONLY in pte_same()Catalin Marinas1-17/+0
Following commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"), the PTE_RDONLY bit is no longer managed by set_pte_at() but built into the PAGE_* attribute definitions. Consequently, pte_same() must include this bit when checking two PTEs for equality. Remove the arm64-specific pte_same() function, practically reverting commit 747a70e60b72 ("arm64: Fix copy-on-write referencing in HugeTLB") Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Cc: <[email protected]> # 4.14.x- Cc: Will Deacon <[email protected]> Cc: Steve Capper <[email protected]> Reported-by: John Stultz <[email protected]> Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-11-06drm/amdgpu: dont schedule jobs while in resetShirish S1-1/+4
[Why] doing kthread_park()/unpark() from drm_sched_entity_fini while GPU reset is in progress defeats all the purpose of drm_sched_stop->kthread_park. If drm_sched_entity_fini->kthread_unpark() happens AFTER drm_sched_stop->kthread_park nothing prevents from another (third) thread to keep submitting job to HW which will be picked up by the unparked scheduler thread and try to submit to HW but fail because the HW ring is deactivated. [How] grab the reset lock before calling drm_sched_entity_fini() Signed-off-by: Shirish S <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZEAlex Deucher1-0/+9
These were not aligned for optimal performance for GPUVM. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06Merge branch 'net-bcmgenet-restore-internal-EPHY-support'David S. Miller3-72/+110
Doug Berger says: ==================== net: bcmgenet: restore internal EPHY support (part 2) This is a follow up to my previous submission (see [1]). The first commit provides what is intended to be a complete solution for the issues that can result from insufficient clocking of the MAC during reset of its state machines. It should be backported to the stable releases. It is intended to replace the partial solution of commit 1f515486275a ("net: bcmgenet: soft reset 40nm EPHYs before MAC init") which is reverted by the second commit of this series and should not be back- ported as noted in [2]. The third commit corrects a timing hazard with a polled PHY that can occur when the MAC resumes and also when a v3 internal EPHY is reset by the change in commit 25382b991d25 ("net: bcmgenet: reset 40nm EPHY on energy detect"). It is expected that commit 25382b991d25 be back- ported to stable first before backporting this commit. [1] https://lkml.org/lkml/2019/10/16/1706 [2] https://lkml.org/lkml/2019/10/31/749 ==================== Signed-off-by: David S. Miller <[email protected]>
2019-11-06net: bcmgenet: reapply manual settings to the PHYDoug Berger1-1/+4
The phy_init_hw() function may reset the PHY to a configuration that does not match manual network settings stored in the phydev structure. If the phy state machine is polled rather than event driven this can create a timing hazard where the phy state machine might alter the settings stored in the phydev structure from the value read from the BMCR. This commit follows invocations of phy_init_hw() by the bcmgenet driver with invocations of the genphy_config_aneg() function to ensure that the BMCR is written to match the settings held in the phydev structure. This prevents the risk of manual settings being accidentally altered. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-06Revert "net: bcmgenet: soft reset 40nm EPHYs before MAC init"Doug Berger3-69/+73
This reverts commit 1f515486275a08a17a2c806b844cca18f7de5b34. This commit improved the chances of the umac resetting cleanly by ensuring that the PHY was restored to its normal operation prior to resetting the umac. However, there were still cases when the PHY might not be driving a Tx clock to the umac during this window (e.g. when the PHY detects no link). The previous commit now ensures that the unimac receives clocks from the MAC during its reset window so this commit is no longer needed. This commit also has an unintended negative impact on the MDIO performance of the UniMAC MDIO interface because it is used before the MDIO interrupts are reenabled, so it should be removed. Signed-off-by: Doug Berger <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-06net: bcmgenet: use RGMII loopback for MAC resetDoug Berger2-2/+33
As noted in commit 28c2d1a7a0bf ("net: bcmgenet: enable loopback during UniMAC sw_reset") the UniMAC must be clocked while sw_reset is asserted for its state machines to reset cleanly. The transmit and receive clocks used by the UniMAC are derived from the signals used on its PHY interface. The bcmgenet MAC can be configured to work with different PHY interfaces including MII, GMII, RGMII, and Reverse MII on internal and external interfaces. Unfortunately for the UniMAC, when configured for MII the Tx clock is always driven from the PHY which places it outside of the direct control of the MAC. The earlier commit enabled a local loopback mode within the UniMAC so that the receive clock would be derived from the transmit clock which addressed the observed issue with an external GPHY disabling it's Rx clock. However, when a Tx clock is not available this loopback is insufficient. This commit implements a workaround that leverages the fact that the MAC can reliably generate all of its necessary clocking by enterring the external GPHY RGMII interface mode with the UniMAC in local loopback during the sw_reset interval. Unfortunately, this has the undesirable side efect of the RGMII GTXCLK signal being driven during the same window. In most configurations this is a benign side effect as the signal is either not routed to a pin or is already expected to drive the pin. The one exception is when an external MII PHY is expected to drive the same pin with its TX_CLK output creating output driver contention. This commit exploits the IEEE 802.3 clause 22 standard defined isolate mode to force an external MII PHY to present a high impedance on its TX_CLK output during the window to prevent any contention at the pin. The MII interface is used internally with the 40nm internal EPHY which agressively disables its clocks for power savings leading to incomplete resets of the UniMAC and many instabilities observed over the years. The workaround of this commit is expected to put an end to those problems. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-06Merge tag 'perf-urgent-for-mingo-5.4-20191105' of ↵Thomas Gleixner5-38/+14
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf fixes from Arnaldo Carvalho de Melo: perf report/top: Jiri Olsa: - Fix time sorting for big numbers, i.e.: perf report -s time -F time,overhead --stdio was failing because the sort comparision routine was returning 'int' while that particular -s key was int64_t, fix it. perf scripting engines: Steven Rostedt (VMware): - Iterate on tep event arrays directly, fixing a bug when generating python/perl source code from a perf.data file with more than one tracepoint event. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-06drm/atomic: fix self-refresh helpers crtc state dereferenceRob Clark3-9/+27
drm_self_refresh_helper_update_avg_times() was incorrectly accessing the new incoming state after drm_atomic_helper_commit_hw_done(). But this state might have already been superceeded by an !nonblock atomic update resulting in dereferencing an already free'd crtc_state. TODO I *think* this will more or less do the right thing.. althought I'm not 100% sure if, for example, we enter psr in a nonblock commit, and then leave psr in a !nonblock commit that overtakes the completion of the nonblock commit. Not sure if this sort of scenario can happen in practice. But not crashing is better than crashing, so I guess we should either take this patch or rever the self-refresh helpers until Sean can figure out a better solution. Fixes: d4da4e33341c ("drm: Measure Self Refresh Entry/Exit times to avoid thrashing") Cc: Sean Paul <[email protected]> Signed-off-by: Rob Clark <[email protected]> [seanpaul fixed up some checkpatch warns] Signed-off-by: Sean Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-11-06drm: msm: a6xx: fix debug bus register configurationSharat Masetty1-12/+12
Fix the cx debugbus related register configuration, to collect accurate bus data during gpu snapshot. This helps with complete snapshot dump and also complete proper GPU recovery. Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state") Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Sharat Masetty <[email protected]> Signed-off-by: Sean Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/339165
2019-11-06configfs: calculate the depth of parent itemHonggang Li1-1/+1
When create symbolic link, create_link should calculate the depth of the parent item. However, both the first and second parameters of configfs_get_target_path had been set to the target. Broken symbolic link created. $ targetcli ls / o- / ............................................................. [...] o- backstores .................................................. [...] | o- block ...................................... [Storage Objects: 0] | o- fileio ..................................... [Storage Objects: 2] | | o- vdev0 .......... [/dev/ramdisk1 (16.0MiB) write-thru activated] | | | o- alua ....................................... [ALUA Groups: 1] | | | o- default_tg_pt_gp ........... [ALUA state: Active/optimized] | | o- vdev1 .......... [/dev/ramdisk2 (16.0MiB) write-thru activated] | | o- alua ....................................... [ALUA Groups: 1] | | o- default_tg_pt_gp ........... [ALUA state: Active/optimized] | o- pscsi ...................................... [Storage Objects: 0] | o- ramdisk .................................... [Storage Objects: 0] o- iscsi ................................................ [Targets: 0] o- loopback ............................................. [Targets: 0] o- srpt ................................................. [Targets: 2] | o- ib.e89a8f91cb3200000000000000000000 ............... [no-gen-acls] | | o- acls ................................................ [ACLs: 2] | | | o- ib.e89a8f91cb3200000000000000000000 ........ [Mapped LUNs: 2] | | | | o- mapped_lun0 ............................. [BROKEN LUN LINK] | | | | o- mapped_lun1 ............................. [BROKEN LUN LINK] | | | o- ib.e89a8f91cb3300000000000000000000 ........ [Mapped LUNs: 2] | | | o- mapped_lun0 ............................. [BROKEN LUN LINK] | | | o- mapped_lun1 ............................. [BROKEN LUN LINK] | | o- luns ................................................ [LUNs: 2] | | o- lun0 ...... [fileio/vdev0 (/dev/ramdisk1) (default_tg_pt_gp)] | | o- lun1 ...... [fileio/vdev1 (/dev/ramdisk2) (default_tg_pt_gp)] | o- ib.e89a8f91cb3300000000000000000000 ............... [no-gen-acls] | o- acls ................................................ [ACLs: 0] | o- luns ................................................ [LUNs: 0] o- vhost ................................................ [Targets: 0] Fixes: e9c03af21cc7 ("configfs: calculate the symlink target only once") Signed-off-by: Honggang Li <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2019-11-06ALSA: timer: Fix incorrectly assigned timer instanceTakashi Iwai1-3/+3
The clean up commit 41672c0c24a6 ("ALSA: timer: Simplify error path in snd_timer_open()") unified the error handling code paths with the standard goto, but it introduced a subtle bug: the timer instance is stored in snd_timer_open() incorrectly even if it returns an error. This may eventually lead to UAF, as spotted by fuzzer. The culprit is the snd_timer_open() code checks the SNDRV_TIMER_IFLG_EXCLUSIVE flag with the common variable timeri. This variable is supposed to be the newly created instance, but we (ab-)used it for a temporary check before the actual creation of a timer instance. After that point, there is another check for the max number of instances, and it bails out if over the threshold. Before the refactoring above, it worked fine because the code returned directly from that point. After the refactoring, however, it jumps to the unified error path that stores the timeri variable in return -- even if it returns an error. Unfortunately this stored value is kept in the caller side (snd_timer_user_tselect()) in tu->timeri. This causes inconsistency later, as if the timer was successfully assigned. In this patch, we fix it by not re-using timeri variable but a temporary variable for testing the exclusive connection, so timeri remains NULL at that point. Fixes: 41672c0c24a6 ("ALSA: timer: Simplify error path in snd_timer_open()") Reported-and-tested-by: Tristan Madani <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2019-11-06mm: memcontrol: fix network errors from failing __GFP_ATOMIC chargesJohannes Weiner1-0/+9
While upgrading from 4.16 to 5.2, we noticed these allocation errors in the log of the new kernel: SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC) cache: tw_sock_TCPv6(960:helper-logs), object size: 232, buffer size: 240, default order: 1, min order: 0 node 0: slabs: 5, objs: 170, free: 0 slab_out_of_memory+1 ___slab_alloc+969 __slab_alloc+14 kmem_cache_alloc+346 inet_twsk_alloc+60 tcp_time_wait+46 tcp_fin+206 tcp_data_queue+2034 tcp_rcv_state_process+784 tcp_v6_do_rcv+405 __release_sock+118 tcp_close+385 inet_release+46 __sock_release+55 sock_close+17 __fput+170 task_work_run+127 exit_to_usermode_loop+191 do_syscall_64+212 entry_SYSCALL_64_after_hwframe+68 accompanied by an increase in machines going completely radio silent under memory pressure. One thing that changed since 4.16 is e699e2c6a654 ("net, mm: account sock objects to kmemcg"), which made these slab caches subject to cgroup memory accounting and control. The problem with that is that cgroups, unlike the page allocator, do not maintain dedicated atomic reserves. As a cgroup's usage hovers at its limit, atomic allocations - such as done during network rx - can fail consistently for extended periods of time. The kernel is not able to operate under these conditions. We don't want to revert the culprit patch, because it indeed tracks a potentially substantial amount of memory used by a cgroup. We also don't want to implement dedicated atomic reserves for cgroups. There is no point in keeping a fixed margin of unused bytes in the cgroup's memory budget to accomodate a consumer that is impossible to predict - we'd be wasting memory and get into configuration headaches, not unlike what we have going with min_free_kbytes. We do this for physical mem because we have to, but cgroups are an accounting game. Instead, account these privileged allocations to the cgroup, but let them bypass the configured limit if they have to. This way, we get the benefits of accounting the consumed memory and have it exert pressure on the rest of the cgroup, but like with the page allocator, we shift the burden of reclaimining on behalf of atomic allocations onto the regular allocations that can block. Link: http://lkml.kernel.org/r/[email protected] Fixes: e699e2c6a654 ("net, mm: account sock objects to kmemcg") Signed-off-by: Johannes Weiner <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Suleiman Souhlal <[email protected]> Cc: Michal Hocko <[email protected]> Cc: <[email protected]> [4.18+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-11-06mm/memory_hotplug: fix updating the node spanDavid Hildenbrand1-0/+8
We recently started updating the node span based on the zone span to avoid touching uninitialized memmaps. Currently, we will always detect the node span to start at 0, meaning a node can easily span too many pages. pgdat_is_empty() will still work correctly if all zones span no pages. We should skip over all zones without spanned pages and properly handle the first detected zone that spans pages. Unfortunately, in contrast to the zone span (/proc/zoneinfo), the node span cannot easily be inspected and tested. The node span gives no real guarantees when an architecture supports memory hotplug, meaning it can easily contain holes or span pages of different nodes. The node span is not really used after init on architectures that support memory hotplug. E.g., we use it in mm/memory_hotplug.c:try_offline_node() and in mm/kmemleak.c:kmemleak_scan(). These users seem to be fine. Link: http://lkml.kernel.org/r/[email protected] Fixes: 00d6c019b5bc ("mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()") Signed-off-by: David Hildenbrand <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Dan Williams <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-11-06scripts/gdb: fix debugging modules compiled with hot/cold partitioningIlya Leoshkevich1-1/+2
gcc's -freorder-blocks-and-partition option makes it group frequently and infrequently used code in .text.hot and .text.unlikely sections respectively. At least when building modules on s390, this option is used by default. gdb assumes that all code is located in .text section, and that .text section is located at module load address. With such modules this is no longer the case: there is code in .text.hot and .text.unlikely, and either of them might precede .text. Fix by explicitly telling gdb the addresses of code sections. It might be tempting to do this for all sections, not only the ones in the white list. Unfortunately, gdb appears to have an issue, when telling it about e.g. loadable .note.gnu.build-id section causes it to think that non-loadable .note.Linux section is loaded at address 0, which in turn causes NULL pointers to be resolved to bogus symbols. So keep using the white list approach for the time being. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Jan Kiszka <[email protected]> Cc: Kieran Bingham <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>