aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-09-12ASoC: rt5640: Do not disable/enable IRQ twice on suspend/resumeHans de Goede1-4/+3
When jack-detect was originally added disabling the IRQ during suspend was done by the sound/soc/intel/boards/bytcr_rt5640.c driver calling snd_soc_component_set_jack(NULL) on suspend, which calls rt5640_disable_jack_detect(), which calls free_irq() which also disables it. Commit 5fabcc90e79b ("ASoC: rt5640: Fix Jack work after system suspend") added disable_irq() / enable_irq() calls on suspend/resume for machine drivers which do not call snd_soc_component_set_jack(NULL) on suspend. The new disable_irq() / enable_irq() are made conditional by "if (rt5640->irq)" statements, but this is true for the machine drivers which do call snd_soc_component_set_jack(NULL) on suspend too, causing a disable_irq() call there on the already free-ed IRQ. Change the "if (rt5640->irq)" condition to "if (rt5640->jack)" to fix this, rt5640->jack is only set if the jack-detect IRQ handler is still active when rt5640_suspend() runs. And adjust rt5640_enable_hda_jack_detect()'s request_irq() error handling to set rt5640->jack to NULL to match (note that the old setting of irq to -ENOXIO still resulted in disable_irq(-ENOXIO) calls on suspend). Fixes: 5fabcc90e79b ("ASoC: rt5640: Fix Jack work after system suspend") Cc: Oder Chiou <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-09-12ASoC: rt5640: Fix sleep in atomic contextHans de Goede1-4/+2
Following prints are observed while testing audio on Jetson AGX Orin which has onboard RT5640 audio codec: BUG: sleeping function called from invalid context at kernel/workqueue.c:3027 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/0 preempt_count: 10001, expected: 0 RCU nest depth: 0, expected: 0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:159 __handle_irq_event_percpu+0x1e0/0x270 ---[ end trace ad1c64905aac14a6 ]- The IRQ handler rt5640_irq() runs in interrupt context and can sleep during cancel_delayed_work_sync(). The only thing which rt5640_irq() does is cancel + (re-)queue the jack_work delayed_work. This can be done in a single non sleeping call by replacing queue_delayed_work() with mod_delayed_work(), avoiding the sleep in atomic context. Fixes: 051dade34695 ("ASoC: rt5640: Fix the wrong state of JD1 and JD2") Reported-by: Sameer Pujar <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Cc: Oder Chiou <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-09-12ASoC: rt5640: Revert "Fix sleep in atomic context"Hans de Goede1-7/+5
Commit 70a6404ff610 ("ASoC: rt5640: Fix sleep in atomic context") not only switched from request_irq() to request_threaded_irq(), to fix the sleep in atomic context issue, but it also added devm management of the IRQ by actually switching to devm_request_threaded_irq() (without any explanation in the commit message for this change). This is wrong since the IRQ was already explicitly managed by the driver. On unbind the ASoC core will call rt5640_set_jack(NULL) which in turn will call rt5640_disable_jack_detect() which frees the IRQ already. So now we have a double free. Besides the unexplained switch to devm being wrong, the actual fix for the sleep in atomic context issue also is not the best solution. The only thing which rt5640_irq() does is cancel + (re-)queue the jack_work delayed_work. This can be done in a single non sleeping call by replacing queue_delayed_work() with mod_delayed_work(), which does not sleep. Using mod_delayed_work() is a much better fix then adding a thread which does nothing other then queuing a work-item. This patch is a straight revert of the troublesome changes, the switch to mod_delayed_work() is done in a separate follow-up patch. Fixes: 70a6404ff610 ("ASoC: rt5640: Fix sleep in atomic context") Cc: Sameer Pujar <[email protected]> Cc: Oder Chiou <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-09-12ALSA: core: Use dev_name of card_dev as debugfs directory namePeter Ujfalusi1-5/+2
There is no need to use temporary string for the debugfs directory name as we can use the device name of the card. This change will also fixes the following compiler warning/error (W=1): sound/core/init.c: In function ‘snd_card_init’: sound/core/init.c:367:28: error: ‘%d’ directive writing between 1 and 10 bytes into a region of size 4 [-Werror=format-overflow=] 367 | sprintf(name, "card%d", idx); | ^~ sound/core/init.c:367:23: note: directive argument in the range [0, 2147483646] 367 | sprintf(name, "card%d", idx); | ^~~~~~~~ sound/core/init.c:367:9: note: ‘sprintf’ output between 6 and 15 bytes into a destination of size 8 367 | sprintf(name, "card%d", idx); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors The idx is guarantied to be less than SNDRV_CARDS (max 256 or 8) by the code in snd_card_init(), however the compiler does not see that. The warnings got brought to light by a recent patch upstream: commit 6d4ab2e97dcf ("extrawarn: enable format and stringop overflow warnings in W=1") Suggested-by: Arnd Bergmann <[email protected]> Suggested-by: Takashi Iwai <[email protected]> Signed-off-by: Peter Ujfalusi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-09-12net: macb: fix sleep inside spinlockSascha Hauer1-2/+3
macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() which can sleep. This results in: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 | pps pps1: new PPS source ptp1 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 | preempt_count: 1, expected: 0 | RCU nest depth: 0, expected: 0 | 4 locks held by kworker/u4:3/40: | #0: ffff000003409148 | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. | ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 | #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac | irq event stamp: 113998 | hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 | softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 | Hardware name: ... ZynqMP ... (DT) | Workqueue: events_power_efficient phylink_resolve | Call trace: | dump_backtrace+0x98/0xf0 | show_stack+0x18/0x24 | dump_stack_lvl+0x60/0xac | dump_stack+0x18/0x24 | __might_resched+0x144/0x24c | __might_sleep+0x48/0x98 | __mutex_lock+0x58/0x7b0 | mutex_lock_nested+0x24/0x30 | clk_prepare_lock+0x4c/0xa8 | clk_set_rate+0x24/0x8c | macb_mac_link_up+0x25c/0x2ac | phylink_resolve+0x178/0x4e8 | process_one_work+0x1ec/0x51c | worker_thread+0x1ec/0x3e4 | kthread+0x120/0x124 | ret_from_fork+0x10/0x20 The obvious fix is to move the call to macb_set_tx_clk() out of the protected area. This seems safe as rx and tx are both disabled anyway at this point. It is however not entirely clear what the spinlock shall protect. It could be the read-modify-write access to the NCFGR register, but this is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well without holding the spinlock. It could also be the register accesses done in mog_init_rings() or macb_init_buffers(), but again these functions are called without holding the spinlock in macb_hresp_error_task(). The locking seems fishy in this driver and it might deserve another look before this patch is applied. Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()") Signed-off-by: Sascha Hauer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-09-12drm/i915: Only check eDP HPD when AUX CH is sharedVille Syrjälä3-1/+28
Apparently Acer Chromebook C740 (BDW-ULT) doesn't have the eDP HPD line properly connected, and thus fails the new HPD check during eDP probe. The result is that we lose the eDP output. I suspect all such machines would be Chromebooks or other Linux exclusive systems as the Windows driver likely wouldn't work either. I did check a few other BDW machines here and those do have eDP HPD connected, one of them even is a different Chromebook (Samus). To account for these funky machines let's skip the HPD check when it looks like the eDP port is the only one using that specific AUX channel. In case of multiple ports sharing the same AUX CH (eg. on Asrock B250M-HDV) we still do the check and thus should correctly ignore the eDP port in favor of the other DP port (usually a DP->VGA converter). v2: Don't oops during list iteration Cc: [email protected] Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9264 Fixes: cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Luca Coelho <[email protected]> (cherry picked from commit 70052100fabec5d8c1b09c9959817a2f4517e6b5) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-09-12ASoC: amd: yc: Fix non-functional mic on Lenovo 82QF and 82UGAugust Wikerfors1-0/+14
Like the Lenovo 82TL and 82V2, the Lenovo 82QF (Yoga 7 14ARB7) and 82UG (Legion S7 16ARHA7) both need a quirk entry for the internal microphone to function. Commit c008323fe361 ("ASoC: amd: yc: Fix a non-functional mic on Lenovo 82SJ") restricted the quirk that previously matched "82" to "82V2", breaking microphone functionality on these devices. Fix this by adding specific quirks for these models, as was done for the Lenovo 82TL. Fixes: c008323fe361 ("ASoC: amd: yc: Fix a non-functional mic on Lenovo 82SJ") Closes: https://github.com/tomsom/yoga-linux/issues/51 Link: https://bugzilla.kernel.org/show_bug.cgi?id=208555#c780 Cc: [email protected] Signed-off-by: August Wikerfors <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-09-12KVM: arm64: nvhe: Ignore SVE hint in SMCCC function IDJean-Philippe Brucker7-8/+13
When SVE is enabled, the host may set bit 16 in SMCCC function IDs, a hint that indicates an unused SVE state. At the moment NVHE doesn't account for this bit when inspecting the function ID, and rejects most calls. Clear the hint bit before comparing function IDs. About version compatibility: the host's PSCI driver initially probes the firmware for a SMCCC version number. If the firmware implements a protocol recent enough (1.3), subsequent SMCCC calls have the hint bit set. Since the hint bit was reserved in earlier versions of the protocol, clearing it is fine regardless of the version in use. When a new hint is added to the protocol in the future, it will be added to ARM_SMCCC_CALL_HINTS and NVHE will handle it straight away. This patch only clears known hints and leaves reserved bits as is, because future SMCCC versions could use reserved bits as modifiers for the function ID, rather than hints. Fixes: cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint") Reported-by: Ben Horgan <[email protected]> Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-09-12KVM: arm64: Properly return allocated EL2 VA from hyp_alloc_private_va_range()Marc Zyngier1-0/+3
Marek reports that his RPi4 spits out a warning at boot time, right at the point where the GICv2 virtual CPU interface gets mapped. Upon investigation, it seems that we never return the allocated VA and use whatever was on the stack at this point. Yes, this is good stuff, and Marek was pretty lucky that he ended-up with a VA that intersected with something that was already mapped. On my setup, this random value is plausible enough for the mapping to take place. Who knows what happens... Fixes: f156a7d13fc3 ("KVM: arm64: Remove size-order align in the nVHE hyp private VA range") Reported-by: Marek Szyprowski <[email protected]> Tested-by: Marek Szyprowski <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: Vincent Donnefort <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected]
2023-09-12ALSA: hda/realtek - Fixed two speaker platformKailang Yang1-2/+4
If system has two speakers and one connect to 0x14 pin, use this function will disable it. Fixes: e43252db7e20 ("ALSA: hda/realtek - ALC287 I2S speaker platform support") Signed-off-by: Kailang Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-09-12PM: hibernate: Fix the exclusive get block device in test_resume modeChen Yu1-6/+6
Commit 5904de0d735b ("PM: hibernate: Do not get block device exclusively in test_resume mode") fixes a hibernation issue under test_resume mode. That commit is supposed to open the block device in non-exclusive mode when in test_resume. However the code does the opposite, which is against its description. In summary, the swap device is only opened exclusively by swsusp_check() with its corresponding *close(), and must be in non test_resume mode. This is to avoid the race condition that different processes scribble the device at the same time. All the other cases should use non-exclusive mode. Fix it by really disabling exclusive mode under test_resume. Fixes: 5904de0d735b ("PM: hibernate: Do not get block device exclusively in test_resume mode") Closes: https://lore.kernel.org/lkml/[email protected]/ Reported-by: Pengfei Xu <[email protected]> Signed-off-by: Chen Yu <[email protected]> Tested-by: Chenzhou Feng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2023-09-12PM: hibernate: Rename function parameter from snapshot_test to exclusiveChen Yu2-8/+10
Several functions reply on snapshot_test to decide whether to open the resume device exclusively. However there is no strict connection between the snapshot_test and the open mode. Rename the 'snapshot_test' input parameter to 'exclusive' to better reflect the use case. No functional change is expected. Signed-off-by: Chen Yu <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2023-09-12ALSA: seq: Avoid delivery of events for disabled UMP groupsTakashi Iwai2-0/+24
ALSA sequencer core still delivers events to the disabled UMP Group, leaving this handling to the device. But it's rather risky and it's easy to imagine that such an unexpected event may screw up the device firmware. This patch avoids the superfluous event deliveries by setting the group_filter of the UMP client as default, and evaluate the group_filter properly at delivery from non-UMP clients. The grouop_filter is updated upon the dynamic UMP Function Block updates, so that it follows the change of the disabled UMP Groups, too. Fixes: d2b706077792 ("ALSA: seq: Add UMP group filter") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-09-12ALSA: docs: Fix a typo of midi2_ump_probe option for snd-usb-audioTakashi Iwai1-2/+2
A simple typo fix: midi2_probe => midi2_ump_probe. Fixes: febdfa0e9c8a ("ALSA: docs: Update MIDI 2.0 documentation for UMP 1.1 enhancement") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-09-12ALSA: hda: cs35l56: Call pm_runtime_dont_use_autosuspend()Richard Fitzgerald1-0/+1
Driver remove() must call pm_runtime_dont_use_autosuspend(). Drivers that call pm_runtime_use_autosuspend() must disable it in driver remove(). Unfortunately until recently this was only mentioned in 1 line in a 900+ line document so most people hadn't noticed this. It has only recently been added to the kerneldoc of pm_runtime_use_autosuspend(). Signed-off-by: Richard Fitzgerald <[email protected]> Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-09-12net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()Liu Jian1-2/+2
I got the below warning when do fuzzing test: BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470 Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9 CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G OE Hardware name: linux,dummy-virt (DT) Workqueue: pencrypt_parallel padata_parallel_worker Call trace: dump_backtrace+0x0/0x420 show_stack+0x34/0x44 dump_stack+0x1d0/0x248 __kasan_report+0x138/0x140 kasan_report+0x44/0x6c __asan_load4+0x94/0xd0 scatterwalk_copychunks+0x320/0x470 skcipher_next_slow+0x14c/0x290 skcipher_walk_next+0x2fc/0x480 skcipher_walk_first+0x9c/0x110 skcipher_walk_aead_common+0x380/0x440 skcipher_walk_aead_encrypt+0x54/0x70 ccm_encrypt+0x13c/0x4d0 crypto_aead_encrypt+0x7c/0xfc pcrypt_aead_enc+0x28/0x84 padata_parallel_worker+0xd0/0x2dc process_one_work+0x49c/0xbdc worker_thread+0x124/0x880 kthread+0x210/0x260 ret_from_fork+0x10/0x18 This is because the value of rec_seq of tls_crypto_info configured by the user program is too large, for example, 0xffffffffffffff. In addition, TLS is asynchronously accelerated. When tls_do_encryption() returns -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow, skmsg is released before the asynchronous encryption process ends. As a result, the UAF problem occurs during the asynchronous processing of the encryption module. If the operation is asynchronous and the encryption module returns EINPROGRESS, do not free the record information. Fixes: 635d93981786 ("net/tls: free record only on encryption error") Signed-off-by: Liu Jian <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-09-12Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann14738-219057/+488085
Forwarding to v6.6-rc1. Signed-off-by: Thomas Zimmermann <[email protected]>
2023-09-11Merge branch 'Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_init'Martin KaFai Lau2-5/+68
Eduard Zingerman says: ==================== For a device bound BPF program with flag BPF_F_XDP_DEV_BOUND_ONLY, in case if device does not support offload, __bpf_prog_dev_bound_init() creates a dummy bpf_offload_netdev struct with .offdev field set to NULL. This dummy struct might be reused for programs without this flag bound to the same device. However, bpf_prog_offload_verifier_prep() that uses bpf_offload_netdev assumes that .offdev field cannot be NULL. This bug was reported by syzbot in [1]. [1] https://lore.kernel.org/bpf/[email protected]/ ==================== Signed-off-by: Martin KaFai Lau <[email protected]>
2023-09-11selftests/bpf: Offloaded prog after non-offloaded should not cause BUGEduard Zingerman1-0/+61
Check what happens if non-offloaded dev bound BPF program is followed by offloaded dev bound program. Test case adapated from syzbot report [1]. [1] https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-09-12objtool: Fix _THIS_IP_ detection for cold functionsJosh Poimboeuf1-1/+2
Cold functions and their non-cold counterparts can use _THIS_IP_ to reference each other. Don't warn about !ENDBR in that case. Note that for GCC this is currently irrelevant in light of the following commit c27cd083cfb9 ("Compiler attributes: GCC cold function alignment workarounds") which disabled cold functions in the kernel. However this may still be possible with Clang. Fixes several warnings like the following: drivers/scsi/bnx2i/bnx2i.prelink.o: warning: objtool: bnx2i_hw_ep_disconnect+0x19d: relocation to !ENDBR: bnx2i_hw_ep_disconnect.cold+0x0 drivers/net/ipvlan/ipvlan.prelink.o: warning: objtool: ipvlan_addr4_event.cold+0x28: relocation to !ENDBR: ipvlan_addr4_event+0xda drivers/net/ipvlan/ipvlan.prelink.o: warning: objtool: ipvlan_addr6_event.cold+0x26: relocation to !ENDBR: ipvlan_addr6_event+0xb7 drivers/net/ethernet/broadcom/tg3.prelink.o: warning: objtool: tg3_set_ringparam.cold+0x17: relocation to !ENDBR: tg3_set_ringparam+0x115 drivers/net/ethernet/broadcom/tg3.prelink.o: warning: objtool: tg3_self_test.cold+0x17: relocation to !ENDBR: tg3_self_test+0x2e1 drivers/target/iscsi/cxgbit/cxgbit.prelink.o: warning: objtool: __cxgbit_free_conn.cold+0x24: relocation to !ENDBR: __cxgbit_free_conn+0xfb net/can/can.prelink.o: warning: objtool: can_rx_unregister.cold+0x2c: relocation to !ENDBR: can_rx_unregister+0x11b drivers/net/ethernet/qlogic/qed/qed.prelink.o: warning: objtool: qed_spq_post+0xc0: relocation to !ENDBR: qed_spq_post.cold+0x9a drivers/net/ethernet/qlogic/qed/qed.prelink.o: warning: objtool: qed_iwarp_ll2_comp_syn_pkt.cold+0x12f: relocation to !ENDBR: qed_iwarp_ll2_comp_syn_pkt+0x34b net/tipc/tipc.prelink.o: warning: objtool: tipc_nametbl_publish.cold+0x21: relocation to !ENDBR: tipc_nametbl_publish+0xa6 Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/d8f1ab6a23a6105bc023c132b105f245c7976be6.1694476559.git.jpoimboe@kernel.org
2023-09-11bpf: Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_initEduard Zingerman1-5/+7
Fix for a bug observable under the following sequence of events: 1. Create a network device that does not support XDP offload. 2. Load a device bound XDP program with BPF_F_XDP_DEV_BOUND_ONLY flag (such programs are not offloaded). 3. Load a device bound XDP program with zero flags (such programs are offloaded). At step (2) __bpf_prog_dev_bound_init() associates with device (1) a dummy bpf_offload_netdev struct with .offdev field set to NULL. At step (3) __bpf_prog_dev_bound_init() would reuse dummy struct allocated at step (2). However, downstream usage of the bpf_offload_netdev assumes that .offdev field can't be NULL, e.g. in bpf_prog_offload_verifier_prep(). Adjust __bpf_prog_dev_bound_init() to require bpf_offload_netdev with non-NULL .offdev for offloaded BPF programs. Fixes: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs") Reported-by: [email protected] Closes: https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-09-11tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()Steven Rostedt (Google)1-1/+2
The eventfs files list is protected by SRCU. In earlier iterations it was protected with just RCU, but because it needed to also call sleepable code, it had to be switch to SRCU. The dcache_dir_open_wrapper() list_for_each_rcu() was missed and did not get converted over to list_for_each_srcu(). That needs to be fixed. Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: Mark Rutland <[email protected]> Cc: Ajay Kaher <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Reported-by: Masami Hiramatsu (Google) <[email protected]> Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions") Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-09-11bpf: Avoid deadlock when using queue and stack maps from NMIToke Høiland-Jørgensen1-3/+18
Sysbot discovered that the queue and stack maps can deadlock if they are being used from a BPF program that can be called from NMI context (such as one that is attached to a perf HW counter event). To fix this, add an in_nmi() check and use raw_spin_trylock() in NMI context, erroring out if grabbing the lock fails. Fixes: f1a2e44a3aec ("bpf: add queue and stack maps") Reported-by: Hsin-Wei Hung <[email protected]> Tested-by: Hsin-Wei Hung <[email protected]> Co-developed-by: Hsin-Wei Hung <[email protected]> Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-09-11selftests: user_events: create test-specific Kconfig fragmentsNaresh Kamboju1-0/+1
Create the config file in user_events directory of testcase which need more kernel configuration than the default defconfig. User could use these configs with merge_config.sh script: The Kconfig CONFIG_USER_EVENTS=y is needed for the test to read data from the following files, - "/sys/kernel/tracing/user_events_data" - "/sys/kernel/tracing/user_events_status" - "/sys/kernel/tracing/events/user_events/*" Enable config for specific testcase: (export ARCH=xxx #for cross compiling) ./scripts/kconfig/merge_config.sh .config \ tools/testing/selftests/user_events/config Enable configs for all testcases: (export ARCH=xxx #for cross compiling) ./scripts/kconfig/merge_config.sh .config \ tools/testing/selftests/*/config Cc: Beau Belgrave <[email protected]> Cc: Shuah Khan <[email protected]> Cc: [email protected] Signed-off-by: Naresh Kamboju <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2023-09-11ftrace/selftests: Add softlink to latest log directorySteven Rostedt (Google)1-1/+9
When I'm debugging something with the ftrace selftests and need to look at the logs, it becomes tedious that I need to do the following: ls -ltr logs [ copy the last directory ] ls logs/<paste-last-dir> to see where the logs are. Instead, do the common practice of having a "latest" softlink to the last run selftest. This way after running the selftest I only need to do: ls logs/latest/ and it will always give me the directory of the last run selftest logs! Signed-off-by: Steven Rostedt (Google) <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2023-09-11selftests/user_events: Fix failures when user_events is not installedBeau Belgrave5-0/+111
When user_events is not installed the self tests currently fail. Now that these self tests run by default we need to ensure they don't fail when user_events was not enabled for the kernel being tested. Add common methods to detect if tracefs and user_events is enabled. If either is not enabled skip the test. If tracefs is enabled, but is not mounted, mount tracefs and fail if there were any errors. Fail if not run as root. Fixes: 68b4d2d58389 ("selftests/user_events: Reenable build") Reported-by: Naresh Kamboju <[email protected]> Link: https://lore.kernel.org/all/CA+G9fYuugZ0OMeS6HvpSS4nuf_A3s455ecipGBvER0LJHojKZg@mail.gmail.com/ Signed-off-by: Beau Belgrave <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2023-09-11drm/amd/display: Fix 2nd DPIA encoder AssignmentMustapha Ghaddar1-3/+1
[HOW & Why] There seems to be an issue with 2nd DPIA acquiring link encoder for tiled displays. Solution is to remove check for eng_id before we get first dynamic encoder for it Reviewed-by: Cruise Hung <[email protected]> Reviewed-by: Meenakshikumar Somasundaram <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Acked-by: Stylon Wang <[email protected]> Signed-off-by: Mustapha Ghaddar <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Add DPIA Link Encoder Assignment FixMustapha Ghaddar5-6/+58
For DPIA we should have preferred DIG assignment based on DPIA selected as per the ASIC design. Reviewed-by: George Shen <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Mustapha Ghaddar <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-09-11drm/amd/display: fix replay_mode kernel-doc warningRandy Dunlap1-1/+1
Fix the typo in the kernel-doc for @replay_mode to prevent kernel-doc warnings: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:623: warning: Incorrect use of kernel-doc format: * @replay mode: Replay supported drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:626: warning: Function parameter or member 'replay_mode' not described in 'amdgpu_hdmi_vsdb_info' Fixes: ec8e59cb4e0c ("drm/amd/display: Get replay info from VSDB") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Bhawanpreet Lakha <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Leo Li <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: Handle null atom context in VBIOS info ioctlDavid Francis1-6/+11
On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: David Francis <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11cxl/pci: Replace host_bridge->native_aer with pcie_aer_is_native()Smita Koralahalli1-2/+1
Use pcie_aer_is_native() to determine the native AER ownership as the usage of host_bride->native_aer does not cover command line override of AER ownership. Signed-off-by: Smita Koralahalli <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Reviewed-by: Robert Richter <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]>
2023-09-11PCI/AER: Export pcie_aer_is_native()Smita Koralahalli3-2/+3
Export and move the declaration of pcie_aer_is_native() to a common header file to be reused by cxl/pci module. Signed-off-by: Smita Koralahalli <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Reviewed-by: Robert Richter <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]>
2023-09-11cxl/pci: Fix appropriate checking for _OSC while handling CXL RAS registersSmita Koralahalli1-3/+3
cxl_pci fails to unmask CXL protocol errors when CXL memory error reporting is not granted native control. Given that CXL memory error reporting uses the event interface and protocol errors use AER, unmask protocol errors based only on the native AER setting. Without this change end user deployments will fail to report protocol errors in the case where native memory error handling is not granted to Linux. Also, return zero instead of an error code to not block the communication with the cxl device when in native memory error reporting mode. Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") Cc: <[email protected]> Signed-off-by: Smita Koralahalli <[email protected]> Reviewed-by: Robert Richter <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]>
2023-09-11tracing/synthetic: Print out u64 values properlyTero Kristo1-1/+1
The synth traces incorrectly print pointer to the synthetic event values instead of the actual value when using u64 type. Fix by addressing the contents of the union properly. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Fixes: ddeea494a16f ("tracing/synthetic: Use union instead of casts") Cc: [email protected] Signed-off-by: Tero Kristo <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-09-11drm/amdkfd: Checkpoint and restore queues on GFX11David Francis1-0/+41
The code in kfd_mqd_manager_v11.c to support criu dump and restore of queue state was missing. Added it; should be equivalent to kfd_mqd_manager_v10.c. CC: Felix Kuehling <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: David Francis <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11tracing/synthetic: Fix order of struct trace_dynamic_infoSteven Rostedt (Google)1-3/+3
To make handling BIG and LITTLE endian better the offset/len of dynamic fields of the synthetic events was changed into a structure of: struct trace_dynamic_info { #ifdef CONFIG_CPU_BIG_ENDIAN u16 offset; u16 len; #else u16 len; u16 offset; #endif }; to replace the manual changes of: data_offset = offset & 0xffff; data_offest = len << 16; But if you look closely, the above is: <len> << 16 | offset Which in little endian would be in memory: offset_lo offset_hi len_lo len_hi and in big endian: len_hi len_lo offset_hi offset_lo Which if broken into a structure would be: struct trace_dynamic_info { #ifdef CONFIG_CPU_BIG_ENDIAN u16 len; u16 offset; #else u16 offset; u16 len; #endif }; Which is the opposite of what was defined. Fix this and just to be safe also add "__packed". Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: [email protected] Cc: Mark Rutland <[email protected]> Tested-by: Sven Schnelle <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Fixes: ddeea494a16f3 ("tracing/synthetic: Use union instead of casts") Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-09-11drm/amd/display: Adjust the MST resume flowWayne Lin1-13/+80
[Why] In drm_dp_mst_topology_mgr_resume() today, it will resume the mst branch to be ready handling mst mode and also consecutively do the mst topology probing. Which will cause the dirver have chance to fire hotplug event before restoring the old state. Then Userspace will react to the hotplug event based on a wrong state. [How] Adjust the mst resume flow as: 1. set dpcd to resume mst branch status 2. restore source old state 3. Do mst resume topology probing For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to pull out topology probing work into a 2nd part procedure of the mst resume. Will have a follow up patch in drm. Reviewed-by: Chao-kai Wang <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Acked-by: Stylon Wang <[email protected]> Signed-off-by: Wayne Lin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: fallback to old RAS error message for aqua_vanjaramHawking Zhang1-2/+4
So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOVAlex Deucher1-0/+3
Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu/soc21: don't remap HDP registers for SR-IOVAlex Deucher1-1/+1
This matches the behavior for soc15 and nv. Acked-by: Christian König <[email protected]> Reviewed-by: Timmy Tsai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Don't check registers, if using AUX BL controlSwapnil Patel1-1/+3
[Why] Currently the driver looks DCN registers to access if BL is on or not. This check is not valid if we are using AUX based brightness control. This causes driver to not send out "backlight off" command during power off sequence as it already thinks it is off. [How] Only check DCN registers if we aren't using AUX based brightness control. Reviewed-by: Wenjing Liu <[email protected]> Acked-by: Stylon Wang <[email protected]> Signed-off-by: Swapnil Patel <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: fix retry loop testDan Carpenter1-1/+1
This loop will exit with "retry" set to -1 if it fails but the code checks for if "retry" is zero. Fix this by changing post-op to a pre-op. --retry vs retry--. Fixes: e01eeffc3f86 ("drm/amd/pm: avoid driver getting empty metrics table for the first time") Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Add dirty rect support for ReplayBhawanpreet Lakha1-1/+2
Dirty rect can be used with replay, so enable them to allow for more powersaving. Reviewed-by: Sun peng Li <[email protected]> Acked-by: Stylon Wang <[email protected]> Signed-off-by: Bhawanpreet Lakha <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory"Hamza Mahfooz3-29/+3
This reverts commit 70e64c4d522b732e31c6475a3be2349de337d321. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: [email protected] # 6.1+ Acked-by: Harry Wentland <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: fix the white screen issue when >= 64GB DRAMYifan Zhang1-5/+9
Dropping bit 31:4 of page table base is wrong, it makes page table base points to wrong address if phys addr is beyond 64GB; dropping page_table_start/end bit 31:4 is unnecessary since dcn20_vmid_setup will do that. Also, while we are at it, cleanup the assignments using upper_32_bits()/lower_32_bits() and AMDGPU_GPU_PAGE_SHIFT. Cc: [email protected] Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)") Acked-by: Harry Wentland <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yifan Zhang <[email protected]> Co-developed-by: Hamza Mahfooz <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11blk-mq: fix tags UAF when shrinking q->nr_hw_queuesChengming Zhou1-6/+7
When nr_hw_queues shrink, we free the excess tags before realloc'ing hw_ctxs for each queue. During that resize, we may need to access those tags, like blk_mq_tag_idle(hctx) will access queue shared tags. This can cause a slab use-after-free, as reported by KASAN. Fix it by moving the releasing of excess tags to the end. Fixes: e1dd7bc93029 ("blk-mq: fix tags leak when shrink nr_hw_queues") Reported-by: Yi Zhang <[email protected]> Closes: https://lore.kernel.org/all/CAHj4cs_CK63uoDpGBGZ6DN4OCTpzkR3UaVgK=LX8Owr8ej2ieQ@mail.gmail.com/ Cc: Ming Lei <[email protected]> Signed-off-by: Chengming Zhou <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-09-11drm/amdkfd: Update CU masking for GFX 9.4.3Mukul Joshi7-28/+56
The CU mask passed from user-space will change based on different spatial partitioning mode. As a result, update CU masking code for GFX9.4.3 to work for all partitioning modes. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Update cache info reporting for GFX v9.4.3Mukul Joshi3-37/+51
Update cache info reporting in sysfs to report the correct number of CUs and associated cache information based on different spatial partitioning modes. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3Mukul Joshi14-65/+60
Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Fix unaligned 64-bit doorbell warningMukul Joshi1-0/+2
This patch fixes the following unaligned 64-bit doorbell warning seen when submitting packets on HIQ on GFX v9.4.3 by making the HIQ doorbell 64-bit aligned. The warning is seen when GPU is loaded in any mode other than SPX mode. [ +0.000301] ------------[ cut here ]------------ [ +0.000003] Unaligned 64-bit doorbell [ +0.000030] WARNING: /amdkfd/kfd_doorbell.c:339 write_kernel_doorbell64+0x72/0x80 [ +0.000003] RIP: 0010:write_kernel_doorbell64+0x72/0x80 [ +0.000004] RSP: 0018:ffffc90004287730 EFLAGS: 00010246 [ +0.000005] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ +0.000003] RDX: 0000000000000001 RSI: ffffffff82837c71 RDI: 00000000ffffffff [ +0.000003] RBP: ffffc90004287748 R08: 0000000000000003 R09: 0000000000000001 [ +0.000002] R10: 000000000000001a R11: ffff88a034008198 R12: ffffc900013bd004 [ +0.000003] R13: 0000000000000008 R14: ffffc900042877b0 R15: 000000000000007f [ +0.000003] FS: 00007fa8c7b62000(0000) GS:ffff889f88400000(0000) knlGS:0000000000000000 [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000056111c45aaf0 CR3: 00000001414f2002 CR4: 0000000000770ee0 [ +0.000003] PKRU: 55555554 [ +0.000002] Call Trace: [ +0.000004] <TASK> [ +0.000006] kq_submit_packet+0x45/0x50 [amdgpu] [ +0.000524] pm_send_set_resources+0x7f/0xc0 [amdgpu] [ +0.000500] set_sched_resources+0xe4/0x160 [amdgpu] [ +0.000503] start_cpsch+0x1c5/0x2a0 [amdgpu] [ +0.000497] kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu] [ +0.000743] amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu] [ +0.000602] amdgpu_device_init.cold+0x1813/0x2176 [amdgpu] [ +0.000684] ? pci_bus_read_config_word+0x4a/0x80 [ +0.000012] ? do_pci_enable_device+0xdc/0x110 [ +0.000008] amdgpu_driver_load_kms+0x1a/0x110 [amdgpu] [ +0.000545] amdgpu_pci_probe+0x197/0x400 [amdgpu] Fixes: c31866651086 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells") Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>