aboutsummaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2023-09-14net: renesas: rswitch: Fix unmasking irq conditionYoshihiro Shimoda1-4/+4
Fix unmasking irq condition by using napi_complete_done(). Otherwise, redundant interrupts happen. Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-09-13scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rportsJustin Tee2-8/+19
During rmmod, when dev_loss_tmo callback is called, an ndlp kref count is decremented twice. Once for SCSI transport registration and second to remove the initial node allocation kref. If there is also an NVMe transport registration, another reference count decrement is expected in lpfc_nvme_unregister_port(). Race conditions between the NVMe transport remoteport_delete and dev_loss_tmo callbacks sometimes results in premature ndlp object release resulting in use-after-free issues. Fix by not dropping the ndlp object in dev_loss_tmo callback with an outstanding NVMe transport registration. Inversely, mark the final NLP_DROPPED flag in lpfc_nvme_unregister_port when rmmod flag is set. Signed-off-by: Justin Tee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmoJustin Tee1-1/+1
When a dev_loss_tmo event occurs, an ndlp lock is taken before checking nlp_flag for NLP_DROPPED. There is an attempt to restore the ndlp lock when exiting the if statement, but the nlp_put kref could be the final decrement causing a use-after-free memory access on a released ndlp object. Instead of trying to reacquire the ndlp lock after checking nlp_flag, just return after calling nlp_put. Signed-off-by: Justin Tee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: "Ewan D. Milne" <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()Jinjie Ruan1-7/+7
Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to check the return value. Fixes: 2fcbc569b9f5 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI") Fixes: 4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures") Fixes: 6a828b0f6192 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Fixes: 95bfc6d8ad86 ("scsi: lpfc: Make FW logging dynamically configurable") Fixes: 9f77870870d8 ("scsi: lpfc: Add debugfs support for cm framework buffers") Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing") Signed-off-by: Jinjie Ruan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Justin Tee <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13scsi: target: core: Fix target_cmd_counter leakDavid Disseldorp1-0/+1
The target_cmd_counter struct allocated via target_alloc_cmd_counter() is never freed, resulting in leaks across various transport types, e.g.: unreferenced object 0xffff88801f920120 (size 96): comm "sh", pid 102, jiffies 4294892535 (age 713.412s) hex dump (first 32 bytes): 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 38 01 92 1f 80 88 ff ff ........8....... backtrace: [<00000000e58a6252>] kmalloc_trace+0x11/0x20 [<0000000043af4b2f>] target_alloc_cmd_counter+0x17/0x90 [target_core_mod] [<000000007da2dfa7>] target_setup_session+0x2d/0x140 [target_core_mod] [<0000000068feef86>] tcm_loop_tpg_nexus_store+0x19b/0x350 [tcm_loop] [<000000006a80e021>] configfs_write_iter+0xb1/0x120 [<00000000e9f4d860>] vfs_write+0x2e4/0x3c0 [<000000008143433b>] ksys_write+0x80/0xb0 [<00000000a7df29b2>] do_syscall_64+0x42/0x90 [<0000000053f45fb8>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Free the structure alongside the corresponding iscsit_conn / se_sess parent. Signed-off-by: David Disseldorp <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: becd9be6069e ("scsi: target: Move sess cmd counter to new struct") Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13scsi: pm8001: Setup IRQs on resumeDamien Le Moal1-34/+17
The function pm8001_pci_resume() only calls pm8001_request_irq() without calling pm8001_setup_irq(). This causes the IRQ allocation to fail, which leads all drives being removed from the system. Fix this issue by integrating the code for pm8001_setup_irq() directly inside pm8001_request_irq() so that MSI-X setup is performed both during normal initialization and resume operations. Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver") Cc: [email protected] Signed-off-by: Damien Le Moal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Jack Wang <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13scsi: pm80xx: Avoid leaking tags when processing ↵Michal Grzedzicki1-0/+2
OPC_INB_SET_CONTROLLER_CONFIG command Tags allocated for OPC_INB_SET_CONTROLLER_CONFIG command need to be freed when we receive the response. Signed-off-by: Michal Grzedzicki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Jack Wang <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13scsi: pm80xx: Use phy-specific SAS address when sending PHY_START commandMichal Grzedzicki2-2/+2
Some cards have more than one SAS address. Using an incorrect address causes communication issues with some devices like expanders. Closes: https://lore.kernel.org/linux-kernel/[email protected]/ Signed-off-by: Michal Grzedzicki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Jack Wang <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13Merge branch '6.6/scsi-staging' into 6.6/scsi-fixesMartin K. Petersen13-52/+60
Pull in staged fixes for 6.6. Signed-off-by: Martin K. Petersen <[email protected]>
2023-09-13Merge tag 'pmdomain-v6.6-rc1' of ↵Linus Torvalds86-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull genpm / pmdomain rename from Ulf Hansson: "This renames the genpd subsystem to pmdomain. As discussed on LKML, using 'genpd' as the name of a subsystem isn't very self-explanatory and the acronym itself that means Generic PM Domain, is known only by a limited group of people. The suggestion to improve the situation is to rename the subsystem to 'pmdomain', which there seems to be a good consensus around using. Ideally it should indicate that its purpose is to manage Power Domains or 'PM domains' as we often also use within the Linux Kernel terminology" * tag 'pmdomain-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: Rename the genpd subsystem to pmdomain
2023-09-13Merge tag 'tpmdd-v6.6-rc2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fix from Jarkko Sakkinen. * tag 'tpmdd-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Fix typo in tpmrm class definition
2023-09-13Merge tag 'parisc-for-6.6-rc2' of ↵Linus Torvalds6-41/+33
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: - fix reference to exported symbols for parisc64 [Masahiro Yamada] - Block-TLB (BTLB) support on 32-bit CPUs - sparse and build-warning fixes * tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: linux/export: fix reference to exported functions for parisc64 parisc: BTLB: Initialize BTLB tables at CPU startup parisc: firmware: Simplify calling non-PA20 functions parisc: BTLB: _edata symbol has to be page aligned for BTLB support parisc: BTLB: Add BTLB insert and purge firmware function wrappers parisc: BTLB: Clear possibly existing BTLB entries parisc: Prepare for Block-TLB support on 32-bit kernel parisc: shmparam.h: Document aliasing requirements of PA-RISC parisc: irq: Make irq_stack_union static to avoid sparse warning parisc: drivers: Fix sparse warning parisc: iosapic.c: Fix sparse warnings parisc: ccio-dma: Fix sparse warnings parisc: sba-iommu: Fix sparse warnigs parisc: sba: Fix compile warning wrt list of SBA devices parisc: sba_iommu: Fix build warning if procfs if disabled
2023-09-13igb: clean up in all error paths when enabling SR-IOVCorinna Vinschen1-1/+4
After commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), removing the igb module could hang or crash (depending on the machine) when the module has been loaded with the max_vfs parameter set to some value != 0. In case of one test machine with a dual port 82580, this hang occurred: [ 232.480687] igb 0000:41:00.1: removed PHC on enp65s0f1 [ 233.093257] igb 0000:41:00.1: IOV Disabled [ 233.329969] pcieport 0000:40:01.0: AER: Multiple Uncorrected (Non-Fatal) err0 [ 233.340302] igb 0000:41:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fata) [ 233.352248] igb 0000:41:00.0: device [8086:1516] error status/mask=00100000 [ 233.361088] igb 0000:41:00.0: [20] UnsupReq (First) [ 233.368183] igb 0000:41:00.0: AER: TLP Header: 40000001 0000040f cdbfc00c c [ 233.376846] igb 0000:41:00.1: PCIe Bus Error: severity=Uncorrected (Non-Fata) [ 233.388779] igb 0000:41:00.1: device [8086:1516] error status/mask=00100000 [ 233.397629] igb 0000:41:00.1: [20] UnsupReq (First) [ 233.404736] igb 0000:41:00.1: AER: TLP Header: 40000001 0000040f cdbfc00c c [ 233.538214] pci 0000:41:00.1: AER: can't recover (no error_detected callback) [ 233.538401] igb 0000:41:00.0: removed PHC on enp65s0f0 [ 233.546197] pcieport 0000:40:01.0: AER: device recovery failed [ 234.157244] igb 0000:41:00.0: IOV Disabled [ 371.619705] INFO: task irq/35-aerdrv:257 blocked for more than 122 seconds. [ 371.627489] Not tainted 6.4.0-dirty #2 [ 371.632257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this. [ 371.641000] task:irq/35-aerdrv state:D stack:0 pid:257 ppid:2 f0 [ 371.650330] Call Trace: [ 371.653061] <TASK> [ 371.655407] __schedule+0x20e/0x660 [ 371.659313] schedule+0x5a/0xd0 [ 371.662824] schedule_preempt_disabled+0x11/0x20 [ 371.667983] __mutex_lock.constprop.0+0x372/0x6c0 [ 371.673237] ? __pfx_aer_root_reset+0x10/0x10 [ 371.678105] report_error_detected+0x25/0x1c0 [ 371.682974] ? __pfx_report_normal_detected+0x10/0x10 [ 371.688618] pci_walk_bus+0x72/0x90 [ 371.692519] pcie_do_recovery+0xb2/0x330 [ 371.696899] aer_process_err_devices+0x117/0x170 [ 371.702055] aer_isr+0x1c0/0x1e0 [ 371.705661] ? __set_cpus_allowed_ptr+0x54/0xa0 [ 371.710723] ? __pfx_irq_thread_fn+0x10/0x10 [ 371.715496] irq_thread_fn+0x20/0x60 [ 371.719491] irq_thread+0xe6/0x1b0 [ 371.723291] ? __pfx_irq_thread_dtor+0x10/0x10 [ 371.728255] ? __pfx_irq_thread+0x10/0x10 [ 371.732731] kthread+0xe2/0x110 [ 371.736243] ? __pfx_kthread+0x10/0x10 [ 371.740430] ret_from_fork+0x2c/0x50 [ 371.744428] </TASK> The reproducer was a simple script: #!/bin/sh for i in `seq 1 5`; do modprobe -rv igb modprobe -v igb max_vfs=1 sleep 1 modprobe -rv igb done It turned out that this could only be reproduce on 82580 (quad and dual-port), but not on 82576, i350 and i210. Further debugging showed that igb_enable_sriov()'s call to pci_enable_sriov() is failing, because dev->is_physfn is 0 on 82580. Prior to commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), igb_enable_sriov() jumped into the "err_out" cleanup branch. After this commit it only returned the error code. So the cleanup didn't take place, and the incorrect VF setup in the igb_adapter structure fooled the igb driver into assuming that VFs have been set up where no VF actually existed. Fix this problem by cleaning up again if pci_enable_sriov() fails. Fixes: 50f303496d92 ("igb: Enable SR-IOV after reinit") Signed-off-by: Corinna Vinschen <[email protected]> Reviewed-by: Akihiko Odaki <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-09-13ixgbe: fix timestamp configuration codeVadim Fedorenko1-13/+15
The commit in fixes introduced flags to control the status of hardware configuration while processing packets. At the same time another structure is used to provide configuration of timestamper to user-space applications. The way it was coded makes this structures go out of sync easily. The repro is easy for 82599 chips: [root@hostname ~]# hwstamp_ctl -i eth0 -r 12 -t 1 current settings: tx_type 0 rx_filter 0 new settings: tx_type 1 rx_filter 12 The eth0 device is properly configured to timestamp any PTPv2 events. [root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1 current settings: tx_type 1 rx_filter 12 SIOCSHWTSTAMP failed: Numerical result out of range The requested time stamping mode is not supported by the hardware. The error is properly returned because HW doesn't support all packets timestamping. But the adapter->flags is cleared of timestamp flags even though no HW configuration was done. From that point no RX timestamps are received by user-space application. But configuration shows good values: [root@hostname ~]# hwstamp_ctl -i eth0 current settings: tx_type 1 rx_filter 12 Fix the issue by applying new flags only when the HW was actually configured. Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Vadim Fedorenko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-09-13i2c: cadence: Fix the kernel-doc warningsShubhrajyoti Datta1-0/+1
This fixes the below warnings drivers/i2c/busses/i2c-cadence.c:221: warning: Function parameter or member 'rinfo' not described in 'cdns_i2c' Reviewed-by: Andi Shyti <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Shubhrajyoti Datta <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2023-09-13pmdomain: Rename the genpd subsystem to pmdomainUlf Hansson86-1/+1
It has been pointed out that naming a subsystem "genpd" isn't very self-explanatory and the acronym itself that means Generic PM Domain, is known only by a limited group of people. In a way to improve the situation, let's rename the subsystem to pmdomain, which ideally should indicate that this is about so called Power Domains or "PM domains" as we often also use within the Linux Kernel terminology. Suggested-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Acked-by: Heiko Stuebner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-09-13i2c: aspeed: Reset the i2c controller when timeout occursTommy Huang1-2/+5
Reset the i2c controller when an i2c transfer timeout occurs. The remaining interrupts and device should be reset to avoid unpredictable controller behavior. Fixes: 2e57b7cebb98 ("i2c: aspeed: Add multi-master use case support") Cc: <[email protected]> # v5.1+ Signed-off-by: Tommy Huang <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2023-09-13i2c: I2C_MLXCPLD on ARM64 should depend on ACPIGeert Uytterhoeven1-2/+2
The "i2c_mlxcpld" platform device is only instantiated on X86 systems (through drivers/platform/x86/mlx-platform.c), or on ARM64 systems with ACPI (through drivers/platform/mellanox/nvsw-sn2201.c). Hence further restrict the dependency on ARM64 to ACPI, to prevent asking the user about this driver when configuring an ARM64 kernel without ACPI support. While at it, document in the Kconfig help text that the driver supports ARM64/ACPI based systems, too. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Vadim Pasternak <[email protected]> Acked-by: Andi Shyti <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2023-09-13i2c: Make I2C_ATR invisibleGeert Uytterhoeven1-1/+1
I2C Address Translator (ATR) support is not a stand-alone driver, but a library. All of its users select I2C_ATR. Hence there is no need for the user to enable this symbol manually, except when compile-testing. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Luca Ceresoli <[email protected]> Reviewed-by: Tomi Valkeinen <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2023-09-13w1: ds2482: Switch back to use struct i2c_driver's .probe()Uwe Kleine-König1-1/+1
After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Reviewed-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2023-09-12drm/amdkfd: Insert missing TLB flush on GFX10 and laterHarish Kasiviswanathan1-2/+1
Heavy-weight TLB flush is required after unmap on all GPUs for correctness and security. Signed-off-by: Harish Kasiviswanathan <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-09-12tpm: Fix typo in tpmrm class definitionJustin M. Forbes1-1/+1
Commit d2e8071bed0be ("tpm: make all 'class' structures const") unfortunately had a typo for the name on tpmrm. Fixes: d2e8071bed0b ("tpm: make all 'class' structures const") Signed-off-by: Justin M. Forbes <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>
2023-09-12nvme-pci: do not set the NUMA node of device if it has nonePratyush Yadav1-3/+0
If a device has no NUMA node information associated with it, the driver puts the device in node first_memory_node (say node 0). Not having a NUMA node and being associated with node 0 are completely different things and it makes little sense to mix the two. Signed-off-by: Pratyush Yadav <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2023-09-12veth: Update XDP feature set when bringing up deviceToke Høiland-Jørgensen1-0/+2
There's an early return in veth_set_features() if the device is in a down state, which leads to the XDP feature flags not being updated when enabling GRO while the device is down. Which in turn leads to XDP_REDIRECT not working, because the redirect code now checks the flags. Fix this by updating the feature flags after bringing the device up. Before this patch: NETDEV_XDP_ACT_BASIC: yes NETDEV_XDP_ACT_REDIRECT: yes NETDEV_XDP_ACT_NDO_XMIT: no NETDEV_XDP_ACT_XSK_ZEROCOPY: no NETDEV_XDP_ACT_HW_OFFLOAD: no NETDEV_XDP_ACT_RX_SG: yes NETDEV_XDP_ACT_NDO_XMIT_SG: no After this patch: NETDEV_XDP_ACT_BASIC: yes NETDEV_XDP_ACT_REDIRECT: yes NETDEV_XDP_ACT_NDO_XMIT: yes NETDEV_XDP_ACT_XSK_ZEROCOPY: no NETDEV_XDP_ACT_HW_OFFLOAD: no NETDEV_XDP_ACT_RX_SG: yes NETDEV_XDP_ACT_NDO_XMIT_SG: yes Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-09-12driver core: return an error when dev_set_name() hasn't happenedAndy Shevchenko1-0/+2
The commit d21fdd07cea4 ("driver core: Return proper error code when dev_set_name() fails") rewrote the logic of handling the dev_set_name() error codes, but missed the point that initially set error value to -EINVAL might be rewritten and hence the error path can't be triggered at some circumstances. To fix this, make sure that error variable is set to -EINVAL when other conditionals are false. Reported-by: [email protected] Fixes: d21fdd07cea4 ("driver core: Return proper error code when dev_set_name() fails") Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-09-12Revert "comedi: add HAS_IOPORT dependencies"Ian Abbott1-68/+35
This reverts commit b5c75b68b7ded84d4c82118974ce3975a4dcaa74. The commit makes it impossible to select configuration options that depend on COMEDI_8254, COMEDI_DAS08, COMEDI_NI_LABPC, or COMEDI_AMPLC_DIO200 options due to changing 'select' directives to 'depends on' directives and there being no other way to select those codependent configuration options. Fixes: b5c75b68b7de ("comedi: add HAS_IOPORT dependencies") Cc: Niklas Schnelle <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: <[email protected]> # v6.5+ Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Ian Abbott <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-09-12net: macb: fix sleep inside spinlockSascha Hauer1-2/+3
macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() which can sleep. This results in: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 | pps pps1: new PPS source ptp1 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 | preempt_count: 1, expected: 0 | RCU nest depth: 0, expected: 0 | 4 locks held by kworker/u4:3/40: | #0: ffff000003409148 | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. | ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 | #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac | irq event stamp: 113998 | hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 | softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 | Hardware name: ... ZynqMP ... (DT) | Workqueue: events_power_efficient phylink_resolve | Call trace: | dump_backtrace+0x98/0xf0 | show_stack+0x18/0x24 | dump_stack_lvl+0x60/0xac | dump_stack+0x18/0x24 | __might_resched+0x144/0x24c | __might_sleep+0x48/0x98 | __mutex_lock+0x58/0x7b0 | mutex_lock_nested+0x24/0x30 | clk_prepare_lock+0x4c/0xa8 | clk_set_rate+0x24/0x8c | macb_mac_link_up+0x25c/0x2ac | phylink_resolve+0x178/0x4e8 | process_one_work+0x1ec/0x51c | worker_thread+0x1ec/0x3e4 | kthread+0x120/0x124 | ret_from_fork+0x10/0x20 The obvious fix is to move the call to macb_set_tx_clk() out of the protected area. This seems safe as rx and tx are both disabled anyway at this point. It is however not entirely clear what the spinlock shall protect. It could be the read-modify-write access to the NCFGR register, but this is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well without holding the spinlock. It could also be the register accesses done in mog_init_rings() or macb_init_buffers(), but again these functions are called without holding the spinlock in macb_hresp_error_task(). The locking seems fishy in this driver and it might deserve another look before this patch is applied. Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()") Signed-off-by: Sascha Hauer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-09-12drm/i915: Only check eDP HPD when AUX CH is sharedVille Syrjälä3-1/+28
Apparently Acer Chromebook C740 (BDW-ULT) doesn't have the eDP HPD line properly connected, and thus fails the new HPD check during eDP probe. The result is that we lose the eDP output. I suspect all such machines would be Chromebooks or other Linux exclusive systems as the Windows driver likely wouldn't work either. I did check a few other BDW machines here and those do have eDP HPD connected, one of them even is a different Chromebook (Samus). To account for these funky machines let's skip the HPD check when it looks like the eDP port is the only one using that specific AUX channel. In case of multiple ports sharing the same AUX CH (eg. on Asrock B250M-HDV) we still do the check and thus should correctly ignore the eDP port in favor of the other DP port (usually a DP->VGA converter). v2: Don't oops during list iteration Cc: [email protected] Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9264 Fixes: cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Luca Coelho <[email protected]> (cherry picked from commit 70052100fabec5d8c1b09c9959817a2f4517e6b5) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-09-12Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann6940-109305/+239077
Forwarding to v6.6-rc1. Signed-off-by: Thomas Zimmermann <[email protected]>
2023-09-11drm/amd/display: Fix 2nd DPIA encoder AssignmentMustapha Ghaddar1-3/+1
[HOW & Why] There seems to be an issue with 2nd DPIA acquiring link encoder for tiled displays. Solution is to remove check for eng_id before we get first dynamic encoder for it Reviewed-by: Cruise Hung <[email protected]> Reviewed-by: Meenakshikumar Somasundaram <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Acked-by: Stylon Wang <[email protected]> Signed-off-by: Mustapha Ghaddar <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Add DPIA Link Encoder Assignment FixMustapha Ghaddar5-6/+58
For DPIA we should have preferred DIG assignment based on DPIA selected as per the ASIC design. Reviewed-by: George Shen <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Mustapha Ghaddar <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-09-11drm/amd/display: fix replay_mode kernel-doc warningRandy Dunlap1-1/+1
Fix the typo in the kernel-doc for @replay_mode to prevent kernel-doc warnings: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:623: warning: Incorrect use of kernel-doc format: * @replay mode: Replay supported drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:626: warning: Function parameter or member 'replay_mode' not described in 'amdgpu_hdmi_vsdb_info' Fixes: ec8e59cb4e0c ("drm/amd/display: Get replay info from VSDB") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Bhawanpreet Lakha <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Leo Li <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: Handle null atom context in VBIOS info ioctlDavid Francis1-6/+11
On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: David Francis <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Checkpoint and restore queues on GFX11David Francis1-0/+41
The code in kfd_mqd_manager_v11.c to support criu dump and restore of queue state was missing. Added it; should be equivalent to kfd_mqd_manager_v10.c. CC: Felix Kuehling <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: David Francis <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Adjust the MST resume flowWayne Lin1-13/+80
[Why] In drm_dp_mst_topology_mgr_resume() today, it will resume the mst branch to be ready handling mst mode and also consecutively do the mst topology probing. Which will cause the dirver have chance to fire hotplug event before restoring the old state. Then Userspace will react to the hotplug event based on a wrong state. [How] Adjust the mst resume flow as: 1. set dpcd to resume mst branch status 2. restore source old state 3. Do mst resume topology probing For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to pull out topology probing work into a 2nd part procedure of the mst resume. Will have a follow up patch in drm. Reviewed-by: Chao-kai Wang <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Acked-by: Stylon Wang <[email protected]> Signed-off-by: Wayne Lin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: fallback to old RAS error message for aqua_vanjaramHawking Zhang1-2/+4
So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOVAlex Deucher1-0/+3
Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu/soc21: don't remap HDP registers for SR-IOVAlex Deucher1-1/+1
This matches the behavior for soc15 and nv. Acked-by: Christian König <[email protected]> Reviewed-by: Timmy Tsai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Don't check registers, if using AUX BL controlSwapnil Patel1-1/+3
[Why] Currently the driver looks DCN registers to access if BL is on or not. This check is not valid if we are using AUX based brightness control. This causes driver to not send out "backlight off" command during power off sequence as it already thinks it is off. [How] Only check DCN registers if we aren't using AUX based brightness control. Reviewed-by: Wenjing Liu <[email protected]> Acked-by: Stylon Wang <[email protected]> Signed-off-by: Swapnil Patel <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: fix retry loop testDan Carpenter1-1/+1
This loop will exit with "retry" set to -1 if it fails but the code checks for if "retry" is zero. Fix this by changing post-op to a pre-op. --retry vs retry--. Fixes: e01eeffc3f86 ("drm/amd/pm: avoid driver getting empty metrics table for the first time") Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: Add dirty rect support for ReplayBhawanpreet Lakha1-1/+2
Dirty rect can be used with replay, so enable them to allow for more powersaving. Reviewed-by: Sun peng Li <[email protected]> Acked-by: Stylon Wang <[email protected]> Signed-off-by: Bhawanpreet Lakha <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory"Hamza Mahfooz3-29/+3
This reverts commit 70e64c4d522b732e31c6475a3be2349de337d321. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: [email protected] # 6.1+ Acked-by: Harry Wentland <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amd/display: fix the white screen issue when >= 64GB DRAMYifan Zhang1-5/+9
Dropping bit 31:4 of page table base is wrong, it makes page table base points to wrong address if phys addr is beyond 64GB; dropping page_table_start/end bit 31:4 is unnecessary since dcn20_vmid_setup will do that. Also, while we are at it, cleanup the assignments using upper_32_bits()/lower_32_bits() and AMDGPU_GPU_PAGE_SHIFT. Cc: [email protected] Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)") Acked-by: Harry Wentland <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yifan Zhang <[email protected]> Co-developed-by: Hamza Mahfooz <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Update CU masking for GFX 9.4.3Mukul Joshi7-28/+56
The CU mask passed from user-space will change based on different spatial partitioning mode. As a result, update CU masking code for GFX9.4.3 to work for all partitioning modes. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Update cache info reporting for GFX v9.4.3Mukul Joshi3-37/+51
Update cache info reporting in sysfs to report the correct number of CUs and associated cache information based on different spatial partitioning modes. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3Mukul Joshi14-65/+60
Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Fix unaligned 64-bit doorbell warningMukul Joshi1-0/+2
This patch fixes the following unaligned 64-bit doorbell warning seen when submitting packets on HIQ on GFX v9.4.3 by making the HIQ doorbell 64-bit aligned. The warning is seen when GPU is loaded in any mode other than SPX mode. [ +0.000301] ------------[ cut here ]------------ [ +0.000003] Unaligned 64-bit doorbell [ +0.000030] WARNING: /amdkfd/kfd_doorbell.c:339 write_kernel_doorbell64+0x72/0x80 [ +0.000003] RIP: 0010:write_kernel_doorbell64+0x72/0x80 [ +0.000004] RSP: 0018:ffffc90004287730 EFLAGS: 00010246 [ +0.000005] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ +0.000003] RDX: 0000000000000001 RSI: ffffffff82837c71 RDI: 00000000ffffffff [ +0.000003] RBP: ffffc90004287748 R08: 0000000000000003 R09: 0000000000000001 [ +0.000002] R10: 000000000000001a R11: ffff88a034008198 R12: ffffc900013bd004 [ +0.000003] R13: 0000000000000008 R14: ffffc900042877b0 R15: 000000000000007f [ +0.000003] FS: 00007fa8c7b62000(0000) GS:ffff889f88400000(0000) knlGS:0000000000000000 [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000056111c45aaf0 CR3: 00000001414f2002 CR4: 0000000000770ee0 [ +0.000003] PKRU: 55555554 [ +0.000002] Call Trace: [ +0.000004] <TASK> [ +0.000006] kq_submit_packet+0x45/0x50 [amdgpu] [ +0.000524] pm_send_set_resources+0x7f/0xc0 [amdgpu] [ +0.000500] set_sched_resources+0xe4/0x160 [amdgpu] [ +0.000503] start_cpsch+0x1c5/0x2a0 [amdgpu] [ +0.000497] kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu] [ +0.000743] amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu] [ +0.000602] amdgpu_device_init.cold+0x1813/0x2176 [amdgpu] [ +0.000684] ? pci_bus_read_config_word+0x4a/0x80 [ +0.000012] ? do_pci_enable_device+0xdc/0x110 [ +0.000008] amdgpu_driver_load_kms+0x1a/0x110 [amdgpu] [ +0.000545] amdgpu_pci_probe+0x197/0x400 [amdgpu] Fixes: c31866651086 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells") Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11drm/amdkfd: Fix reg offset for setting CWSR grace periodMukul Joshi7-16/+8
This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-11md/raid1: fix error: ISO C90 forbids mixed declarationsNigel Croxon1-2/+1
There is a compile error when this commit is added: md: raid1: fix potential OOB in raid1_remove_disk() drivers/md/raid1.c: In function 'raid1_remove_disk': drivers/md/raid1.c:1844:9: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 1844 |         struct raid1_info *p = conf->mirrors + number;     |         ^~~~~~ That's because the new code was inserted before the struct. The change is move the struct command above this commit. Fixes: 8b0472b50bcf ("md: raid1: fix potential OOB in raid1_remove_disk()") Signed-off-by: Nigel Croxon <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-09-11thermal: Constify the trip argument of the .get_trend() zone callbackRafael J. Wysocki2-2/+3
Add 'const' to the definition of the 'trip' argument of the .get_trend() thermal zone callback to indicate that the trip point passed to it should not be modified by it and adjust the callback functions implementing it, thermal_get_trend() in the ACPI thermal driver and __ti_thermal_get_trend(), accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Michal Wilczynski <[email protected]>