Age | Commit message | Author | Files | Lines
2022-07-19amt: add missing regeneration nonce logic in request logicTaehee Yoo1-0/+4
When the AMT gateway starts sending a new request message, it should regenerate the nonce. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-07-19amt: use READ_ONCE() in amt moduleTaehee Yoo1-7/+8
There are some data races in the amt module. amt->ready4, amt->ready6, and amt->status can be accessed concurrently without locks. So, use READ_ONCE() and WRITE_ONCE() for these accesses. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
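As a rough illustration of the access pattern described above, here is a minimal userspace sketch. The two macros are simplified stand-ins written for this example (they rely on the GNU `__typeof__` extension) and are not the kernel's definitions; `struct amt_like` and its helper functions are hypothetical names that only mirror the fields mentioned in the message.

```c
#include <stdbool.h>

/* Simplified userspace stand-ins, NOT the kernel definitions. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical struct that mirrors the field names from the message. */
struct amt_like {
    bool ready4;
    bool ready6;
    int status;
};

/* Writer side: store the new status with a single volatile access. */
static void amt_like_set_status(struct amt_like *amt, int status)
{
    WRITE_ONCE(amt->status, status);
}

/* Reader side: load the status with a single volatile access. */
static int amt_like_get_status(struct amt_like *amt)
{
    return READ_ONCE(amt->status);
}
```

The volatile casts only keep the compiler from tearing, fusing, or re-reading the access; they do not order it against other memory operations.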
2022-07-19amt: remove unnecessary locksTaehee Yoo1-27/+5
After the previous patch, the amt gateway handlers are run by a single thread, so most of the gateway locks are no longer needed. Remove them. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-07-19amt: use workqueue for gateway side message handlingTaehee Yoo2-15/+164
There are some synchronization issues (amt->status, amt->req_cnt, etc.) if the interface is in gateway mode, because gateway message handlers are processed concurrently. This patch applies a work queue for processing these messages instead of expanding the locking context. So, the purposes of this patch are to fix existing race conditions and to let the gateway validate its status more correctly. When the AMT gateway interface is created, it tries to establish a connection to the relay. The establishment step looks stateless, but it should be managed well. In order to handle messages in the gateway, it saves the current status (i.e. AMT_STATUS_XXX). This patch makes the gateway code run in a single thread. Now, all messages except multicast data are stored in the event queue (amt->events) when they are triggered (received or delay expired). Then, the single worker processes the stored messages asynchronously, one by one. The multicast data message type is still processed immediately. Now, amt->lock is only needed to access the event queue (amt->events) if the interface is in gateway mode. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-07-19net: dsa: vitesse-vsc73xx: silent spi_device_id warningsOleksij Rempel1-0/+10
Add spi_device_id entries to silence SPI warnings. Fixes: 5fa6863ba692 ("spi: Check we have a spi_device_id for each DT compatible") Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2022-07-19net: dsa: sja1105: silent spi_device_id warningsOleksij Rempel1-0/+16
Add spi_device_id entries to silence the following warnings:
SPI driver sja1105 has no spi_device_id for nxp,sja1105e
SPI driver sja1105 has no spi_device_id for nxp,sja1105t
SPI driver sja1105 has no spi_device_id for nxp,sja1105p
SPI driver sja1105 has no spi_device_id for nxp,sja1105q
SPI driver sja1105 has no spi_device_id for nxp,sja1105r
SPI driver sja1105 has no spi_device_id for nxp,sja1105s
SPI driver sja1105 has no spi_device_id for nxp,sja1110a
SPI driver sja1105 has no spi_device_id for nxp,sja1110b
SPI driver sja1105 has no spi_device_id for nxp,sja1110c
SPI driver sja1105 has no spi_device_id for nxp,sja1110d
Fixes: 5fa6863ba692 ("spi: Check we have a spi_device_id for each DT compatible") Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2022-07-19be2net: Fix buffer overflow in be_get_module_eepromHristo Venev3-18/+25
be_cmd_read_port_transceiver_data assumes that it is given a buffer that is at least PAGE_DATA_LEN long, or twice that if the module supports SFF 8472. However, this is not always the case. Fix this by passing the desired offset and length to be_cmd_read_port_transceiver_data so that we only copy the bytes once. Fixes: e36edd9d26cf ("be2net: add ethtool "-m" option support") Signed-off-by: Hristo Venev <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
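The idea of passing an offset and a length, so that only the requested bytes are copied, can be sketched in plain C as follows. This is not the be2net code: `copy_eeprom_window()` is a hypothetical helper, and the value 256 for `PAGE_DATA_LEN` is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_DATA_LEN 256   /* assumed page size, for illustration only */

/*
 * Hypothetical helper: copy exactly `len` bytes starting at `off` from a
 * transceiver page into the caller's buffer. Returns 0 on success and -1
 * if the requested window does not fit inside the page, so the caller's
 * buffer never has to be as large as a whole page.
 */
static int copy_eeprom_window(const unsigned char *page, size_t page_len,
                              size_t off, size_t len, unsigned char *dst)
{
    if (off > page_len || len > page_len - off)
        return -1;
    memcpy(dst, page + off, len);
    return 0;
}
```

Checking `len > page_len - off` rather than `off + len > page_len` avoids an unsigned overflow in the bounds test itself.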
2022-07-19gpio: pca953x: use the correct register address when regcache sync during initHaibo Chen1-4/+7
For regcache_sync_region, we need to use pca953x_recalc_addr() to get the real register address. Fixes: ec82d1eba346 ("gpio: pca953x: Zap ad-hoc reg_output cache") Fixes: 0f25fda840a9 ("gpio: pca953x: Zap ad-hoc reg_direction cache") Signed-off-by: Haibo Chen <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
2022-07-19gpio: pca953x: use the correct range when do regmap syncHaibo Chen1-6/+6
regmap will sync a range of registers; use the correct range here to make sure the sync does not touch other, unexpected registers. This was found on the pca9557pw on the imx8qxp/dxl evk board. This device supports 8 pins, so only one register (8 bits) is needed to cover the property settings of all 8 pins. But when syncing the output, we find that it actually updates two registers: the output register and the following register. Fixes: b76574300504 ("gpio: pca953x: Restore registers after suspend/resume cycle") Fixes: ec82d1eba346 ("gpio: pca953x: Zap ad-hoc reg_output cache") Fixes: 0f25fda840a9 ("gpio: pca953x: Zap ad-hoc reg_direction cache") Signed-off-by: Haibo Chen <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
2022-07-19gpio: pca953x: only use single read/write for No AI modeHaibo Chen1-0/+3
For devices that use no-AI mode (auto address increment not supported), only use single read/write when configuring the regmap. We hit an issue on the PCA9557PW on the i.MX8QXP/DXL evk board. This device does not support AI mode, but when doing the regmap sync, regmap will sync 3 bytes of data to register 1. Logically this means writing the first byte to register 1, the second byte to register 2, and the third byte to register 3. But because this device does not support AI mode, these three bytes are all written into register 1, one by one. The result is that the value of register 1 always equals the latest data, here the third byte, and nothing happens to register 2 and register 3. This is not what we expect. Fixes: 49427232764d ("gpio: pca953x: Perform basic regmap conversion") Signed-off-by: Haibo Chen <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
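The failure mode described above can be modelled with a small register array. This is only a toy simulation, assuming a device whose register pointer never advances during a multi-byte transfer; `bulk_write_no_ai()` and `single_writes()` are hypothetical names and do not correspond to regmap functions.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 4

/*
 * Multi-byte write as seen by a chip WITHOUT auto-increment: the register
 * pointer stays at `start`, so every byte lands in the same register.
 */
static void bulk_write_no_ai(uint8_t regs[NUM_REGS], size_t start,
                             const uint8_t *data, size_t len)
{
    if (start >= NUM_REGS)
        return;
    for (size_t i = 0; i < len; i++)
        regs[start] = data[i];
}

/* One single-register write per byte, each with its own explicit address. */
static void single_writes(uint8_t regs[NUM_REGS], size_t start,
                          const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len && start + i < NUM_REGS; i++)
        regs[start + i] = data[i];
}
```

With three bytes written to register 1, the first function leaves only the last byte in register 1 and registers 2 and 3 untouched, while the second places one byte in each register.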
2022-07-19clk: lan966x: Fix the lan966x clock gate register addressHerve Codina1-1/+1
The register address used for the clock gate register is the base register address coming from first reg map (ie. the generic clock registers) instead of the second reg map defining the clock gate register. Use the correct clock gate register address. Fixes: 5ad5915dea00 ("clk: lan966x: Extend lan966x clock driver for clock gating support") Signed-off-by: Herve Codina <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Claudiu Beznea <[email protected]> Tested-by: Michael Walle <[email protected]> Signed-off-by: Stephen Boyd <[email protected]>
2022-07-18net: stmmac: remove redunctant disable xPCS EEE callWong Vee Khee1-8/+0
Disable is done in stmmac_init_eee() on the event of MAC link down. Since setting enable/disable EEE via ethtool will eventually trigger a MAC down, remove this redundant call in stmmac_ethtool.c to avoid calling xpcs_config_eee() twice. Fixes: d4aeaed80b0e ("net: stmmac: trigger PCS EEE to turn off on link down") Signed-off-by: Wong Vee Khee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18Merge branch 'fix-2-dsa-issues-with-vlan_filtering_is_global'Jakub Kicinski1-3/+4
Vladimir Oltean says: ==================== Fix 2 DSA issues with vlan_filtering_is_global This patch set fixes 2 issues with vlan_filtering_is_global switches. Both are regressions introduced by refactoring commit d0004a020bb5 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core"), which wasn't tested on a wide enough variety of switches. Tested on the sja1105 driver. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filteringVladimir Oltean1-2/+3
The "ds" iterator variable used in dsa_port_reset_vlan_filtering() -> dsa_switch_for_each_port() overwrites the "dp" received as argument, which is later used to call dsa_port_vlan_filtering() proper. As a result, switches which do enter that code path (the ones with vlan_filtering_is_global=true) will dereference an invalid dp in dsa_port_reset_vlan_filtering() after leaving a VLAN-aware bridge. Use a dedicated "other_dp" iterator variable to avoid this from happening. Fixes: d0004a020bb5 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
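The clobbering described above can be reproduced in a few lines of standalone C. This is a simplified sketch, not the DSA code: `struct port`, `for_each_port()` and both functions are hypothetical, and the loop here ends with a NULL pointer, whereas the kernel's list iterators end with a different kind of invalid pointer.

```c
#include <stddef.h>

/* Hypothetical singly linked list of ports. */
struct port {
    int id;
    const struct port *next;
};

/* The iterator macro ASSIGNS to whatever variable it is given. */
#define for_each_port(p, head) \
    for ((p) = (head); (p); (p) = (p)->next)

/* Buggy: the argument `dp` doubles as the loop cursor and is overwritten. */
static const struct port *port_after_loop_buggy(const struct port *head,
                                                const struct port *dp)
{
    for_each_port(dp, head) {
        /* inspect the other ports of the switch here */
    }
    return dp;  /* no longer the caller's port: NULL in this model */
}

/* Fixed: a dedicated cursor leaves `dp` untouched. */
static const struct port *port_after_loop_fixed(const struct port *head,
                                                const struct port *dp)
{
    const struct port *other_dp;

    for_each_port(other_dp, head) {
        /* inspect the other ports of the switch here */
    }
    return dp;
}
```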
2022-07-18net: dsa: fix dsa_port_vlan_filtering when globalVladimir Oltean1-1/+1
The blamed refactoring commit changed a "port" iterator with "other_dp", but still looked at the slave_dev of the dp outside the loop, instead of other_dp->slave from the loop. As a result, dsa_port_vlan_filtering() would not call dsa_slave_manage_vlan_filtering() except for the port in cause, and not for all switch ports as expected. Fixes: d0004a020bb5 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core") Reported-by: Lucian Banu <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18ixgbe: Add locking to prevent panic when setting sriov_numvfs to zeroPiotr Skajewski3-0/+10
It is possible to disable VFs while the PF driver is processing requests from the VF driver. This can result in a panic. BUG: unable to handle kernel paging request at 000000000000106c PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I --------- - Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020 RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe] Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff 01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c 00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002 RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006 RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780 R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020 R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80 FS: 0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? ttwu_do_wakeup+0x19/0x140 ? try_to_wake_up+0x1cd/0x550 ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf] ixgbe_msix_other+0x17e/0x310 [ixgbe] __handle_irq_event_percpu+0x40/0x180 handle_irq_event_percpu+0x30/0x80 handle_irq_event+0x36/0x53 handle_edge_irq+0x82/0x190 handle_irq+0x1c/0x30 do_IRQ+0x49/0xd0 common_interrupt+0xf/0xf This can be eventually be reproduced with the following script: while : do echo 63 > /sys/class/net/<devname>/device/sriov_numvfs sleep 1 echo 0 > /sys/class/net/<devname>/device/sriov_numvfs sleep 1 done Add lock when disabling SR-IOV to prevent process VF mailbox communication. 
Fixes: d773d1310625 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned") Signed-off-by: Piotr Skajewski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18i40e: Fix erroneous adapter reinitialization during recovery processDawid Lukwinski1-8/+5
Fix an issue when driver incorrectly detects state of recovery process and erroneously reinitializes interrupts, which results in a kernel error and call trace message. The issue was caused by a combination of two factors: 1. Assuming the EMP reset issued after completing firmware recovery means the whole recovery process is complete. 2. Erroneous reinitialization of interrupt vector after detecting the above mentioned EMP reset. Fixes (1) by changing how recovery state change is detected and (2) by adjusting the conditional expression to ensure using proper interrupt reinitialization method, depending on the situation. Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support") Signed-off-by: Dawid Lukwinski <[email protected]> Signed-off-by: Jan Sokolowski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: ethernet: mtk_eth_soc: fix off by one check of ARRAY_SIZETom Rix1-1/+1
In mtk_wed_tx_ring_setup(.., int idx, ..), idx is used as an index here struct mtk_wed_ring *ring = &dev->tx_ring[idx]; The bounds of idx are checked here BUG_ON(idx > ARRAY_SIZE(dev->tx_ring)); If idx is the size of the array, it will pass this check and overflow. So change the check to >= . Fixes: 804775dfc288 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)") Signed-off-by: Tom Rix <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
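The difference between the two comparisons is easy to check in isolation. Below is a minimal sketch with a hypothetical `struct wed_dev_like` holding a four-element ring array; the `ARRAY_SIZE` macro is a plain userspace version without the kernel's type checking, and the two predicates return whether an index would be accepted by each form of the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Plain userspace version, without the kernel's array type check. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct ring_like { int placeholder; };

/* Hypothetical device with four TX rings; valid indices are 0..3. */
struct wed_dev_like {
    struct ring_like tx_ring[4];
};

/* Old check: rejects only idx > 4, so idx == 4 slips through. */
static bool idx_accepted_old(const struct wed_dev_like *dev, size_t idx)
{
    return !(idx > ARRAY_SIZE(dev->tx_ring));
}

/* Corrected check: rejects idx >= 4. */
static bool idx_accepted_new(const struct wed_dev_like *dev, size_t idx)
{
    return !(idx >= ARRAY_SIZE(dev->tx_ring));
}
```

For an index equal to the array size, the old predicate accepts the index and the new one rejects it; for every other index the two agree.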
2022-07-18Merge branch 'net-lan966x-fix-issues-with-mac-table'Jakub Kicinski1-32/+80
Horatiu Vultur says: ==================== net: lan966x: Fix issues with MAC table The patch series fixes 2 issues: - when an entry was forgotten, the irq thread was holding a spin lock and was also taking the rtnl_lock. - the access to the HW MAC table is indirect, so the access to the HW MAC table was not synchronized, which means that there could be race conditions. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: lan966x: Fix usage of lan966x->mac_lock when used by FDBHoratiu Vultur1-11/+23
When the SW bridge was trying to add/remove entries to/from HW, the access to HW was not protected by any lock. In this way, it was possible to have race conditions. Fix this by using the lan966x->mac_lock to protect parallel access to HW in these cases. Fixes: 25ee9561ec622 ("net: lan966x: More MAC table functionality") Signed-off-by: Horatiu Vultur <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: lan966x: Fix usage of lan966x->mac_lock inside lan966x_mac_irq_handlerHoratiu Vultur1-7/+12
The problem with this spin lock is that it was protecting only the list of MAC entries in SW and not the access to the MAC entries in HW. Because the access to HW is indirect, race conditions could happen, for example when SW adds an entry to the MAC table while the MAC irq handler is trying to read something from the MAC table. Update the code so that the access to MAC entries in HW is also protected by this lock. Fixes: 5ccd66e01cbef ("net: lan966x: add support for interrupts from analyzer") Signed-off-by: Horatiu Vultur <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: lan966x: Fix usage of lan966x->mac_lock when entry is removedHoratiu Vultur1-4/+20
To remove an entry from the MAC table, it is required first to set up the entry and then issue a command for the MAC to forget the entry. So if two threads remove an entry from the MAC table simultaneously, there is a race condition. Fix this by using lan966x->mac_lock to protect the HW access. Fixes: e18aba8941b40 ("net: lan966x: add mactable support") Signed-off-by: Horatiu Vultur <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: lan966x: Fix usage of lan966x->mac_lock when entry is addedHoratiu Vultur1-1/+7
To add an entry to the MAC table, it is required first to set up the entry and then issue a command for the MAC to learn the entry. So if two threads add an entry to the MAC table simultaneously, there is a race condition. Fix this by using lan966x->mac_lock to protect the HW access. Fixes: fc0c3fe7486f2 ("net: lan966x: Add function lan966x_mac_ip_learn()") Signed-off-by: Horatiu Vultur <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-18net: lan966x: Fix taking rtnl_lock while holding spin_lockHoratiu Vultur1-9/+18
When the HW deletes an entry in the MAC table, it generates an interrupt. The SW then goes through its own list of MAC entries, and if an entry is no longer found, it notifies the listeners about this. The problem is that while the SW goes through its own list it holds a spin lock (lan966x->mac_lock), and notifying the listeners takes the rtnl_lock, which is illegal while holding a spin lock. This is fixed by not notifying right away that an entry is deleted; instead, the entry is moved to a temporary list, and once all the entries have been checked, the listeners are notified about the entries on the temporary list. Fixes: 5ccd66e01cbe ("net: lan966x: add support for interrupts from analyzer") Signed-off-by: Horatiu Vultur <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
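The "collect under the lock, notify after unlocking" pattern from this message can be sketched without any kernel APIs. In the model below the spin lock is just a boolean flag and the notifier only counts calls; `struct mac_table_like`, `notify_deleted()` and `purge_stale()` are hypothetical names, and the array-based table is an assumption made to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical software MAC table guarded by a modelled spin lock. */
struct mac_table_like {
    int entries[MAX_ENTRIES];  /* MAC ids tracked in software */
    bool stale[MAX_ENTRIES];   /* true if hardware dropped the entry */
    size_t count;              /* must not exceed MAX_ENTRIES */
    bool lock_held;            /* models lan966x->mac_lock */
    int notified;              /* number of listener notifications */
    int bad_context;           /* notifications sent with the lock held */
};

/* Stand-in for a notifier that may sleep: the lock must not be held. */
static void notify_deleted(struct mac_table_like *t, int mac)
{
    (void)mac;
    if (t->lock_held)
        t->bad_context++;
    t->notified++;
}

static void purge_stale(struct mac_table_like *t)
{
    int tmp[MAX_ENTRIES];
    size_t n_tmp = 0, kept = 0, i;

    t->lock_held = true;                   /* models spin_lock() */
    for (i = 0; i < t->count; i++) {
        if (t->stale[i]) {
            tmp[n_tmp++] = t->entries[i];  /* defer the notification */
        } else {
            t->entries[kept] = t->entries[i];
            t->stale[kept] = false;
            kept++;
        }
    }
    t->count = kept;
    t->lock_held = false;                  /* models spin_unlock() */

    for (i = 0; i < n_tmp; i++)            /* lock released: safe to notify */
        notify_deleted(t, tmp[i]);
}
```

The temporary array plays the role of the temp list in the message: entries leave the shared table while the lock is held, and the potentially sleeping notification runs only after the lock is dropped.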
2022-07-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds5-52/+5
Pull rdma fixes from Jason Gunthorpe: "Two bug fixes for irdma: - x722 does not support 1GB pages, trying to configure them will corrupt the dma mapping - Fix a sleep while holding a spinlock" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/irdma: Fix sleep from invalid context BUG RDMA/irdma: Do not advertise 1GB page size for x722
2022-07-19pinctrl: armada-37xx: use raw spinlocks for regmap to avoid invalid wait contextVladimir Oltean1-6/+21
The irqchip->irq_set_type method is called by __irq_set_trigger() under the desc->lock raw spinlock. The armada-37xx implementation, armada_37xx_irq_set_type(), uses an MMIO regmap created by of_syscon_register(), which uses plain spinlocks (the kind that are sleepable on RT). Therefore, this is an invalid locking scheme for which we get a kernel splat stating just that ("[ BUG: Invalid wait context ]"), because the context in which the plain spinlock may sleep is atomic due to the raw spinlock. We need to go raw spinlocks all the way. Make this driver create its own MMIO regmap, with use_raw_spinlock=true, and stop relying on syscon to provide it. This patch depends on commit 67021f25d952 ("regmap: teach regmap to use raw spinlocks if requested in the config"). Cc: <[email protected]> # 5.15+ Fixes: 2f227605394b ("pinctrl: armada-37xx: Add irqchip support") Signed-off-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-07-19pinctrl: armada-37xx: make irq_lock a raw spinlock to avoid invalid wait contextVladimir Oltean1-19/+19
The irqchip->irq_set_type method is called by __irq_set_trigger() under the desc->lock raw spinlock. The armada-37xx implementation, armada_37xx_irq_set_type(), takes a plain spinlock, the kind that becomes sleepable on RT. Therefore, this is an invalid locking scheme for which we get a kernel splat stating just that ("[ BUG: Invalid wait context ]"), because the context in which the plain spinlock may sleep is atomic due to the raw spinlock. We need to go raw spinlocks all the way. Replace the driver's irq_lock with a raw spinlock, to disable preemption even on RT. Cc: <[email protected]> # 5.15+ Fixes: 2f227605394b ("pinctrl: armada-37xx: Add irqchip support") Signed-off-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-07-18Revert "ocfs2: mount shared volume without ha stack"Junxiao Bi3-51/+20
This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144. This commit introduced a regression that can cause mount hung. The changes in __ocfs2_find_empty_slot causes that any node with none-zero node number can grab the slot that was already taken by node 0, so node 1 will access the same journal with node 0, when it try to grab journal cluster lock, it will hung because it was already acquired by node 0. It's very easy to reproduce this, in one cluster, mount node 0 first, then node 1, you will see the following call trace from node 1. [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds. [13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2 [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000 [13148.749354] Call Trace: [13148.750718] <TASK> [13148.752019] ? usleep_range+0x90/0x89 [13148.753882] __schedule+0x210/0x567 [13148.755684] schedule+0x44/0xa8 [13148.757270] schedule_timeout+0x106/0x13c [13148.759273] ? __prepare_to_swait+0x53/0x78 [13148.761218] __wait_for_common+0xae/0x163 [13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2] [13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2] [13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2] [13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2] [13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2] [13148.775401] ? iput+0x69/0xba [13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2] [13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2] [13148.781756] mount_bdev+0x190/0x1b7 [13148.783443] ? 
ocfs2_remount+0x440/0x440 [ocfs2] [13148.785634] legacy_get_tree+0x27/0x48 [13148.787466] vfs_get_tree+0x25/0xd0 [13148.789270] do_new_mount+0x18c/0x2d9 [13148.791046] __x64_sys_mount+0x10e/0x142 [13148.792911] do_syscall_64+0x3b/0x89 [13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0 [13148.797051] RIP: 0033:0x7f2309f6e26e [13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e [13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0 [13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820 [13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420 [13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000 [13148.816564] </TASK> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit introduced the feature to mount ocfs2 locally even it is cluster based, that is a very dangerous, it can easily cause serious data corruption, there is no way to stop other nodes mounting the fs and corrupting it. Setup ha or other cluster-aware stack is just the cost that we have to take for avoiding corruption, otherwise we have to do it in kernel. Link: https://lkml.kernel.org/r/[email protected] Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack") Signed-off-by: Junxiao Bi <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pteMiaohe Lin1-0/+1
When alloc_huge_page fails, *pagep is set to NULL without put_page first. So the hugepage indicated by *pagep is leaked. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY") Signed-off-by: Miaohe Lin <[email protected]> Acked-by: Muchun Song <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18fs: sendfile handles O_NONBLOCK of out_fdAndrei Vagin1-0/+3
sendfile has to return EAGAIN if out_fd is nonblocking and the write into it would block. Here is a small reproducer for the problem: #define _GNU_SOURCE /* See feature_test_macros(7) */ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <errno.h> #include <sys/stat.h> #include <sys/types.h> #include <sys/sendfile.h> #define FILE_SIZE (1UL << 30) int main(int argc, char **argv) { int p[2], fd; if (pipe2(p, O_NONBLOCK)) return 1; fd = open(argv[1], O_RDWR | O_TMPFILE, 0666); if (fd < 0) return 1; ftruncate(fd, FILE_SIZE); if (sendfile(p[1], fd, 0, FILE_SIZE) == -1) { fprintf(stderr, "FAIL\n"); } if (sendfile(p[1], fd, 0, FILE_SIZE) != -1 || errno != EAGAIN) { fprintf(stderr, "FAIL\n"); } return 0; } It worked before b964bf53e540, it is stuck after b964bf53e540, and it works again with this fix. This regression occurred because do_splice_direct() calls pipe_write that handles O_NONBLOCK. Here is a trace log from the reproducer: 1) | __x64_sys_sendfile64() { 1) | do_sendfile() { 1) | __fdget() 1) | rw_verify_area() 1) | __fdget() 1) | rw_verify_area() 1) | do_splice_direct() { 1) | rw_verify_area() 1) | splice_direct_to_actor() { 1) | do_splice_to() { 1) | rw_verify_area() 1) | generic_file_splice_read() 1) + 74.153 us | } 1) | direct_splice_actor() { 1) | iter_file_splice_write() { 1) | __kmalloc() 1) 0.148 us | pipe_lock(); 1) 0.153 us | splice_from_pipe_next.part.0(); 1) 0.162 us | page_cache_pipe_buf_confirm(); ... 16 times 1) 0.159 us | page_cache_pipe_buf_confirm(); 1) | vfs_iter_write() { 1) | do_iter_write() { 1) | rw_verify_area() 1) | do_iter_readv_writev() { 1) | pipe_write() { 1) | mutex_lock() 1) 0.153 us | mutex_unlock(); 1) 1.368 us | } 1) 1.686 us | } 1) 5.798 us | } 1) 6.084 us | } 1) 0.174 us | kfree(); 1) 0.152 us | pipe_unlock(); 1) + 14.461 us | } 1) + 14.783 us | } 1) 0.164 us | page_cache_pipe_buf_release(); ... 16 times 1) 0.161 us | page_cache_pipe_buf_release(); 1) | touch_atime() 1) + 95.854 us | } 1) + 99.784 us | } 1) ! 
107.393 us | } 1) ! 107.699 us | } Link: https://lkml.kernel.org/r/[email protected] Fixes: b964bf53e540 ("teach sendfile(2) to handle send-to-pipe directly") Signed-off-by: Andrei Vagin <[email protected]> Cc: Al Viro <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18ntfs: fix use-after-free in ntfs_ucsncmp()ChenXiaoSong1-2/+6
Syzkaller reported use-after-free bug as follows: ================================================================== BUG: KASAN: use-after-free in ntfs_ucsncmp+0x123/0x130 Read of size 2 at addr ffff8880751acee8 by task a.out/879 CPU: 7 PID: 879 Comm: a.out Not tainted 5.19.0-rc4-next-20220630-00001-gcc5218c8bd2c-dirty #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x1c0/0x2b0 print_address_description.constprop.0.cold+0xd4/0x484 print_report.cold+0x55/0x232 kasan_report+0xbf/0xf0 ntfs_ucsncmp+0x123/0x130 ntfs_are_names_equal.cold+0x2b/0x41 ntfs_attr_find+0x43b/0xb90 ntfs_attr_lookup+0x16d/0x1e0 ntfs_read_locked_attr_inode+0x4aa/0x2360 ntfs_attr_iget+0x1af/0x220 ntfs_read_locked_inode+0x246c/0x5120 ntfs_iget+0x132/0x180 load_system_files+0x1cc6/0x3480 ntfs_fill_super+0xa66/0x1cf0 mount_bdev+0x38d/0x460 legacy_get_tree+0x10d/0x220 vfs_get_tree+0x93/0x300 do_new_mount+0x2da/0x6d0 path_mount+0x496/0x19d0 __x64_sys_mount+0x284/0x300 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f3f2118d9ea Code: 48 8b 0d a9 f4 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 76 f4 0b 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc269deac8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3f2118d9ea RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffc269dec00 RBP: 00007ffc269dec80 R08: 00007ffc269deb00 R09: 00007ffc269dec44 R10: 0000000000000000 R11: 0000000000000202 R12: 000055f81ab1d220 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> The buggy address belongs to the physical page: page:0000000085430378 refcount:1 mapcount:1 mapping:0000000000000000 index:0x555c6a81d pfn:0x751ac memcg:ffff888101f7e180 anon flags: 
0xfffffc00a0014(uptodate|lru|mappedtodisk|swapbacked|node=0|zone=1|lastcpupid=0x1fffff) raw: 000fffffc00a0014 ffffea0001bf2988 ffffea0001de2448 ffff88801712e201 raw: 0000000555c6a81d 0000000000000000 0000000100000000 ffff888101f7e180 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880751acd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8880751ace00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8880751ace80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ ffff8880751acf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8880751acf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== The reason is that struct ATTR_RECORD->name_offset is 6485, end address of name string is out of bounds. Fix this by adding sanity check on end address of attribute name string. [[email protected]: coding-style cleanups] [[email protected]: cleanup suggested by Hawkins Jiawei] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: ChenXiaoSong <[email protected]> Signed-off-by: Hawkins Jiawei <[email protected]> Cc: Anton Altaparmakov <[email protected]> Cc: ChenXiaoSong <[email protected]> Cc: Yongqiang Liu <[email protected]> Cc: Zhang Yi <[email protected]> Cc: Zhang Xiaoxu <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18secretmem: fix unhandled fault in truncateMike Rapoport1-7/+26
syzkaller reports the following issue: BUG: unable to handle page fault for address: ffff888021f7e005 PGD 11401067 P4D 11401067 PUD 11402067 PMD 21f7d063 PTE 800fffffde081060 Oops: 0002 [#1] PREEMPT SMP KASAN CPU: 0 PID: 3761 Comm: syz-executor281 Not tainted 5.19.0-rc4-syzkaller-00014-g941e3e791269 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:64 Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 RSP: 0018:ffffc9000329fa90 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000ffb RDX: 0000000000000ffb RSI: 0000000000000000 RDI: ffff888021f7e005 RBP: ffffea000087df80 R08: 0000000000000001 R09: ffff888021f7e005 R10: ffffed10043efdff R11: 0000000000000000 R12: 0000000000000005 R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000ffb FS: 00007fb29d8b2700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff888021f7e005 CR3: 0000000026e7b000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> zero_user_segments include/linux/highmem.h:272 [inline] folio_zero_range include/linux/highmem.h:428 [inline] truncate_inode_partial_folio+0x76a/0xdf0 mm/truncate.c:237 truncate_inode_pages_range+0x83b/0x1530 mm/truncate.c:381 truncate_inode_pages mm/truncate.c:452 [inline] truncate_pagecache+0x63/0x90 mm/truncate.c:753 simple_setattr+0xed/0x110 fs/libfs.c:535 secretmem_setattr+0xae/0xf0 mm/secretmem.c:170 notify_change+0xb8c/0x12b0 fs/attr.c:424 do_truncate+0x13c/0x200 fs/open.c:65 do_sys_ftruncate+0x536/0x730 fs/open.c:193 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 
arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fb29d900899 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fb29d8b2318 EFLAGS: 00000246 ORIG_RAX: 000000000000004d RAX: ffffffffffffffda RBX: 00007fb29d988408 RCX: 00007fb29d900899 RDX: 00007fb29d900899 RSI: 0000000000000005 RDI: 0000000000000003 RBP: 00007fb29d988400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb29d98840c R13: 00007ffca01a23bf R14: 00007fb29d8b2400 R15: 0000000000022000 </TASK> Modules linked in: CR2: ffff888021f7e005 ---[ end trace 0000000000000000 ]--- Eric Biggers suggested that this happens when secretmem_setattr()->simple_setattr() races with secretmem_fault() so that a page that is faulted in by secretmem_fault() (and thus removed from the direct map) is zeroed by inode truncation right afterwards. Use mapping->invalidate_lock to make secretmem_fault() and secretmem_setattr() mutually exclusive. [[email protected]: v3] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Reported-by: [email protected] Signed-off-by: Mike Rapoport <[email protected]> Suggested-by: Eric Biggers <[email protected]> Reviewed-by: Axel Rasmussen <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()Naoya Horiguchi1-2/+7
Originally, copy_hugetlb_page_range() handled migration entries and hwpoisoned entries in a similar manner. Recently, however, the related code path gained more code for migration entries, and when is_writable_migration_entry() was converted to !is_readable_migration_entry(), hwpoison entries in source processes started being updated unexpectedly (which is legitimate for migration entries, but not for hwpoison entries). This results in serious issues such as a kernel panic when forking processes with hwpoison entries in the pmd. Separate the if branch into one for hwpoison entries and one for migration entries. Link: https://lkml.kernel.org/r/[email protected] Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive") Signed-off-by: Naoya Horiguchi <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Reviewed-by: Muchun Song <[email protected]> Cc: <[email protected]> [5.18] Cc: David Hildenbrand <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18mm: fix missing wake-up event for FSDAX pagesMuchun Song3-10/+16
FSDAX page refcounts are 1-based rather than 0-based: if the refcount is 1, then the page is freed. FSDAX pages can be pinned through GUP and are later unpinned via unpin_user_page(), which uses a folio variant to put the page. The folio variants did not consider this special case, so a wakeup event is missed (for example, by a user of __fuse_dax_break_layouts()). This leaves a task permanently stuck in TASK_INTERRUPTIBLE state. Since FSDAX pages can only be obtained by GUP users, fix GUP instead of folio_put() to lower overhead. Link: https://lkml.kernel.org/r/[email protected] Fixes: d8ddc099c6b3 ("mm/gup: Add gup_put_folio()") Signed-off-by: Muchun Song <[email protected]> Suggested-by: Matthew Wilcox <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: William Kucharski <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jan Kara <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18mm: fix page leak with multiple threads mapping the same pageJosef Bacik1-2/+5
We have an application with a lot of threads that use a shared mmap backed by tmpfs mounted with -o huge=within_size. This application started leaking loads of huge pages when we upgraded to a recent kernel. Using the page ref tracepoints and a BPF program written by Tejun Heo we were able to determine that these pages would have multiple refcounts from the page fault path, but when it came to unmap time we wouldn't drop the number of refs we had added from the faults. I wrote a reproducer that mmap'ed a file backed by tmpfs with -o huge=always, and then spawned 20 threads all looping and faulting random offsets in this map, while using madvise(MADV_DONTNEED) randomly for huge page aligned ranges. This very quickly reproduced the problem. The problem here is that we check for the case that we have multiple threads faulting in a range that was previously unmapped. One thread maps the PMD, the other thread loses the race and then returns 0. However at this point we already have the page, and we are no longer putting this page into the process's address space, and so we leak the page. We actually did the correct thing prior to f9ce0be71d1f, however it looks like Kirill copied what we do in the anonymous page case. In the anonymous page case we don't yet have a page, so we don't have to drop a reference on anything. Previously we did the correct thing for file based faults by returning VM_FAULT_NOPAGE so we correctly drop the reference on the page we faulted in. Fix this by returning VM_FAULT_NOPAGE in the pmd_devmap_trans_unstable() case; this makes us drop the ref on the page properly, and now my reproducer no longer leaks the huge pages. 
[[email protected]: v2] Link: https://lkml.kernel.org/r/e90c8f0dbae836632b669c2afc434006a00d4a67.1657721478.git.josef@toxicpanda.com Link: https://lkml.kernel.org/r/2b798acfd95c9ab9395fe85e8d5a835e2e10a920.1657051137.git.josef@toxicpanda.com Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Signed-off-by: Chris Mason <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18mailmap: update Seth Forshee's email addressSeth Forshee1-0/+1
[email protected] is no longer valid, use [email protected] instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Seth Forshee <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18tmpfs: fix the issue that the mount and remount results are inconsistent.ZhaoLong Wang1-5/+2
An undefined-behavior issue has not been completely fixed since commit d14f5efadd84 ("tmpfs: fix undefined-behaviour in shmem_reconfigure()"). That commit added a check to shmem_reconfigure() in the remount path to avoid the UBSAN problem. However, the check was not added to the mount path, which causes inconsistent results between mount and remount. The problem can be reproduced from user space as follows. If nr_blocks is set to 0x8000000000000000, the mount succeeds:
  # mount tmpfs /dev/shm/ -t tmpfs -o nr_blocks=0x8000000000000000
However, when -o remount is used, the mount fails because of the check in shmem_reconfigure():
  # mount tmpfs /dev/shm/ -t tmpfs -o remount,nr_blocks=0x8000000000000000
  mount: /dev/shm: mount point not mounted or bad option.
Therefore, add the check to shmem_parse_one() and remove the check from shmem_reconfigure() to avoid this problem. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: ZhaoLong Wang <[email protected]> Cc: Luo Meng <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Yu Kuai <[email protected]> Cc: Zhihao Cheng <[email protected]> Cc: Zhang Yi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
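The idea of validating the value once, at parse time, so that mount and remount reject the same input, can be sketched in userspace C. This is an illustration only: `demo_parse_nr_blocks()` is a hypothetical helper, it uses `strtoull()` rather than the kernel's own option parsing, and the signed 64-bit limit is an assumption inferred from the failing value in the reproducer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse an nr_blocks-style option value. Returns 0 and stores the value
 * in *out on success, or -EINVAL if the string is malformed or the value
 * does not fit in a signed 64-bit integer.
 */
static int demo_parse_nr_blocks(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtoull(s, &end, 0);          /* base 0 accepts "0x..." input */
    if (end == s || *end != '\0' || errno == ERANGE)
        return -EINVAL;
    if (v > (unsigned long long)INT64_MAX)
        return -EINVAL;
    *out = (uint64_t)v;
    return 0;
}
```

With the range check in the parsing step, any caller that goes through the parser sees the same result for 0x8000000000000000, so there is no separate check in a later step that one path could skip.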
2022-07-18mm: kfence: apply kmemleak_ignore_phys on early allocated poolYee Lee1-9/+9
This patch solves two issues. (1) The pool allocated by memblock needs to unregister from kmemleak scanning. Apply kmemleak_ignore_phys to replace the original kmemleak_free as its address now is stored in the phys tree. (2) The pool late allocated by page-alloc doesn't need to unregister. Move out the freeing operation from its call path. Link: https://lkml.kernel.org/r/[email protected] Fixes: 0c24e061196c21d5 ("mm: kmemleak: add rbtree and store physical address for objects allocated with PA") Signed-off-by: Yee Lee <[email protected]> Suggested-by: Catalin Marinas <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Suggested-by: Marco Elver <[email protected]> Reviewed-by: Marco Elver <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-07-18Merge tag 'hte/for-5.19' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux Pull hardware timestamp fix from Thierry Reding: "A single fix for an out-of-sync kerneldoc comment" * tag 'hte/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: gpiolib: cdev: Fix kernel doc for struct line
2022-07-18ACPI: CPPC: Don't require flexible address space if X86_FEATURE_CPPC is ↵Mario Limonciello1-2/+4
supported Commit 0651ab90e4ad ("ACPI: CPPC: Check _OSC for flexible address space") changed _CPC probing to require flexible address space to be negotiated for CPPC to work. However, this was observed to cause a regression on Arek's ROG Zephyrus G15 GA503QM, where CPPC previously worked but then stopped working. To avoid the regression, waive this failure when the CPU is known to support CPPC. Cc: Pierre Gondois <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216248 Fixes: 0651ab90e4ad ("ACPI: CPPC: Check _OSC for flexible address space") Reported-and-tested-by: Arek Ruśniak <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-07-18iavf: Fix missing state logsPrzemyslaw Patynowski1-0/+4
Fix debug prints by adding the missing state strings. Extend iavf_state_str with strings for __IAVF_INIT_EXTENDED_CAPS and __IAVF_INIT_CONFIG_ADAPTER. Without this patch, when enabling debug prints for iavf.h, the user will see: iavf 0000:06:0e.0: state transition from:__IAVF_INIT_GET_RESOURCES to:__IAVF_UNKNOWN_STATE iavf 0000:06:0e.0: state transition from:__IAVF_UNKNOWN_STATE to:__IAVF_UNKNOWN_STATE Fixes: 605ca7c5c670 ("iavf: Fix kernel BUG in free_msi_irqs") Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Jun Zhang <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
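A state-to-string helper of this kind typically falls back to an "unknown" string for any state it has no case for, which is how a missing entry shows up in the log. The sketch below is a hypothetical reduction: `enum demo_state` and `demo_state_str()` are illustrative names, and only the strings are taken from the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of driver states; the real enum has more members. */
enum demo_state {
    DEMO_INIT_GET_RESOURCES,
    DEMO_INIT_EXTENDED_CAPS,
    DEMO_INIT_CONFIG_ADAPTER,
    DEMO_STATE_COUNT
};

/* Map a state to its printable name, with a fallback for unlisted states. */
static const char *demo_state_str(enum demo_state state)
{
    switch (state) {
    case DEMO_INIT_GET_RESOURCES:  return "__IAVF_INIT_GET_RESOURCES";
    case DEMO_INIT_EXTENDED_CAPS:  return "__IAVF_INIT_EXTENDED_CAPS";
    case DEMO_INIT_CONFIG_ADAPTER: return "__IAVF_INIT_CONFIG_ADAPTER";
    default:                       return "__IAVF_UNKNOWN_STATE";
    }
}
```

Removing the two middle cases from the switch reproduces the symptom in this model: those states print as __IAVF_UNKNOWN_STATE.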
2022-07-18iavf: Fix handling of dummy receive descriptorsPrzemyslaw Patynowski1-3/+2
Fix memory leak caused by not handling dummy receive descriptor properly. iavf_get_rx_buffer now sets the rx_buffer return value for dummy receive descriptors. Without this patch, when the hardware writes a dummy descriptor, iavf would not free the page allocated for the previous receive buffer. This is an unlikely event but can still happen. [Jesse: massaged commit message] Fixes: efa14c398582 ("iavf: allow null RX descriptors") Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-07-18iavf: Disallow changing rx/tx-frames and rx/tx-frames-irqPrzemyslaw Patynowski4-13/+1
Remove ETHTOOL_COALESCE_MAX_FRAMES and ETHTOOL_COALESCE_MAX_FRAMES_IRQ from supported_coalesce_params. As tx-frames-irq allowed the user to change the budget for iavf_clean_tx_irq, remove work_limit and use a define for the budget. Without this patch, it would be possible to change rx/tx-frames and rx/tx-frames-irq: changing rx/tx-frames did nothing, while changing rx/tx-frames-irq changed rx/tx-frames and only changed the budget for cleaning in NAPI poll. Fixes: fbb7ddfef253 ("i40evf: core ethtool functionality") Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Jun Zhang <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-07-18iavf: Fix VLAN_V2 addition/rejectionPrzemyslaw Patynowski3-10/+74
Fix VLAN addition so that the PF driver does not reject the whole VLAN batch. Add VLAN reject handling so that rejected VLANs do not litter the VLAN filter list. Fix handling of active_(c/s)vlans so that the user can re-add VLAN filters. Without this patch, after changing trust to off with the VLAN filters saturated, no VLAN is added because the PF rejects the addition. Fixes: 92fc50859872 ("iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2") Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-07-18drm/amdgpu: Remove one duplicated ef removalxinhui pan1-6/+0
That has been done in BO release notify. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2074 Signed-off-by: xinhui pan <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-07-18x86/amd: Use IBPB for firmware callsPeter Zijlstra3-1/+13
On AMD IBRS does not prevent Retbleed; as such use IBPB before a firmware call to flush the branch history state. And because in order to do an EFI call, the kernel maps a whole lot of the kernel page table into the EFI page table, do an IBPB just in case in order to prevent the scenario of poisoning the BTB and causing an EFI call using the unprotected RET there. [ bp: Massage. ] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-07-18Merge branch 'dsa-docs'David S. Miller1-60/+303
Vladimir Oltean says: ==================== Update DSA documentation These are some updates of dsa.rst, since it hasn't kept up with development (in some cases, even since 2017). I've added Fixes: tags as I thought was appropriate. ==================== Signed-off-by: David S. Miller <[email protected]>
2022-07-18docs: net: dsa: mention that VLANs are now refcounted on shared portsVladimir Oltean1-1/+7
The blamed commit updated the way in which VLANs are handled at the cross-chip notifier layer and didn't update the documentation to say that. Fix it. Fixes: 134ef2388e7f ("net: dsa: add explicit support for host bridge VLANs") Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-07-18docs: net: dsa: delete misinformation about -EOPNOTSUPP for FDB/MDB/VLANVladimir Oltean1-9/+3
Returning -EOPNOTSUPP does *NOT* mean anything special. port_vlan_add() is actually called from 2 code paths, one is vlan_vid_add() from 8021q module and the other is br_switchdev_port_vlan_add() from switchdev. The bridge has a wrapper __vlan_vid_add() which first tries via switchdev, then if that returns -EOPNOTSUPP, tries again via the VLAN RX filters in the 8021q module. But DSA doesn't distinguish between one call path and the other when calling the driver's port_vlan_add(), so if the driver returns -EOPNOTSUPP to switchdev, it also returns -EOPNOTSUPP to the 8021q module. And the latter is a hard error. port_fdb_add() is called from the deferred dsa_owq only, so obviously its return code isn't propagated anywhere, and cannot be interpreted in any way. The return code from port_mdb_add() is propagated to the bridge, but again, this doesn't do anything special when -EOPNOTSUPP is returned, but rather, br_switchdev_mdb_notify() returns void. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
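The VLAN case above can be reduced to a small userspace model. Every name here is hypothetical and the functions are not the bridge, switchdev, or DSA code; the sketch only assumes what the message states, namely that the wrapper retries on -EOPNOTSUPP and that both attempts end up in the same driver callback.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical driver callback type; every name here is illustrative. */
typedef int (*demo_port_vlan_add_fn)(int vid);

/* A driver that answers -EOPNOTSUPP regardless of who is asking. */
static int demo_driver_unsupported(int vid)
{
    (void)vid;
    return -EOPNOTSUPP;
}

/* A driver that accepts the VLAN. */
static int demo_driver_ok(int vid)
{
    (void)vid;
    return 0;
}

/* Both entry points reach the same driver callback. */
static int demo_switchdev_add(demo_port_vlan_add_fn drv, int vid)
{
    return drv(vid);
}

static int demo_rx_filter_add(demo_port_vlan_add_fn drv, int vid)
{
    return drv(vid);
}

/* Bridge-style wrapper: try switchdev first, then fall back to RX filters. */
static int demo_bridge_vlan_add(demo_port_vlan_add_fn drv, int vid)
{
    int err = demo_switchdev_add(drv, vid);

    if (err == -EOPNOTSUPP)
        err = demo_rx_filter_add(drv, vid);   /* second attempt, same driver */
    return err;                               /* still -EOPNOTSUPP: hard error */
}
```

Because the fallback reaches the same callback, a driver that returns -EOPNOTSUPP gains nothing from the retry in this model; the error simply propagates to the caller.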
2022-07-18docs: net: dsa: re-explain what port_fdb_dump actually doesVladimir Oltean1-3/+6
Switchdev has changed radically from its initial implementation, and the currently provided definition is incorrect and very confusing. Rewrite it in light of what it actually does. Fixes: 2bedde1abbef ("net: dsa: Move FDB dump implementation inside DSA") Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>