2020-06-05  Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio  (Linus Torvalds, 10 files, -102/+1301)

Pull VFIO updates from Alex Williamson:

 - Block accesses to disabled MMIO space (Alex Williamson)
 - VFIO device migration API (Kirti Wankhede)
 - type1 IOMMU dirty bitmap API and implementation (Kirti Wankhede)
 - PCI NULL capability masking (Alex Williamson)
 - Memory leak fixes (Qian Cai)
 - Reference leak fix (Qiushi Wu)

* tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio:
  vfio iommu: typecast corrections
  vfio iommu: Use shift operation for 64-bit integer division
  vfio/mdev: Fix reference count leak in add_mdev_supported_type
  vfio: Selective dirty page tracking if IOMMU backed device pins pages
  vfio iommu: Add migration capability to report supported features
  vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap
  vfio iommu: Implementation of ioctl for dirty pages tracking
  vfio iommu: Add ioctl definition for dirty pages tracking
  vfio iommu: Cache pgsize_bitmap in struct vfio_iommu
  vfio iommu: Remove atomicity of ref_count of pinned pages
  vfio: UAPI for migration interface for device state
  vfio/pci: fix memory leaks of eventfd ctx
  vfio/pci: fix memory leaks in alloc_perm_bits()
  vfio-pci: Mask cap zero
  vfio-pci: Invalidate mmaps and block MMIO access on disabled memory
  vfio-pci: Fault mmaps to enable vma tracking
  vfio/type1: Support faulting PFNMAP vmas

2020-06-05  Merge tag 'core_core_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds, 4 files, -6/+71)

Pull READ_IMPLIES_EXEC changes from Borislav Petkov:

 "Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK now that toolchains long support PT_GNU_STACK marking and there's no need anymore to force modern programs into having all its user mappings executable instead of only the stack and the PROT_EXEC ones.

  Disable that automatic READ_IMPLIES_EXEC forcing on x86-64 and arm64.

  Add tables documenting how READ_IMPLIES_EXEC is handled on x86-64, arm and arm64.

  By Kees Cook"

* tag 'core_core_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64/elf: Disable automatic READ_IMPLIES_EXEC for 64-bit address spaces
  arm32/64/elf: Split READ_IMPLIES_EXEC from executable PT_GNU_STACK
  arm32/64/elf: Add tables to document READ_IMPLIES_EXEC
  x86/elf: Disable automatic READ_IMPLIES_EXEC on 64-bit
  x86/elf: Split READ_IMPLIES_EXEC from executable PT_GNU_STACK
  x86/elf: Add table to document READ_IMPLIES_EXEC

2020-06-05  cxgb4: Use kfree() instead of kvfree() where appropriate  (Denis Efremov, 1 file, -3/+3)

Use kfree(buf) in blocked_fl_read() because the memory is allocated with kzalloc(). Use kfree(t) in blocked_fl_write() because the memory is allocated with kcalloc().

Signed-off-by: Denis Efremov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

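A minimal sketch of the rule behind this fix, not the driver's actual code (the function name and sizes are made up): pair each allocation with the free routine that matches its allocator. kvfree() is meant for memory that may have come from kvmalloc(), while plain slab allocations from kzalloc()/kcalloc() belong to kfree().

    #include <linux/slab.h>
    #include <linux/mm.h>

    /* Illustrative only: match the free routine to the allocator. */
    static int example_alloc_free(size_t n)
    {
            char *small = kzalloc(n, GFP_KERNEL);        /* slab */
            char *large;

            if (!small)
                    return -ENOMEM;
            /* ... use small ... */
            kfree(small);                                /* slab -> kfree() */

            large = kvzalloc(n, GFP_KERNEL);             /* slab or vmalloc */
            if (!large)
                    return -ENOMEM;
            /* ... use large ... */
            kvfree(large);                               /* kvmalloc -> kvfree() */
            return 0;
    }
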
2020-06-05  net: qed: fixes crash while running driver in kdump kernel  (Alok Prasad, 3 files, -8/+8)

This fixes a crash introduced by the recent is_kdump_kernel() check. The source of the crash is that a kdump kernel can be loaded on a system with already created VFs, but for such VFs it will follow the logic path of a PF and eventually crash. Thus, we partially revert the previous changes and instead use is_kdump_kernel() at a single init point of PF init, where we disable SRIOV explicitly.

Fixes: 37d4f8a6b41f ("net: qed: Disable SRIOV functionality inside kdump kernel")
Cc: Bhupesh Sharma <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Alok Prasad <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2020-06-05  vsock/vmci: make vmci_vsock_transport_cb() static  (Stefano Garzarella, 1 file, -1/+1)

Fix the following gcc-9.3 warning when building with 'make W=1':

  net/vmw_vsock/vmci_transport.c:2058:6: warning: no previous prototype for ‘vmci_vsock_transport_cb’ [-Wmissing-prototypes]
   2058 | void vmci_vsock_transport_cb(bool is_host)
        |      ^~~~~~~~~~~~~~~~~~~~~~~

Fixes: b1bba80a4376 ("vsock/vmci: register vmci_transport only when VMCI guest/host are active")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Stefano Garzarella <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

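For illustration only (the function below is a stand-in, not the vsock code): a function referenced only within its own file has no prototype in any header, so W=1 flags it; declaring it static both silences -Wmissing-prototypes and documents that it has no external users.

    #include <linux/types.h>

    /* Before: a non-static definition with no header prototype triggers
     * -Wmissing-prototypes under 'make W=1'. After: static linkage. */
    static void example_transport_cb(bool is_host)
    {
            /* ... react to the guest or host side becoming active ... */
    }
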
2020-06-05  net: ethtool: Fix comment mentioning typo in IS_ENABLED()  (Kees Cook, 1 file, -1/+1)

This has no code changes, but it's a typo noticed in other clean-ups, so we might as well fix it. IS_ENABLED() takes full names, and should have the "CONFIG_" prefix.

Reported-by: Joe Perches <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

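A generic illustration of the rule (not the ethtool comment itself): IS_ENABLED() only works on full Kconfig symbol names, so the CONFIG_ prefix must be spelled out; with it, the check is true for both built-in (=y) and modular (=m) options.

    #include <linux/kconfig.h>
    #include <linux/printk.h>

    static void example_check(void)
    {
            /* Wrong: "MODULES" is not a full Kconfig symbol name as seen by
             * IS_ENABLED(), so this would always evaluate to false. */
            /* if (IS_ENABLED(MODULES)) ... */

            /* Right: full name with the CONFIG_ prefix. */
            if (IS_ENABLED(CONFIG_MODULES))
                    pr_info("loadable module support is enabled\n");
    }
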
2020-06-05  net: phy: mscc: fix Serdes configuration in vsc8584_config_init  (Antoine Tenart, 1 file, -2/+2)

When converting the MSCC PHY driver to shared PHY packages, the Serdes configuration in vsc8584_config_init was modified to use 'base_addr' instead of 'base' as the port number. But 'base_addr' isn't equal to 'addr' for all PHYs inside the package, which leads to the Serdes still being enabled on those ports. This patch fixes it.

Fixes: deb04e9c0ff2 ("net: phy: mscc: use phy_package_shared")
Signed-off-by: Antoine Tenart <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2020-06-05  Merge branch 'Fixes-for-OF_MDIO-flag'  (David S. Miller, 5 files, -6/+6)

Dan Murphy says:

====================
Fixes for OF_MDIO flag

There are some residual drivers that check the CONFIG_OF_MDIO flag using #ifdefs. That check does not work when OF_MDIO is configured as a module. Using the IS_ENABLED() macro checks whether the flag is enabled as built-in or as a module.
====================

Signed-off-by: David S. Miller <[email protected]>

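A generic sketch of the pattern this series moves to (the function below is made up, not one of the fixed drivers): an #ifdef CONFIG_OF_MDIO block is compiled out when OF_MDIO=m because only CONFIG_OF_MDIO_MODULE is defined in that case, whereas IS_ENABLED(CONFIG_OF_MDIO) covers both =y and =m.

    #include <linux/kconfig.h>
    #include <linux/phy.h>

    static void example_parse_dt(struct phy_device *phydev)
    {
    #ifdef CONFIG_OF_MDIO
            /* Before: this block disappears entirely when CONFIG_OF_MDIO=m. */
    #endif

            /* After: kept for both built-in and modular OF_MDIO. */
            if (IS_ENABLED(CONFIG_OF_MDIO)) {
                    /* ... read PHY properties from the device tree ... */
            }
    }
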
2020-06-05  net: mscc: Fix OF_MDIO config check  (Dan Murphy, 2 files, -3/+3)

When CONFIG_OF_MDIO is set to be a module, the code block is not compiled. Use the IS_ENABLED() macro, which checks for the option being built in as well as built as a module.

Fixes: 4f58e6dceb0e4 ("net: phy: Cleanup the Edge-Rate feature in Microsemi PHYs.")
Signed-off-by: Dan Murphy <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2020-06-05  net: marvell: Fix OF_MDIO config check  (Dan Murphy, 1 file, -1/+1)

When CONFIG_OF_MDIO is set to be a module, the code block is not compiled. Use the IS_ENABLED() macro, which checks for the option being built in as well as built as a module.

Fixes: cf41a51db8985 ("of/phylib: Use device tree properties to initialize Marvell PHYs.")
Signed-off-by: Dan Murphy <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2020-06-05  net: dp83867: Fix OF_MDIO config check  (Dan Murphy, 1 file, -1/+1)

When CONFIG_OF_MDIO is set to be a module, the code block is not compiled. Use the IS_ENABLED() macro, which checks for the option being built in as well as built as a module.

Fixes: 2a10154abcb75 ("net: phy: dp83867: Add TI dp83867 phy")
Signed-off-by: Dan Murphy <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2020-06-05  net: dp83869: Fix OF_MDIO config check  (Dan Murphy, 1 file, -1/+1)

When CONFIG_OF_MDIO is set to be a module, the code block is not compiled. Use the IS_ENABLED() macro, which checks for the option being built in as well as built as a module.

Fixes: 01db923e83779 ("net: phy: dp83869: Add TI dp83869 phy")
Signed-off-by: Dan Murphy <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2020-06-05  net: ethernet: mvneta: fix MVNETA_SKB_HEADROOM alignment  (Alexander Lobakin, 1 file, -1/+1)

Commit ca23cb0bc50f ("mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero") added a headroom alignment check against 8. However (if we imagine that NET_SKB_PAD or XDP_PACKET_HEADROOM is not aligned to the cacheline size), it actually aligns the headroom down, while the skb/xdp_buff headroom should be *at least* equal to one of the values (depending on XDP prog presence). So, fix the check to align the value up. This satisfies both hardware/driver and network stack requirements.

Fixes: ca23cb0bc50f ("mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero")
Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

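A worked illustration of the difference, with made-up numbers rather than the driver's macros: masking off the low three bits rounds down and can shrink the headroom below the required minimum, while ALIGN() rounds up and never does.

    #include <linux/kernel.h>       /* ALIGN() */

    /* Suppose the required headroom were 250 bytes. */
    #define EXAMPLE_REQUIRED_HEADROOM   250

    /* Rounding down: 250 & ~0x7 == 248, less than required. */
    #define EXAMPLE_HEADROOM_DOWN   (EXAMPLE_REQUIRED_HEADROOM & ~0x7)

    /* Rounding up: ALIGN(250, 8) == 256, still 8-byte aligned and never
     * smaller than the requirement. */
    #define EXAMPLE_HEADROOM_UP     ALIGN(EXAMPLE_REQUIRED_HEADROOM, 8)
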
2020-06-05  ethtool: linkinfo: remove an unnecessary NULL check  (Dan Carpenter, 1 file, -2/+1)

This code generates a Smatch warning:

  net/ethtool/linkinfo.c:143 ethnl_set_linkinfo()
  warn: variable dereferenced before check 'info' (see line 119)

Fortunately, the "info" pointer is never NULL so the check can be removed.

Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

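The warning class, shown on a made-up structure rather than the ethtool code: the pointer is dereferenced first and only tested afterwards, so either the dereference can fault or, as here, the later NULL check is dead code.

    #include <linux/errno.h>
    #include <linux/netdevice.h>

    struct example_info {
            struct net_device *dev;
    };

    static int example_set(struct example_info *info)
    {
            struct net_device *dev = info->dev;     /* dereference ...         */

            if (!info)                              /* ... checked afterwards: */
                    return -EINVAL;                 /* dead or too late        */

            return dev ? 0 : -ENODEV;
    }
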
2020-06-05  Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds, 343 files, -8589/+10395)

Pull powerpc updates from Michael Ellerman:

 - Support for userspace to send requests directly to the on-chip GZIP accelerator on Power9.
 - Rework of our lockless page table walking (__find_linux_pte()) to make it safe against parallel page table manipulations without relying on an IPI for serialisation.
 - A series of fixes & enhancements to make our machine check handling more robust.
 - Lots of plumbing to add support for "prefixed" (64-bit) instructions on Power10.
 - Support for using huge pages for the linear mapping on 8xx (32-bit).
 - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver.
 - Removal of some obsolete 40x platforms and associated cruft.
 - Initial support for booting on Power10.
 - Lots of other small features, cleanups & fixes.

Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram Sang, Xiongfeng Wang.

* tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits)
  powerpc/pseries: Make vio and ibmebus initcalls pseries specific
  cxl: Remove dead Kconfig options
  powerpc: Add POWER10 architected mode
  powerpc/dt_cpu_ftrs: Add MMA feature
  powerpc/dt_cpu_ftrs: Enable Prefixed Instructions
  powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
  powerpc: Add support for ISA v3.1
  powerpc: Add new HWCAP bits
  powerpc/64s: Don't set FSCR bits in INIT_THREAD
  powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
  powerpc/64s: Don't let DT CPU features set FSCR_DSCR
  powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
  powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
  powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel
  powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations
  powerpc/module_64: Consolidate ftrace code
  powerpc/32: Disable KASAN with pages bigger than 16k
  powerpc/uaccess: Don't set KUEP by default on book3s/32
  powerpc/uaccess: Don't set KUAP by default on book3s/32
  powerpc/8xx: Reduce time spent in allow_user_access() and friends
  ...

2020-06-05  Merge tag 'modules-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux  (Linus Torvalds, 1 file, -10/+40)

Pull module updates from Jessica Yu:

 - Harden CONFIG_STRICT_MODULE_RWX by rejecting any module that has SHF_WRITE|SHF_EXECINSTR sections
 - Remove and clean up nested #ifdefs, as it makes code hard to read

* tag 'modules-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Harden STRICT_MODULE_RWX
  module: break nested ARCH_HAS_STRICT_MODULE_RWX and STRICT_MODULE_RWX #ifdefs

2020-06-05  Merge branch 'gfs2-iopen' into for-next  (Andreas Gruenbacher, 10 files, -38/+295)

2020-06-05  gfs2: fix use-after-free on transaction ail lists  (Bob Peterson, 1 file, -2/+9)

Before this patch, transactions could be merged into the system transaction by function gfs2_merge_trans(), but the transaction ail lists were never merged. Because the ail flushing mechanism can run separately, bd elements can be attached to the transaction's buffer list during the transaction (trans_add_meta, etc) but quickly moved to its ail lists. Later, the transaction can be freed by gfs2_trans_end while it still has bd elements queued to its ail lists, which can cause it to either lose track of the bd elements altogether (memory leak) or, worse, reference the bd elements after the parent transaction has been freed.

Although I've not seen any serious consequences, the problem becomes apparent with the previous patch's addition of:

    gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));

to function gfs2_trans_free().

This patch adds logic into gfs2_merge_trans() to move the merged transaction's ail lists to the sdp transaction. This prevents the use-after-free. To do this properly, we need to hold the ail lock, so we pass sdp into the function instead of the transaction itself.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>

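A rough sketch of the idea, using the ail list names from the text but otherwise simplified (the lock handling and structure layout are assumptions, not the actual gfs2 code): splice the dying transaction's ail lists onto the surviving transaction under the ail lock, so no bd element is left pointing at freed memory.

    #include <linux/list.h>
    #include <linux/spinlock.h>

    /* Simplified stand-ins for the real gfs2 structures. */
    struct example_trans {
            struct list_head tr_ail1_list;
            struct list_head tr_ail2_list;
    };

    static void example_merge_ail(spinlock_t *ail_lock,
                                  struct example_trans *keep,
                                  struct example_trans *dying)
    {
            spin_lock(ail_lock);
            /* Move every queued bd element to the surviving transaction
             * before 'dying' is freed, so nothing references freed memory. */
            list_splice_tail_init(&dying->tr_ail1_list, &keep->tr_ail1_list);
            list_splice_tail_init(&dying->tr_ail2_list, &keep->tr_ail2_list);
            spin_unlock(ail_lock);
    }
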
2020-06-05  gfs2: new slab for transactions  (Bob Peterson, 6 files, -8/+32)

This patch adds a new slab for gfs2 transactions. That allows us to reduce kernel memory fragmentation and to better organize the data for analysis of vmcore dumps. A new centralized function is added to free the slab objects, and it exposes use-after-free by giving warnings if a transaction is freed while it still has bd elements attached to its buffers or ail lists. We make sure to initialize those transaction ail lists so we can check their integrity when freeing.

At a later time, we should add a slab initialization function to make it more efficient, but for this initial patch I wanted to minimize the impact.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>

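A minimal sketch of the slab pattern described here (the cache name and structure are placeholders, not the gfs2 code): create a dedicated kmem_cache at init time, allocate transactions from it, and funnel all frees through one helper that can sanity-check the object first.

    #include <linux/slab.h>
    #include <linux/list.h>
    #include <linux/bug.h>

    struct example_trans {
            struct list_head tr_ail1_list;
            struct list_head tr_ail2_list;
    };

    static struct kmem_cache *example_trans_cachep;

    static int __init example_init(void)
    {
            example_trans_cachep = KMEM_CACHE(example_trans, 0);
            return example_trans_cachep ? 0 : -ENOMEM;
    }

    static struct example_trans *example_trans_alloc(void)
    {
            struct example_trans *tr;

            tr = kmem_cache_zalloc(example_trans_cachep, GFP_NOFS);
            if (tr) {
                    INIT_LIST_HEAD(&tr->tr_ail1_list);
                    INIT_LIST_HEAD(&tr->tr_ail2_list);
            }
            return tr;
    }

    static void example_trans_free(struct example_trans *tr)
    {
            /* Central free point: a natural place to warn if the ail lists
             * are not empty (a use-after-free would follow otherwise). */
            WARN_ON(!list_empty(&tr->tr_ail1_list));
            WARN_ON(!list_empty(&tr->tr_ail2_list));
            kmem_cache_free(example_trans_cachep, tr);
    }
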
2020-06-05  gfs2: initialize transaction tr_ailX_lists earlier  (Bob Peterson, 3 files, -2/+4)

Since transactions may be freed shortly after they're created, before a log_flush occurs, we need to initialize their ail1 and ail2 lists earlier. Before this patch, the ail1 list was initialized in gfs2_log_flush(). This moves the initialization to the point when the transaction is first created.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>

2020-06-05  dm crypt: avoid truncating the logical block size  (Eric Biggers, 1 file, -1/+1)

queue_limits::logical_block_size got changed from unsigned short to unsigned int, but crypt_io_hints() was never updated to use the new type. Fix it.

Fixes: ad6bf88a6c19 ("block: fix an integer overflow in logical block size")
Cc: [email protected]
Signed-off-by: Eric Biggers <[email protected]>
Reviewed-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

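The shape of the bug, sketched outside the dm-crypt source (the function and variable names are illustrative): queue_limits::logical_block_size is now an unsigned int, so a max_t() that still names the old unsigned short type truncates the value to 16 bits before comparing.

    #include <linux/blkdev.h>
    #include <linux/kernel.h>

    static void example_io_hints(struct queue_limits *limits, unsigned int bs)
    {
            /* Before: any block size that does not fit in 16 bits is
             * silently truncated before the comparison. */
            limits->logical_block_size =
                    max_t(unsigned short, limits->logical_block_size, bs);

            /* After: compare and store in the field's actual type. */
            limits->logical_block_size =
                    max_t(unsigned int, limits->logical_block_size, bs);
    }
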
2020-06-05  dm mpath: add DM device name to Failing/Reinstating path log messages  (Mike Snitzer, 1 file, -2/+6)

When there are many DM multipath devices it really helps to have additional context for which DM device a failed or reinstated path is part of.

Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm mpath: enhance queue_if_no_path debugging  (Mike Snitzer, 1 file, -7/+23)

Add more DMDEBUG that shows the arguments passed and the caller, and another that shows the state of related flags at the end of queue_if_no_path(). Also add a queue_if_no_path DMDEBUG to multipath_resume().

Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm mpath: restrict queue_if_no_path state machine  (Mike Snitzer, 1 file, -10/+28)

Do not allow saving disabled queue_if_no_path if it is already saved as enabled; that implies multiple suspends (which shouldn't ever happen). Log if this unlikely scenario is ever triggered.

Also, only write MPATHF_SAVED_QUEUE_IF_NO_PATH during presuspend or when the "fail_if_no_path" message is received. MPATHF_SAVED_QUEUE_IF_NO_PATH is no longer always modified, e.g. even if queue_if_no_path()'s save_old_value argument wasn't set. This just implies a bit tighter control over the management of MPATHF_SAVED_QUEUE_IF_NO_PATH. A side-effect is that multipath_resume() doesn't reset MPATHF_QUEUE_IF_NO_PATH unless MPATHF_SAVED_QUEUE_IF_NO_PATH was set (during presuspend); and at that time the MPATHF_SAVED_QUEUE_IF_NO_PATH bit gets cleared. So MPATHF_SAVED_QUEUE_IF_NO_PATH's use is much more narrow in scope.

Last, but not least, do _not_ disable queue_if_no_path during noflush suspend. There is no need/benefit to saving off queue_if_no_path via MPATHF_SAVED_QUEUE_IF_NO_PATH and clearing MPATHF_QUEUE_IF_NO_PATH for noflush suspend -- by avoiding this needless queue_if_no_path flag churn there is less potential for MPATHF_QUEUE_IF_NO_PATH to get lost, which avoids the potential for IOs to be errored back up to userspace during DM multipath's handling of path failures.

That said, this last change papers over a reported issue concerning request-based dm-multipath's interaction with blk-mq, relative to suspend and resume: multipath_end_io is being called _before_ multipath_resume. This should never happen if DM suspend's blk_mq_quiesce_queue() + dm_wait_for_completion() is genuinely waiting for all inflight blk-mq requests to complete. Similarly, drivers/md/dm.c:__dm_resume() clearly calls dm_table_resume_targets() _before_ dm_start_queue()'s blk_mq_unquiesce_queue() is called. If the queue isn't even restarted until after multipath_resume(), the BIG question that still needs answering is: how can multipath_end_io beat multipath_resume in a race!?

Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm mpath: simplify __must_push_back  (Mike Snitzer, 1 file, -23/+5)

Remove micro-optimization that infers device is between presuspend and resume (was done purely to avoid call to dm_noflush_suspending, which isn't expensive anyway). Remove flags argument since they are no longer checked. And remove must_push_back_bio() since it was simply a call to __must_push_back().

Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: check superblock location  (Hannes Reinecke, 1 file, -1/+9)

When specifying several devices the superblock location must be checked to ensure the devices are specified in the correct order.

Signed-off-by: Hannes Reinecke <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: prefer full zones for reclaim  (Hannes Reinecke, 1 file, -1/+8)

Prefer full zones when selecting the next zone for reclaim.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: select reclaim zone based on device index  (Hannes Reinecke, 4 files, -32/+27)

Per-device reclaim should select zones on that device only.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: allocate zone by device index  (Hannes Reinecke, 3 files, -8/+15)

When allocating a zone, pass in an indicator on which device the zone should be allocated; this increases performance for a multi-device setup because reclaim will now allocate zones on the device for which reclaim is running.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: support arbitrary number of devices  (Hannes Reinecke, 2 files, -45/+74)

Remove the hard-coded limit of two devices and support an unlimited number of additional zoned devices.

Signed-off-by: Hannes Reinecke <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: move random and sequential zones into struct dmz_dev  (Hannes Reinecke, 4 files, -78/+119)

Random and sequential zones should be part of the respective device structure to make arbitration between devices possible.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: per-device reclaim  (Hannes Reinecke, 3 files, -57/+88)

Instead of having one reclaim workqueue for the entire set we should be allocating a reclaim workqueue per device; doing so will reduce contention and should boost performance for a multi-device setup.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: add metadata pointer to struct dmz_dev  (Hannes Reinecke, 2 files, -8/+13)

Add a metadata pointer within struct dmz_dev and use it as argument for blkdev_report_zones() instead of the metadata itself.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: add device pointer to struct dm_zone  (Hannes Reinecke, 4 files, -39/+19)

Add a pointer to the containing device within struct dm_zone and kill dmz_zone_to_dev().

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: allocate temporary superblock for tertiary devices  (Hannes Reinecke, 1 file, -48/+61)

Checking the tertiary superblock just consists of validating UUIDs, crcs, and the generation number; it doesn't have contents which would be required during the actual operation. So allocate a temporary superblock when checking tertiary devices to avoid having to store it together with the 'real' superblocks.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: convert to xarray  (Hannes Reinecke, 1 file, -32/+90)

The zones array is getting really large, and large arrays tend to wreak havoc with the CPU caches. So convert it to xarray to become more cache friendly.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Colin Ian King <[email protected]> # fix leak in dmz_insert
Signed-off-by: Mike Snitzer <[email protected]>

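A generic sketch of the conversion (the zone structure and names are placeholders, not the dm-zoned code): replace a flat zones[] array indexed by zone id with an xarray keyed by the same id, so lookups no longer depend on one huge contiguous allocation.

    #include <linux/xarray.h>

    struct example_zone {
            unsigned int id;
    };

    static DEFINE_XARRAY(example_zones);

    static int example_insert_zone(struct example_zone *zone)
    {
            /* xa_insert() fails with -EBUSY if the id is already taken. */
            return xa_insert(&example_zones, zone->id, zone, GFP_KERNEL);
    }

    static struct example_zone *example_get_zone(unsigned int id)
    {
            /* Lookup by zone id, replacing the old zones[id] indexing. */
            return xa_load(&example_zones, id);
    }
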
2020-06-05  dm zoned: add a 'reserved' zone flag  (Hannes Reinecke, 2 files, -2/+4)

Instead of counting the number of reserved zones in dmz_free_zone(), mark the zone as 'reserved' during allocation and simplify dmz_free_zone().

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: improve logging messages for reclaim  (Hannes Reinecke, 1 file, -3/+10)

Instead of just reporting the errno, add some more verbose debugging messages in the reclaim path.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: avoid unnecessary device recalculation for secondary superblock  (Hannes Reinecke, 1 file, -3/+2)

The secondary superblock must reside on the same device as the primary superblock, so there is no need to re-calculate the device.

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm zoned: add debugging message for reading superblocks  (Hannes Reinecke, 1 file, -0/+4)

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm ebs: use dm_bufio_forget_buffers  (Mikulas Patocka, 1 file, -2/+2)

Use dm_bufio_forget_buffers instead of a block-by-block loop that calls dm_bufio_forget. dm_bufio_forget_buffers is faster than the loop because it searches for used buffers using an rb-tree.

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

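Roughly the change described, in a reduced form; the range-based call comes from this patch set, and its exact signature (start block plus number of blocks) should be treated as an assumption rather than confirmed API documentation.

    #include <linux/dm-bufio.h>

    static void example_forget_range(struct dm_bufio_client *c,
                                     sector_t block, sector_t n_blocks)
    {
            sector_t i;

            /* Before: one lookup per block. */
            for (i = 0; i < n_blocks; i++)
                    dm_bufio_forget(c, block + i);

            /* After: a single call that searches the rb-tree for all used
             * buffers in the range at once. */
            dm_bufio_forget_buffers(c, block, n_blocks);
    }
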
2020-06-05  dm bufio: introduce forget_buffer_locked  (Mikulas Patocka, 2 files, -4/+63)

Introduce a function forget_buffer_locked that forgets a range of buffers. It is more efficient than calling forget_buffer in a loop.

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  dm bufio: clean up rbtree block ordering  (Mikulas Patocka, 1 file, -3/+3)

dm-bufio used unnatural ordering in the rb-tree: blocks with smaller numbers were put to the right node and blocks with bigger numbers were put to the left node. Reverse that logic so that it's natural.

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

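A generic sketch of natural rb-tree ordering for block numbers (simplified, not the dm-bufio code): smaller keys descend to the left, larger keys to the right.

    #include <linux/rbtree.h>
    #include <linux/types.h>

    struct example_buffer {
            sector_t block;
            struct rb_node node;
    };

    static void example_insert(struct rb_root *root, struct example_buffer *b)
    {
            struct rb_node **link = &root->rb_node, *parent = NULL;

            while (*link) {
                    struct example_buffer *found =
                            rb_entry(*link, struct example_buffer, node);

                    parent = *link;
                    /* Natural ordering: smaller block numbers go left,
                     * bigger ones go right. */
                    if (b->block < found->block)
                            link = &(*link)->rb_left;
                    else
                            link = &(*link)->rb_right;
            }

            rb_link_node(&b->node, parent, link);
            rb_insert_color(&b->node, root);
    }
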
2020-06-05  dm integrity: add status line documentation  (Mikulas Patocka, 1 file, -0/+8)

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

2020-06-05  gfs2: Smarter iopen glock waiting  (Andreas Gruenbacher, 2 files, -5/+40)

When trying to upgrade the iopen glock from a shared to an exclusive lock in gfs2_evict_inode, abort the wait if there is contention on the corresponding inode glock: in that case, the inode must still be in active use on another node, and we're not guaranteed to get the iopen glock anytime soon.

To make this work even better, when we notice contention on the iopen glock and we can't evict the corresponding inode and release the iopen glock immediately, poke the inode glock. The other node(s) trying to acquire the lock can then abort instead of timing out.

Thanks to Heinz Mauelshagen for pointing out a locking bug in a previous version of this patch.

Signed-off-by: Andreas Gruenbacher <[email protected]>

2020-06-05  gfs2: Wake up when setting GLF_DEMOTE  (Andreas Gruenbacher, 1 file, -4/+14)

Wake up the sdp->sd_async_glock_wait wait queue when setting the GLF_DEMOTE flag.

Signed-off-by: Andreas Gruenbacher <[email protected]>

2020-06-05  gfs2: Check inode generation number in delete_work_func  (Andreas Gruenbacher, 2 files, -2/+7)

In delete_work_func, if the iopen glock still has an inode attached, limit the inode lookup to that specific generation number: in the likely case that the inode was deleted on the node on which the inode's link count dropped to zero, we can skip verifying the on-disk block type and reading in the inode. The same applies if another node that had the inode open managed to delete the inode before us.

Signed-off-by: Andreas Gruenbacher <[email protected]>

2020-06-05  gfs2: Move inode generation number check into gfs2_inode_lookup  (Andreas Gruenbacher, 1 file, -9/+24)

Move the inode generation number check from gfs2_lookup_by_inum into gfs2_inode_lookup: gfs2_inode_lookup may be able to decide that an inode with the given inode generation number cannot exist without having to verify the block type or read the inode from disk.

Signed-off-by: Andreas Gruenbacher <[email protected]>

2020-06-05  gfs2: Minor gfs2_lookup_by_inum cleanup  (Andreas Gruenbacher, 4 files, -5/+14)

Use a zero no_formal_ino instead of a NULL pointer to indicate that any inode generation number will qualify: a valid inode never has a zero no_formal_ino.

Signed-off-by: Andreas Gruenbacher <[email protected]>

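The convention in a reduced form (the helper below is illustrative, not gfs2 code): a no_formal_ino of 0 acts as a wildcard, which is safe because no valid inode ever has a zero generation number.

    #include <linux/types.h>

    /* Return true if the inode's generation number is acceptable. A wanted
     * value of 0 means "any generation will do". */
    static bool example_generation_ok(u64 have_formal_ino, u64 want_formal_ino)
    {
            return !want_formal_ino || have_formal_ino == want_formal_ino;
    }
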
2020-06-05  gfs2: Try harder to delete inodes locally  (Andreas Gruenbacher, 1 file, -6/+47)

When an inode's link count drops to zero and the inode is cached on other nodes, the current behavior of gfs2 is to immediately give up and to rely on the other node(s) to delete the inode if there is iopen glock contention. This leads to resource group glock bouncing and the loss of caching.

With the previous patches in place, we can fix that by not giving up immediately. When the inode is still open on other nodes, those nodes won't be able to evict the inode and give up the iopen glock. In that case, our lock conversion request will time out. The unlink system call will block for the duration of the iopen lock conversion request. We're also holding the inode glock in EX mode for an extended duration, so other nodes won't be able to make progress on the inode, either.

This is worse than what we had before, but we can prevent other nodes from getting stuck by aborting our iopen locking request if there is contention on the inode glock. This will be the subject of a future patch.

Signed-off-by: Andreas Gruenbacher <[email protected]>
