aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2020-09-10block: add a bdev_check_media_change helperChristoph Hellwig1-1/+1
Like check_disk_changed, except that it does not call ->revalidate_disk but leaves that to the caller. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-07block: Do not discard buffers under a mounted filesystemJan Kara1-0/+7
Discarding blocks and buffers under a mounted filesystem is hardly anything admin wants to do. Usually it will confuse the filesystem and sometimes the loss of buffer_head state (including b_private field) can even cause crashes like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 4 PID: 203778 Comm: jbd2/dm-3-8 Kdump: loaded Tainted: G O --------- - - 4.18.0-147.5.0.5.h126.eulerosv2r9.x86_64 #1 Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015 RIP: 0010:jbd2_journal_grab_journal_head+0x1b/0x40 [jbd2] ... Call Trace: __jbd2_journal_insert_checkpoint+0x23/0x70 [jbd2] jbd2_journal_commit_transaction+0x155f/0x1b60 [jbd2] kjournald2+0xbd/0x270 [jbd2] So if we don't have block device open with O_EXCL already, claim the block device while we truncate buffer cache. This makes sure any exclusive block device user (such as filesystem) cannot operate on the device while we are discarding buffer cache. Reported-by: Ye Bin <[email protected]> Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> [axboe: fix !CONFIG_BLOCK error in truncate_bdev_range()] Signed-off-by: Jens Axboe <[email protected]>
2020-09-03blk-mq, elevator: Count requests per hctx to improve performanceKashyap Desai1-0/+4
High CPU utilization on "native_queued_spin_lock_slowpath" due to lock contention is possible for mq-deadline and bfq IO schedulers when nr_hw_queues is more than one. It is because kblockd work queue can submit IO from all online CPUs (through blk_mq_run_hw_queues()) even though only one hctx has pending commands. The elevator callback .has_work for mq-deadline and bfq scheduler considers pending work if there are any IOs on request queue but it does not account hctx context. Add a per-hctx 'elevator_queued' count to the hctx to avoid triggering the elevator even though there are no requests queued. [jpg: Relocated atomic_dec() in dd_dispatch_request(), update commit message per Kashyap] Signed-off-by: Kashyap Desai <[email protected]> Signed-off-by: Hannes Reinecke <[email protected]> Signed-off-by: John Garry <[email protected]> Tested-by: Douglas Gilbert <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-03blk-mq: Record active_queues_shared_sbitmap per tag_set for when using ↵John Garry2-0/+2
shared sbitmap For when using a shared sbitmap, no longer should the number of active request queues per hctx be relied on for when judging how to share the tag bitmap. Instead maintain the number of active request queues per tag_set, and make the judgement based on that. Originally-from: Kashyap Desai <[email protected]> Signed-off-by: John Garry <[email protected]> Tested-by: Don Brace<[email protected]> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-03blk-mq: Record nr_active_requests per queue for when using shared sbitmapJohn Garry1-0/+2
The per-hctx nr_active value can no longer be used to fairly assign a share of tag depth per request queue for when using a shared sbitmap, as it does not consider that the tags are shared tags over all hctx's. For this case, record the nr_active_requests per request_queue, and make the judgement based on that value. Co-developed-with: Kashyap Desai <[email protected]> Signed-off-by: John Garry <[email protected]> Tested-by: Don Brace<[email protected]> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-03blk-mq: Facilitate a shared sbitmap per tagsetJohn Garry1-0/+6
Some SCSI HBAs (such as HPSA, megaraid, mpt3sas, hisi_sas_v3 ..) support multiple reply queues with single hostwide tags. In addition, these drivers want to use interrupt assignment in pci_alloc_irq_vectors(PCI_IRQ_AFFINITY). However, as discussed in [0], CPU hotplug may cause in-flight IO completion to not be serviced when an interrupt is shutdown. That problem is solved in commit bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline"). However, to take advantage of that blk-mq feature, the HBA HW queuess are required to be mapped to that of the blk-mq hctx's; to do that, the HBA HW queues need to be exposed to the upper layer. In making that transition, the per-SCSI command request tags are no longer unique per Scsi host - they are just unique per hctx. As such, the HBA LLDD would have to generate this tag internally, which has a certain performance overhead. However another problem is that blk-mq assumes the host may accept (Scsi_host.can_queue * #hw queue) commands. In commit 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq"), the Scsi host busy counter was removed, which would stop the LLDD being sent more than .can_queue commands; however, it should still be ensured that the block layer does not issue more than .can_queue commands to the Scsi host. To solve this problem, introduce a shared sbitmap per blk_mq_tag_set, which may be requested at init time. New flag BLK_MQ_F_TAG_HCTX_SHARED should be set when requesting the tagset to indicate whether the shared sbitmap should be used. Even when BLK_MQ_F_TAG_HCTX_SHARED is set, a full set of tags and requests are still allocated per hctx; the reason for this is that if tags and requests were only allocated for a single hctx - like hctx0 - it may break block drivers which expect a request be associated with a specific hctx, i.e. not always hctx0. This will introduce extra memory usage. This change is based on work originally from Ming Lei in [1] and from Bart's suggestion in [2]. [0] https://lore.kernel.org/linux-block/[email protected]/ [1] https://lore.kernel.org/linux-block/[email protected]/ [2] https://lore.kernel.org/linux-block/[email protected]/T/#m3db0a602f095cbcbff27e9c884d6b4ae826144be Signed-off-by: John Garry <[email protected]> Tested-by: Don Brace<[email protected]> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-03blk-mq: Rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHAREDMing Lei1-1/+1
BLK_MQ_F_TAG_SHARED actually means that tags is shared among request queues, all of which should belong to LUNs attached to same HBA. So rename it to make the point explicitly. [jpg: rebase a few times, add rnbd-clt.c change] Suggested-by: Bart Van Assche <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: John Garry <[email protected]> Tested-by: Douglas Gilbert <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-02block: remove revalidate_disk()Christoph Hellwig1-1/+0
Remove the now unused helper. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-02block: use revalidate_disk_size in set_capacity_revalidate_and_notifyChristoph Hellwig1-2/+2
Only virtio_blk and xen-blkfront set the revalidate argument to true, and both do not implement the ->revalidate_disk method. So switch to the helper that just updates the size instead. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-02block: add a new revalidate_disk_size helperChristoph Hellwig1-0/+1
revalidate_disk is a relative awkward helper for driver use, as it first calls an optional driver method and then updates the block device size, while most callers either don't need the method call at all, or want to keep state between the caller and the called method. Add a revalidate_disk_size helper that just performs the update of the block device size from the gendisk one, and switch all drivers that do not implement ->revalidate_disk to use the new helper instead of revalidate_disk() Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-02block: rename bd_invalidatedChristoph Hellwig1-1/+3
Replace bd_invalidate with a new BDEV_NEED_PART_SCAN flag in a bd_flags variable to better describe the condition. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: grant IOPRIO_CLASS_RT to CAP_SYS_NICEKhazhismel Kumykov1-0/+2
CAP_SYS_ADMIN is too broad, and ionice fits into CAP_SYS_NICE's grouping. Retain CAP_SYS_ADMIN permission for backwards compatibility. Signed-off-by: Khazhismel Kumykov <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01blk-iocost: restore inuse update tracepointsTejun Heo1-3/+3
Update and restore the inuse update tracepoints. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01blk-iocost: decouple vrate adjustment from surplus transfersTejun Heo1-9/+4
Budget donations are inaccurate and could take multiple periods to converge. To prevent triggering vrate adjustments while surplus transfers were catching up, vrate adjustment was suppressed if donations were increasing, which was indicated by non-zero nr_surpluses. This entangling won't be necessary with the scheduled rewrite of donation mechanism which will make it precise and immediate. Let's decouple the two in preparation. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01blk-iocost: calculate iocg->usages[] from iocg->local_stat.usage_usTejun Heo1-5/+2
Currently, iocg->usages[] which are used to guide inuse adjustments are calculated from vtime deltas. This, however, assumes that the hierarchical inuse weight at the time of calculation held for the entire period, which often isn't true and can lead to significant errors. Now that we have absolute usage information collected, we can derive iocg->usages[] from iocg->local_stat.usage_us so that inuse adjustment decisions are made based on actual absolute usage. The calculated usage is clamped between 1 and WEIGHT_ONE and WEIGHT_ONE is also used to signal saturation regardless of the current hierarchical inuse weight. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: remove an outdated comment on the bd_dev fieldChristoph Hellwig1-1/+1
kdev_t is long gone, so we don't need to comment a field isn't one.. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: remove the discard_alignment field from struct hd_structChristoph Hellwig2-3/+2
The alignment offset is only used in slow path callers, so just calculate it on the fly. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: remove the alignment_offset field from struct hd_structChristoph Hellwig2-4/+2
The alignment offset is only used in slow path callers, so just calculate it on the fly. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: Move blk_mq_bio_list_merge() into blk-merge.cBaolin Wang1-2/+0
Move the blk_mq_bio_list_merge() into blk-merge.c and rename it as a generic name. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: remove the BIO_USER_MAPPED flagChristoph Hellwig1-1/+0
Just check if there is private data, in which case the bio must have originated from bio_copy_user_iov. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: remove the BIO_NULL_MAPPED flagChristoph Hellwig1-1/+0
We can simply use a boolean flag in the bio_map_data data structure instead. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: fix locking for struct block_device size updatesChristoph Hellwig1-0/+1
Two different callers use two different mutexes for updating the block device size, which obviously doesn't help to actually protect against concurrent updates from the different callers. In addition one of the locks, bd_mutex is rather prone to deadlocks with other parts of the block stack that use it for high level synchronization. Switch to using a new spinlock protecting just the size updates, as that is all we need, and make sure everyone does the update through the proper helper. This fixes a bug reported with the nvme revalidating disks during a hot removal operation, which can currently deadlock on bd_mutex. Reported-by: Xianting Tian <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: replace bd_set_size with bd_set_nr_sectorsChristoph Hellwig1-1/+1
Replace bd_set_size with a version that takes the number of sectors instead, as that fits most of the current and future callers much better. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-01block: Make request_queue.rpm_status an enumGeert Uytterhoeven1-1/+2
request_queue.rpm_status is assigned values of the rpm_status enum only, so reflect that in its type. Note that including <linux/pm.h> is (currently) a no-op, as it is already included through <linux/genhd.h> and <linux/device.h>, but it is better to play it safe. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-08-30Merge tag 'irq-urgent-2020-08-30' of ↵Linus Torvalds1-0/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes for interrupt chip drivers: - Revert the platform driver conversion of interrupt chip drivers as it turned out to create more problems than it solves. - Fix a trivial typo in the new module helpers which made probing reliably fail. - Small fixes in the STM32 and MIPS Ingenic drivers - The TI firmware rework which had badly managed dependencies and had to wait post rc1" * tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/ingenic: Leave parent IRQ unmasked on suspend irqchip/stm32-exti: Avoid losing interrupts due to clearing pending bits by mistake irqchip: Revert modular support for drivers using IRQCHIP_PLATFORM_DRIVER helperse irqchip: Fix probing deferal when using IRQCHIP_PLATFORM_DRIVER helpers arm64: dts: k3-am65: Update the RM resource types arm64: dts: k3-am65: ti-sci-inta/intr: Update to latest bindings arm64: dts: k3-j721e: ti-sci-inta/intr: Update to latest bindings irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC irqchip/ti-sci-inta: Do not store TISCI device id in platform device id field dt-bindings: irqchip: Convert ti, sci-inta bindings to yaml dt-bindings: irqchip: ti, sci-inta: Update docs to support different parent. irqchip/ti-sci-intr: Add support for INTR being a parent to INTR dt-bindings: irqchip: Convert ti, sci-intr bindings to yaml dt-bindings: irqchip: ti, sci-intr: Update bindings to drop the usage of gic as parent firmware: ti_sci: Add support for getting resource with subtype firmware: ti_sci: Drop unused structure ti_sci_rm_type_map firmware: ti_sci: Drop the device id to resource type translation
2020-08-30Merge tag 'sched-urgent-2020-08-30' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "A single fix for the scheduler: - Make is_idle_task() __always_inline to prevent the compiler from putting it out of line into the wrong section because it's used inside noinstr sections" * tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Use __always_inline on is_idle_task()
2020-08-30Merge tag 'locking-urgent-2020-08-30' of ↵Linus Torvalds4-45/+64
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of fixes for lockdep, tracing and RCU: - Prevent recursion by using raw_cpu_* operations - Fixup the interrupt state in the cpu idle code to be consistent - Push rcu_idle_enter/exit() invocations deeper into the idle path so that the lock operations are inside the RCU watching sections - Move trace_cpu_idle() into generic code so it's called before RCU goes idle. - Handle raw_local_irq* vs. local_irq* operations correctly - Move the tracepoints out from under the lockdep recursion handling which turned out to be fragile and inconsistent" * tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep,trace: Expose tracepoints lockdep: Only trace IRQ edges mips: Implement arch_irqs_disabled() arm64: Implement arch_irqs_disabled() nds32: Implement arch_irqs_disabled() locking/lockdep: Cleanup x86/entry: Remove unused THUNKs cpuidle: Move trace_cpu_idle() into generic code cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic sched,idle,rcu: Push rcu_idle deeper into the idle path cpuidle: Fixup IRQ state lockdep: Use raw_cpu_*() for per-cpu variables
2020-08-30Merge tag 'powerpc-5.9-4' of ↵Linus Torvalds2-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Revert our removal of PROT_SAO, at least one user expressed an interest in using it on Power9. Instead don't allow it to be used in guests unless enabled explicitly at compile time. - A fix for a crash introduced by a recent change to FP handling. - Revert a change to our idle code that left Power10 with no idle support. - One minor fix for the new scv system call path to set PPR. - Fix a crash in our "generic" PMU if branch stack events were enabled. - A fix for the IMC PMU, to correctly identify host kernel samples. - The ADB_PMU powermac code was found to be incompatible with VMAP_STACK, so make them incompatible in Kconfig until the code can be fixed. - A build fix in drivers/video/fbdev/controlfb.c, and a documentation fix. Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy, Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin, Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan Srinivasan. * tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check" powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc powerpc/perf: Fix crashes with generic_compat_pmu & BHRB powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode powerpc/64s: scv entry should set PPR Documentation/powerpc: fix malformed table in syscall64-abi video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n selftests/powerpc: Update PROT_SAO test to skip ISA 3.1 powerpc/64s: Disallow PROT_SAO in LPARs by default Revert "powerpc/64s: Remove PROT_SAO support"
2020-08-29Merge tag 'for-linus-5.9-rc3-tag' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two fixes for Xen: one needed for ongoing work to support virtio with Xen, and one for a corner case in IRQ handling with Xen" * tag 'for-linus-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: arm/xen: Add misuse warning to virt_to_gfn xen/xenbus: Fix granting of vmalloc'd memory XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.
2020-08-28Merge tag 'pm-5.9-rc3' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the recently added Tegra194 cpufreq driver and the handling of devices using runtime PM during system-wide suspend, improve the intel_pstate driver documentation and clean up the cpufreq core. Specifics: - Make the recently added Tegra194 cpufreq driver use read_cpuid_mpir() instead of cpu_logical_map() to avoid exporting logical_cpu_map (Sumit Gupta). - Drop the automatic system wakeup event reporting for devices with pending runtime-resume requests during system-wide suspend to avoid spurious aborts of the suspend flow (Rafael Wysocki). - Fix build warning in the intel_pstate driver documentation and improve the wording in there (Randy Dunlap). - Clean up two pieces of code in the cpufreq core (Viresh Kumar)" * tag 'pm-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Use WARN_ON_ONCE() for invalid relation cpufreq: No need to verify cpufreq_driver in show_scaling_cur_freq() PM: sleep: core: Fix the handling of pending runtime resume requests Documentation: fix pm/intel_pstate build warning and wording cpufreq: replace cpu_logical_map() with read_cpuid_mpir()
2020-08-28Merge branch 'pm-cpufreq'Rafael J. Wysocki1-2/+2
* pm-cpufreq: cpufreq: Use WARN_ON_ONCE() for invalid relation cpufreq: No need to verify cpufreq_driver in show_scaling_cur_freq() Documentation: fix pm/intel_pstate build warning and wording cpufreq: replace cpu_logical_map() with read_cpuid_mpir()
2020-08-28kernel.h: Silence sparse warning in lower_32_bitsHerbert Xu1-1/+1
I keep getting sparse warnings in crypto such as: CHECK drivers/crypto/ccree/cc_hash.c drivers/crypto/ccree/cc_hash.c:49:9: warning: cast truncates bits from constant value (47b5481dbefa4fa4 becomes befa4fa4) drivers/crypto/ccree/cc_hash.c:49:26: warning: cast truncates bits from constant value (db0c2e0d64f98fa7 becomes 64f98fa7) [.. many more ..] This patch removes the warning by adding a mask to keep sparse happy. Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-08-28Merge tag 'writeback_for_v5.9-rc3' of ↵Linus Torvalds2-10/+11
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull writeback fixes from Jan Kara: "Fixes for writeback code occasionally skipping writeback of some inodes or livelocking sync(2)" * tag 'writeback_for_v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: Drop I_DIRTY_TIME_EXPIRE writeback: Fix sync livelock due to b_dirty_time processing writeback: Avoid skipping inode writeback writeback: Protect inode->i_io_list with inode->i_lock
2020-08-28Merge tag 'ceph-for-5.9-rc3' of git://github.com/ceph/ceph-clientLinus Torvalds1-4/+4
Pull ceph fixes from Ilya Dryomov: "We have an inode number handling change, prompted by s390x which is a 64-bit architecture with a 32-bit ino_t, a patch to disallow leases to avoid potential data integrity issues when CephFS is re-exported via NFS or CIFS and a fix for the bulk of W=1 compilation warnings" * tag 'ceph-for-5.9-rc3' of git://github.com/ceph/ceph-client: ceph: don't allow setlease on cephfs ceph: fix inode number handling on arches with 32-bit ino_t libceph: add __maybe_unused to DEFINE_CEPH_FEATURE
2020-08-28Merge tag 'drm-fixes-2020-08-28' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-2/+7
Pull drm fixes from Dave Airlie: "As expected a bit of an rc3 uptick, amdgpu and msm are the main ones, one msm patch was from the merge window, but had dependencies and we dropped it until the other tree had landed. Otherwise it's a couple of fixes for core, and etnaviv, and single i915, exynos, omap fixes. I'm still tracking the Sandybridge gpu relocations issue, if we don't see much movement I might just queue up the reverts. I'll talk to Daniel next week once he's back from holidays. core: - Take modeset bkl for legacy drivers dp_mst: - Allow null crtc in dp_mst i915: - Fix command parser desc matching with masks amdgpu: - Misc display fixes - Backlight fixes - MPO fix for DCN1 - Fixes for Sienna Cichlid - Fixes for Navy Flounder - Vega SW CTF fixes - SMU fix for Raven - Fix a possible overflow in INFO ioctl - Gfx10 clockgating fix msm: - opp/bw scaling patch followup - frequency restoring fux - vblank in atomic commit fix - dpu modesetting fixes - fencing fix etnaviv: - scheduler interaction fix - gpu init regression fix exynos: - Just drop __iommu annotation to fix sparse warning omap: - locking state fix" * tag 'drm-fixes-2020-08-28' of git://anongit.freedesktop.org/drm/drm: (41 commits) drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init drm/amdgpu: disable runtime pm for navy_flounder drm/amd/display: Retry AUX write when fail occurs drm/amdgpu: Fix buffer overflow in INFO ioctl drm/amd/powerplay: Fix hardmins not being sent to SMU for RV drm/amdgpu: use MODE1 reset for navy_flounder by default drm/amd/pm: correct the thermal alert temperature limit settings drm/amdgpu: add asd fw check before loading asd drm/amd/display: Keep current gain when ABM disable immediately drm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation drm/amd/display: Revert HDCP disable sequence change drm/amd/display: Send DISPLAY_OFF after power down on boot drm/amdgpu/gfx10: refine mgcg setting drm/amd/pm: correct Vega20 swctf limit setting drm/amd/pm: correct Vega12 swctf limit setting drm/amd/pm: correct Vega10 swctf limit setting drm/amd/pm: set VCN pg per instances drm/amd/pm: enable run_btc callback for sienna_cichlid drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_update_backlight_caps drm/amd/display: Reject overlay plane configurations in multi-display scenarios ...
2020-08-27cpufreq: Use WARN_ON_ONCE() for invalid relationViresh Kumar1-2/+2
The relation can't be invalid here, so if it turns out to be invalid, just WARN_ON_ONCE() and return 0. Signed-off-by: Viresh Kumar <[email protected]> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2020-08-27arm/xen: Add misuse warning to virt_to_gfnSimon Leiner1-1/+5
As virt_to_gfn uses virt_to_phys, it will return invalid addresses when used with vmalloc'd addresses. This patch introduces a warning, when virt_to_gfn is used in this way. Signed-off-by: Simon Leiner <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2020-08-26lockdep: Only trace IRQ edgesNicholas Piggin1-8/+7
Problem: raw_local_irq_save(); // software state on local_irq_save(); // software state off ... local_irq_restore(); // software state still off, because we don't enable IRQs raw_local_irq_restore(); // software state still off, *whoopsie* existing instances: - lock_acquire() raw_local_irq_save() __lock_acquire() arch_spin_lock(&graph_lock) pv_wait() := kvm_wait() (same or worse for Xen/HyperV) local_irq_save() - trace_clock_global() raw_local_irq_save() arch_spin_lock() pv_wait() := kvm_wait() local_irq_save() - apic_retrigger_irq() raw_local_irq_save() apic->send_IPI() := default_send_IPI_single_phys() local_irq_save() Possible solutions: A) make it work by enabling the tracing inside raw_*() B) make it work by keeping tracing disabled inside raw_*() C) call it broken and clean it up now Now, given that the only reason to use the raw_* variant is because you don't want tracing. Therefore A) seems like a weird option (although it can be done). C) is tempting, but OTOH it ends up converting a _lot_ of code to raw just because there is one raw user, this strips the validation/tracing off for all the other users. So we pick B) and declare any code that ends up doing: raw_local_irq_save() local_irq_save() lockdep_assert_irqs_disabled(); broken. AFAICT this problem has existed forever, the only reason it came up is because commit: 859d069ee1dd ("lockdep: Prepare for NMI IRQ state tracking") changed IRQ tracing vs lockdep recursion and the first instance is fairly common, the other cases hardly ever happen. Signed-off-by: Nicholas Piggin <[email protected]> [rewrote changelog] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Tested-by: Marco Elver <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-08-26locking/lockdep: CleanupPeter Zijlstra1-24/+30
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Tested-by: Marco Elver <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-08-26cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED genericPeter Zijlstra2-6/+12
This allows moving the leave_mm() call into generic code before rcu_idle_enter(). Gets rid of more trace_*_rcuidle() users. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Tested-by: Marco Elver <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-08-26lockdep: Use raw_cpu_*() for per-cpu variablesPeter Zijlstra2-8/+16
Sven reported that commit a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables") caused trouble on s390 because their this_cpu_*() primitives disable preemption which then lands back tracing. On the one hand, per-cpu ops should use preempt_*able_notrace() and raw_local_irq_*(), on the other hand, we can trivialy use raw_cpu_*() ops for this. Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables") Reported-by: Sven Schnelle <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Tested-by: Marco Elver <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-08-26sched: Use __always_inline on is_idle_task()Marco Elver1-1/+1
is_idle_task() may be used from noinstr functions such as irqentry_enter(). Since the compiler is free to not inline regular inline functions, switch to using __always_inline. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-08-25Merge tag 'v5.9-rc2' into drm-misc-fixesMaarten Lankhorst841-8950/+18287
Backmerge requested by Tomi for a fix to omap inconsistent locking state issue, and because we need at least v5.9-rc2 now. Signed-off-by: Maarten Lankhorst <[email protected]>
2020-08-24libceph: add __maybe_unused to DEFINE_CEPH_FEATUREIlya Dryomov1-4/+4
Avoid -Wunused-const-variable warnings for "make W=1". Reported-by: Leon Romanovsky <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]>
2020-08-24Revert "powerpc/64s: Remove PROT_SAO support"Shawn Anastasio2-0/+4
This reverts commit 5c9fa16e8abd342ce04dc830c1ebb2a03abf6c05. Since PROT_SAO can still be useful for certain classes of software, reintroduce it. Concerns about guest migration for LPARs using SAO will be addressed next. Signed-off-by: Shawn Anastasio <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva7-37/+40
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <[email protected]>
2020-08-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2-6/+7
Pull networking fixes from David Miller: "Nothing earth shattering here, lots of small fixes (f.e. missing RCU protection, bad ref counting, missing memset(), etc.) all over the place: 1) Use get_file_rcu() in task_file iterator, from Yonghong Song. 2) There are two ways to set remote source MAC addresses in macvlan driver, but only one of which validates things properly. Fix this. From Alvin Šipraga. 3) Missing of_node_put() in gianfar probing, from Sumera Priyadarsini. 4) Preserve device wanted feature bits across multiple netlink ethtool requests, from Maxim Mikityanskiy. 5) Fix rcu_sched stall in task and task_file bpf iterators, from Yonghong Song. 6) Avoid reset after device destroy in ena driver, from Shay Agroskin. 7) Missing memset() in netlink policy export reallocation path, from Johannes Berg. 8) Fix info leak in __smc_diag_dump(), from Peilin Ye. 9) Decapsulate ECN properly for ipv6 in ipv4 tunnels, from Mark Tomlinson. 10) Fix number of data stream negotiation in SCTP, from David Laight. 11) Fix double free in connection tracker action module, from Alaa Hleihel. 12) Don't allow empty NHA_GROUP attributes, from Nikolay Aleksandrov" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits) net: nexthop: don't allow empty NHA_GROUP bpf: Fix two typos in uapi/linux/bpf.h net: dsa: b53: check for timeout tipc: call rcu_read_lock() in tipc_aead_encrypt_done() net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments() error flow net: sctp: Fix negotiation of the number of data streams. dt-bindings: net: renesas, ether: Improve schema validation gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit() hv_netvsc: Remove "unlikely" from netvsc_select_queue bpf: selftests: global_funcs: Check err_str before strstr bpf: xdp: Fix XDP mode when no mode flags specified selftests/bpf: Remove test_align leftovers tools/resolve_btfids: Fix sections with wrong alignment net/smc: Prevent kernel-infoleak in __smc_diag_dump() sfc: fix build warnings on 32-bit net: phy: mscc: Fix a couple of spelling mistakes "spcified" -> "specified" libbpf: Fix map index used in error message net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe() net: atlantic: Use readx_poll_timeout() for large timeout ...
2020-08-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-5/+5
Alexei Starovoitov says: ==================== pull-request: bpf 2020-08-21 The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 5 day(s) which contain a total of 12 files changed, 78 insertions(+), 24 deletions(-). The main changes are: 1) three fixes in BPF task iterator logic, from Yonghong. 2) fix for compressed dwarf sections in vmlinux, from Jiri. 3) fix xdp attach regression, from Andrii. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-08-21Merge tag 'riscv-for-linus-5.9-rc2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - The CLINT driver has been split in two: one to handle the M-mode CLINT (memory mapped and used on NOMMU systems) and one to handle the S-mode CLINT (via SBI). - The addition of SiFive's drivers to rv32_defconfig * tag 'riscv-for-linus-5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Add SiFive drivers to rv32_defconfig dt-bindings: timer: Add CLINT bindings RISC-V: Remove CLINT related code from timer and arch clocksource/drivers: Add CLINT timer driver RISC-V: Add mechanism to provide custom IPI operations
2020-08-21bpf: Fix two typos in uapi/linux/bpf.hTobias Klauser1-5/+5
Also remove trailing whitespaces in bpf_skb_get_tunnel_key example code. Signed-off-by: Tobias Klauser <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]