aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski5-49/+12
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net coming late in the 5.17-rc process: 1) Revert port remap to mitigate shadowing service ports, this is causing problems in existing setups and this mitigation can be achieved with explicit ruleset, eg. ... tcp sport < 16386 tcp dport >= 32768 masquerade random This patches provided a built-in policy similar to the one described above. 2) Disable register tracking infrastructure in nf_tables. Florian reported two issues: - Existing expressions with no implemented .reduce interface that causes data-store on register should cancel the tracking. - Register clobbering might be possible storing data on registers that are larger than 32-bits. This might lead to generating incorrect ruleset bytecode. These two issues are scheduled to be addressed in the next release cycle. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: disable register tracking Revert "netfilter: conntrack: tag conntracks picked up in local out hook" Revert "netfilter: nat: force port remap to prevent shadowing well-known ports" ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-14net: phy: marvell: Fix invalid comparison in the resume and suspend functionsKurt Cancemi1-4/+4
This bug resulted in only the current mode being resumed and suspended when the PHY supported both fiber and copper modes and when the PHY only supported copper mode the fiber mode would incorrectly be attempted to be resumed and suspended. Fixes: 3758be3dc162 ("Marvell phy: add functions to suspend and resume both interfaces: fiber and copper links.") Signed-off-by: Kurt Cancemi <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-14block: fix rq-qos breakage from skipping rq_qos_done_bio()Tejun Heo4-13/+15
a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't tracked") made bio_endio() skip rq_qos_done_bio() if BIO_TRACKED is not set. While this fixed a potential oops, it also broke blk-iocost by skipping the done_bio callback for merged bios. Before, whether a bio goes through rq_qos_throttle() or rq_qos_merge(), rq_qos_done_bio() would be called on the bio on completion with BIO_TRACKED distinguishing the former from the latter. rq_qos_done_bio() is not called for bios which wenth through rq_qos_merge(). This royally confuses blk-iocost as the merged bios never finish and are considered perpetually in-flight. One reliably reproducible failure mode is an intermediate cgroup geting stuck active preventing its children from being activated due to the leaf-only rule, leading to loss of control. The following is from resctl-bench protection scenario which emulates isolating a web server like workload from a memory bomb run on an iocost configuration which should yield a reasonable level of protection. # cat /sys/block/nvme2n1/device/model Samsung SSD 970 PRO 512GB # cat /sys/fs/cgroup/io.cost.model 259:0 ctrl=user model=linear rbps=834913556 rseqiops=93622 rrandiops=102913 wbps=618985353 wseqiops=72325 wrandiops=71025 # cat /sys/fs/cgroup/io.cost.qos 259:0 enable=1 ctrl=user rpct=95.00 rlat=18776 wpct=95.00 wlat=8897 min=60.00 max=100.00 # resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1 ... Memory Hog Summary ================== IO Latency: R p50=242u:336u/2.5m p90=794u:1.4m/7.5m p99=2.7m:8.0m/62.5m max=8.0m:36.4m/350m W p50=221u:323u/1.5m p90=709u:1.2m/5.5m p99=1.5m:2.5m/9.5m max=6.9m:35.9m/350m Isolation and Request Latency Impact Distributions: min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev isol% 15.90 15.90 15.90 40.05 57.24 59.07 60.01 74.63 74.63 90.35 90.35 58.12 15.82 lat-imp% 0 0 0 0 0 4.55 14.68 15.54 233.5 548.1 548.1 53.88 143.6 Result: isol=58.12:15.82% lat_imp=53.88%:143.6 work_csv=100.0% missing=3.96% The isolation result of 58.12% is close to what this device would show without any IO control. Fix it by introducing a new flag BIO_QOS_MERGED to mark merged bios and calling rq_qos_done_bio() on them too. For consistency and clarity, rename BIO_TRACKED to BIO_QOS_THROTTLED. The flag checks are moved into rq_qos_done_bio() so that it's next to the code paths that set the flags. With the patch applied, the above same benchmark shows: # resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1 ... Memory Hog Summary ================== IO Latency: R p50=123u:84.4u/985u p90=322u:256u/2.5m p99=1.6m:1.4m/9.5m max=11.1m:36.0m/350m W p50=429u:274u/995u p90=1.7m:1.3m/4.5m p99=3.4m:2.7m/11.5m max=7.9m:5.9m/26.5m Isolation and Request Latency Impact Distributions: min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev isol% 84.91 84.91 89.51 90.73 92.31 94.49 96.36 98.04 98.71 100.0 100.0 94.42 2.81 lat-imp% 0 0 0 0 0 2.81 5.73 11.11 13.92 17.53 22.61 4.10 4.68 Result: isol=94.42:2.81% lat_imp=4.10%:4.68 work_csv=58.34% missing=0% Signed-off-by: Tejun Heo <[email protected]> Fixes: a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't tracked") Cc: [email protected] # v5.15+ Cc: Ming Lei <[email protected]> Cc: Yu Kuai <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-03-14block: release rq qos structures for queue without diskMing Lei1-0/+4
blkcg_init_queue() may add rq qos structures to request queue, previously blk_cleanup_queue() calls rq_qos_exit() to release them, but commit 8e141f9eb803 ("block: drain file system I/O on del_gendisk") moves rq_qos_exit() into del_gendisk(), so memory leak is caused because queues may not have disk, such as un-present scsi luns, nvme admin queue, ... Fixes the issue by adding rq_qos_exit() to blk_cleanup_queue() back. BTW, v5.18 won't need this patch any more since we move blkcg_init_queue()/blkcg_exit_queue() into disk allocation/release handler, and patches have been in for-5.18/block. Cc: Christoph Hellwig <[email protected]> Cc: [email protected] Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk") Reported-by: [email protected] Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-03-14Merge branch 'for-next/spectre-bhb' into for-next/coreWill Deacon28-78/+832
Merge in the latest Spectre mess to fix up conflicts with what was already queued for 5.18 when the embargo finally lifted. * for-next/spectre-bhb: (21 commits) arm64: Do not include __READ_ONCE() block in assembly files arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting arm64: Use the clearbhb instruction in mitigations KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated arm64: Mitigate spectre style branch history side channels arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2 arm64: Add percpu vectors for EL1 arm64: entry: Add macro for reading symbol addresses from the trampoline arm64: entry: Add vectors that have the bhb mitigation sequences arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations arm64: entry: Allow the trampoline text to occupy multiple pages arm64: entry: Make the kpti trampoline's kpti sequence optional arm64: entry: Move trampoline macros out of ifdef'd section arm64: entry: Don't assume tramp_vectors is the start of the vectors arm64: entry: Allow tramp_alias to access symbols after the 4K boundary arm64: entry: Move the trampoline data page before the text page arm64: entry: Free up another register on kpti's tramp_exit path arm64: entry: Make the trampoline cleanup optional KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit ...
2022-03-14Merge branch 'for-next/fpsimd' into for-next/coreWill Deacon7-81/+126
* for-next/fpsimd: arm64: cpufeature: Warn if we attempt to read a zero width field arm64: cpufeature: Add missing .field_width for GIC system registers arm64: signal: nofpsimd: Do not allocate fp/simd context when not available arm64: cpufeature: Always specify and use a field width for capabilities arm64: Always use individual bits in CPACR floating point enables arm64: Define CPACR_EL1_FPEN similarly to other floating point controls
2022-03-14Merge branch 'for-next/strings' into for-next/coreWill Deacon4-213/+269
* for-next/strings: Revert "arm64: Mitigate MTE issues with str{n}cmp()" arm64: lib: Import latest version of Arm Optimized Routines' strncmp arm64: lib: Import latest version of Arm Optimized Routines' strcmp
2022-03-14Merge branch 'for-next/rng' into for-next/coreWill Deacon1-6/+39
* for-next/rng: arm64: random: implement arch_get_random_int/_long based on RNDR
2022-03-14Merge branch 'for-next/perf' into for-next/coreWill Deacon23-198/+1800
* for-next/perf: (25 commits) perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driver drivers/perf: Add Apple icestorm/firestorm CPU PMU driver drivers/perf: arm_pmu: Handle 47 bit counters arm64: perf: Consistently make all event numbers as 16-bits arm64: perf: Expose some Armv9 common events under sysfs perf/marvell: cn10k DDR perf event core ownership perf/marvell: cn10k DDR perfmon event overflow handling perf/marvell: CN10k DDR performance monitor support dt-bindings: perf: marvell: cn10k ddr performance monitor perf/arm-cmn: Update watchpoint format perf/arm-cmn: Hide XP PUB events for CMN-600 perf: replace bitmap_weight with bitmap_empty where appropriate perf: Replace acpi_bus_get_device() perf/marvell_cn10k: Fix unused variable warning when W=1 and CONFIG_OF=n perf/arm-cmn: Make arm_cmn_debugfs static perf: MARVELL_CN10K_TAD_PMU should depend on ARCH_THUNDER perf/arm-ccn: Use platform_get_irq() to get the interrupt irqchip/apple-aic: Move PMU-specific registers to their own include file arm64: dts: apple: Add t8303 PMU nodes arm64: dts: apple: Add t8103 PMU interrupt affinities ...
2022-03-14Merge branch 'for-next/pauth' into for-next/coreWill Deacon11-13/+110
* for-next/pauth: arm64: Add support of PAuth QARMA3 architected algorithm arm64: cpufeature: Mark existing PAuth architected algorithm as QARMA5 arm64: cpufeature: Account min_field_value when cheking secondaries for PAuth
2022-03-14Merge branch 'for-next/mte' into for-next/coreWill Deacon15-39/+127
* for-next/mte: docs: sysfs-devices-system-cpu: document "asymm" value for mte_tcf_preferred arm64/mte: Remove asymmetric mode from the prctl() interface kasan: fix a missing header include of static_keys.h arm64/mte: Add userspace interface for enabling asymmetric mode arm64/mte: Add hwcap for asymmetric mode arm64/mte: Add a little bit of documentation for mte_update_sctlr_user() arm64/mte: Document ABI for asymmetric mode arm64: mte: avoid clearing PSTATE.TCO on entry unless necessary kasan: split kasan_*enabled() functions into a separate header
2022-03-14Merge branch 'for-next/mm' into for-next/coreWill Deacon8-38/+53
* for-next/mm: Documentation: vmcoreinfo: Fix htmldocs warning arm64/mm: Drop use_1G_block() arm64: avoid flushing icache multiple times on contiguous HugeTLB arm64: crash_core: Export MODULES, VMALLOC, and VMEMMAP ranges arm64/hugetlb: Define __hugetlb_valid_size() arm64/mm: avoid fixmap race condition when create pud mapping arm64/mm: Consolidate TCR_EL1 fields
2022-03-14Merge branch 'for-next/misc' into for-next/coreWill Deacon11-32/+80
* for-next/misc: arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition arm64: clean up tools Makefile arm64: drop unused includes of <linux/personality.h> arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones arm64: prevent instrumentation of bp hardening callbacks arm64: cpufeature: Remove cpu_has_fwb() check arm64: atomics: remove redundant static branch arm64: entry: Save some nops when CONFIG_ARM64_PSEUDO_NMI is not set
2022-03-14Merge branch 'for-next/linkage' into for-next/coreWill Deacon27-163/+172
* for-next/linkage: arm64: module: remove (NOLOAD) from linker script linkage: remove SYM_FUNC_{START,END}_ALIAS() x86: clean up symbol aliasing arm64: clean up symbol aliasing linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}()
2022-03-14Merge branch 'for-next/kselftest' into for-next/coreWill Deacon7-56/+190
* for-next/kselftest: kselftest/arm64: Log the PIDs of the parent and child in sve-ptrace kselftest/arm64: signal: Allow tests to be incompatible with features kselftest/arm64: mte: user_mem: test a wider range of values kselftest/arm64: mte: user_mem: add more test types kselftest/arm64: mte: user_mem: add test type enum kselftest/arm64: mte: user_mem: check different offsets and sizes kselftest/arm64: mte: user_mem: rework error handling kselftest/arm64: mte: user_mem: introduce tag_offset and tag_len kselftest/arm64: Remove local definitions of MTE prctls kselftest/arm64: Remove local ARRAY_SIZE() definitions
2022-03-14Merge branch 'for-next/insn' into for-next/coreWill Deacon5-36/+268
* for-next/insn: arm64: insn: add encoders for atomic operations arm64: move AARCH64_BREAK_FAULT into insn-def.h arm64: insn: Generate 64 bit mask immediates correctly
2022-03-14Merge branch 'for-next/errata' into for-next/coreWill Deacon5-8/+59
* for-next/errata: arm64: Add cavium_erratum_23154_cpus missing sentinel irqchip/gic-v3: Workaround Marvell erratum 38545 when reading IAR
2022-03-14Merge branch 'for-next/docs' into for-next/coreWill Deacon2-7/+8
* for-next/docs: arm64/mte: Clarify mode reported by PR_GET_TAGGED_ADDR_CTRL arm64: booting.rst: Clarify on requiring non-secure EL2
2022-03-14Merge branch 'for-next/coredump' into for-next/coreWill Deacon12-5/+173
* for-next/coredump: arm64: Change elfcore for_each_mte_vma() to use VMA iterator arm64: mte: Document the core dump file format arm64: mte: Dump the MTE tags in the core file arm64: mte: Define the number of bytes for storing the tags in a page elf: Introduce the ARM MTE ELF segment type elfcore: Replace CONFIG_{IA64, UML} checks with a new option
2022-03-14Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-1/+3
Pull virtio fix from Michael Tsirkin: "A last minute regression fix. I thought we did a lot of testing, but a regression still managed to sneak in. The fix seems trivial" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: allow batching hint without size
2022-03-14Merge tag 'v5.17-rc8' into irq/core, to fix conflictsIngo Molnar689-2988/+7397
Conflicts: drivers/pinctrl/pinctrl-starfive.c Signed-off-by: Ingo Molnar <[email protected]>
2022-03-14esp6: fix check on ipv6_skip_exthdr's return valueSabrina Dubroca1-2/+1
Commit 5f9c55c8066b ("ipv6: check return value of ipv6_skip_exthdr") introduced an incorrect check, which leads to all ESP packets over either TCPv6 or UDPv6 encapsulation being dropped. In this particular case, offset is negative, since skb->data points to the ESP header in the following chain of headers, while skb->network_header points to the IPv6 header: IPv6 | ext | ... | ext | UDP | ESP | ... That doesn't seem to be a problem, especially considering that if we reach esp6_input_done2, we're guaranteed to have a full set of headers available (otherwise the packet would have been dropped earlier in the stack). However, it means that the return value will (intentionally) be negative. We can make the test more specific, as the expected return value of ipv6_skip_exthdr will be the (negated) size of either a UDP header, or a TCP header with possible options. In the future, we should probably either make ipv6_skip_exthdr explicitly accept negative offsets (and adjust its return value for error cases), or make ipv6_skip_exthdr only take non-negative offsets (and audit all callers). Fixes: 5f9c55c8066b ("ipv6: check return value of ipv6_skip_exthdr") Reported-by: Xiumei Mu <[email protected]> Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2022-03-14net: dsa: microchip: add spi_device_id tablesClaudiu Beznea2-0/+23
Add spi_device_id tables to avoid logs like "SPI driver ksz9477-switch has no spi_device_id". Signed-off-by: Claudiu Beznea <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-03-14Merge tag 'irqchip-5.18' of ↵Thomas Gleixner46-444/+1618
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Add support for the STM32MP13 variant - Move parent device away from struct irq_chip - Remove all instances of non-const strings assigned to struct irq_chip::name, enabling a nice cleanup for VIC and GIC) - Simplify the Qualcomm PDC driver - A bunch of SiFive PLIC cleanups - Add support for a new variant of the Meson GPIO block - Add support for the irqchip side of the Apple M1 PMU - Add support for the Apple M1 Pro/Max AICv2 irqchip - Add support for the Qualcomm MPM wakeup gadget - Move the Xilinx driver over to the generic irqdomain handling - Tiny speedup for IPIs on GICv3 systems - The usual odd cleanups Link: https://lore.kernel.org/all/[email protected]
2022-03-14Merge tag 'timers-v5.18-rc1' of ↵Thomas Gleixner15-134/+205
https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clocksource/events updates from Daniel Lezcano: - Fix return error code check for the timer-of layer when getting the base address (Guillaume Ranquet) - Remove MMIO dependency, add notrace annotation for sched_clock and increase the timer resolution for the Microchip PIT64b (Claudiu Beznea) - Convert DT bindings to yaml for the Tegra timer (David Heidelberg) - Fix compilation error on architecture other than ARM for the i.MX TPM (Nathan Chancellor) - Add support for the event stream scaling for 1GHz counter on the arch ARM timer (Marc Zyngier) - Support a higher number of interrupts by the Exynos MCT timer driver (Alim Akhtar) - Detect and prevent memory corruption when the specified number of interrupts in the DTS is greater than the array size in the code for the Exynos MCT timer (Krzysztof Kozlowski) - Fix regression from a previous errata fix on the TI DM timer (Drew Fustini) - Several fixes and code improvements for the i.MX TPM driver (Peng Fan) Link: https://lore.kernel.org/all/[email protected]
2022-03-14Merge branch 'timers/core' of ↵Thomas Gleixner6-22/+78
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/core Pull tick/NOHZ updates from Frederic Weisbecker: - A fix for rare jiffies update stalls that were reported by Paul McKenney - Tick side cleanups after RCU_FAST_NO_HZ removal - Handle softirqs on idle more gracefully Link: https://lore.kernel.org/all/[email protected]
2022-03-14nvmet: use snprintf() with PAGE_SIZE in configfsChaitanya Kulkarni1-5/+8
Instead of using sprintf, use snprintf with buffer size limited to PAGE_SIZE just like what we have for the rest of the file. Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvmet: don't fold linesChaitanya Kulkarni1-9/+5
Don't fold line that can fit into 80 char limit. No functional change in this patch. Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removalChaitanya Kulkarni1-1/+1
This fixes following kernel-doc warning:- drivers/nvme/target/rdma.c:1722: warning: expecting prototype for nvme_rdma_device_removal(). Prototype was for nvmet_rdma_device_removal() instead Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetportChaitanya Kulkarni1-1/+1
This fixes following kernel-doc warning:- drivers/nvme/target/fc.c:1619: warning: expecting prototype for nvme_fc_unregister_targetport(). Prototype was for nvmet_fc_unregister_targetport() instead Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetportChaitanya Kulkarni1-1/+1
This fixes following kernel-doc warning :- drivers/nvme/target/fc.c:1365: warning: expecting prototype for nvme_fc_register_targetport(). Prototype was for nvmet_fc_register_targetport() instead Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvme-tcp: lockdep: annotate in-kernel socketsChris Leech1-0/+40
Put NVMe/TCP sockets in their own class to avoid some lockdep warnings. Sockets created by nvme-tcp are not exposed to user-space, and will not trigger certain code paths that the general socket API exposes. Lockdep complains about a circular dependency between the socket and filesystem locks, because setsockopt can trigger a page fault with a socket lock held, but nvme-tcp sends requests on the socket while file system locks are held. ====================================================== WARNING: possible circular locking dependency detected 5.15.0-rc3 #1 Not tainted ------------------------------------------------------ fio/1496 is trying to acquire lock: (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80 but task is already holding lock: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs] which lock already depends on the new lock. other info that might help us debug this: chain exists of: sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&xfs_dir_ilock_class/5); lock(sb_internal); lock(&xfs_dir_ilock_class/5); lock(sk_lock-AF_INET); *** DEADLOCK *** 6 locks held by fio/1496: #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20 #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20 #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs] #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs] #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0 #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp] This annotation lets lockdep analyze nvme-tcp controlled sockets independently of what the user-space sockets API does. Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/ Signed-off-by: Chris Leech <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvme-tcp: don't fold the lineChaitanya Kulkarni1-2/+1
The call to nvme_tcp_alloc_queue() fits perfectly in one line without exceeding 80 char limit for the line. Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvme-tcp: don't initialize ret variableChaitanya Kulkarni1-1/+1
No point in initializing ret variable to 0 in nvme_tcp_start_io_queue() since it gets overwritten by a call to nvme_tcp_start_queue(). Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvme-multipath: call bio_io_error in nvme_ns_head_submit_bioGuoqing Jiang1-2/+1
Use bio_io_error() here since bio_io_error does the same thing. Signed-off-by: Guoqing Jiang <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14nvme-multipath: use vmalloc for ANA log bufferHannes Reinecke1-2/+3
The ANA log buffer can get really large, as it depends on the controller configuration. So to avoid an out-of-memory issue during scanning use kvmalloc() instead of the kmalloc(). Signed-off-by: Hannes Reinecke <[email protected]> Tested-by: Daniel Wagner <[email protected]> Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-03-14crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TESTHerbert Xu1-2/+2
This patch turns the new SHA driver into a tristate and also allows compile testing. Signed-off-by: Herbert Xu <[email protected]>
2022-03-14MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers listLongfang Liu1-3/+3
Zaibo moved projects and is not looking into crypto stuff. I am responsible for checking the patches of these modules. so the maintainers list needs to be updated. I take care of HPRE, Qian Weili take care of TRNG, Ye Kai and me take care of SEC2. Signed-off-by: Longfang Liu <[email protected]> Signed-off-by: Kai Ye <[email protected]> Signed-off-by: Weili Qian <[email protected]> Signed-off-by: Zaibo Xu <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2022-03-14crypto: dh - Remove the unused function dh_safe_prime_dh_alg()Jiapeng Chong1-6/+0
Fix the following W=1 kernel warnings: crypto/dh.c:311:31: warning: unused function 'dh_safe_prime_dh_alg' [-Wunused-function] Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2022-03-14hwrng: nomadik - Change clk_disable to clk_disable_unprepareMiaoqian Lin1-2/+2
The corresponding API for clk_prepare_enable is clk_disable_unprepare, other than clk_disable_unprepare. Fix this by changing clk_disable to clk_disable_unprepare. Fixes: beca35d05cc2 ("hwrng: nomadik - use clk_prepare_enable()") Signed-off-by: Miaoqian Lin <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2022-03-14crypto: qcom-rng - ensure buffer for generate is completely filledBrian Masney1-7/+10
The generate function in struct rng_alg expects that the destination buffer is completely filled if the function returns 0. qcom_rng_read() can run into a situation where the buffer is partially filled with randomness and the remaining part of the buffer is zeroed since qcom_rng_generate() doesn't check the return value. This issue can be reproduced by running the following from libkcapi: kcapi-rng -b 9000000 > OUTFILE The generated OUTFILE will have three huge sections that contain all zeros, and this is caused by the code where the test 'val & PRNG_STATUS_DATA_AVAIL' fails. Let's fix this issue by ensuring that qcom_rng_read() always returns with a full buffer if the function returns success. Let's also have qcom_rng_generate() return the correct value. Here's some statistics from the ent project (https://www.fourmilab.ch/random/) that shows information about the quality of the generated numbers: $ ent -c qcom-random-before Value Char Occurrences Fraction 0 606748 0.067416 1 33104 0.003678 2 33001 0.003667 ... 253 � 32883 0.003654 254 � 33035 0.003671 255 � 33239 0.003693 Total: 9000000 1.000000 Entropy = 7.811590 bits per byte. Optimum compression would reduce the size of this 9000000 byte file by 2 percent. Chi square distribution for 9000000 samples is 9329962.81, and randomly would exceed this value less than 0.01 percent of the times. Arithmetic mean value of data bytes is 119.3731 (127.5 = random). Monte Carlo value for Pi is 3.197293333 (error 1.77 percent). Serial correlation coefficient is 0.159130 (totally uncorrelated = 0.0). Without this patch, the results of the chi-square test is 0.01%, and the numbers are certainly not random according to ent's project page. The results improve with this patch: $ ent -c qcom-random-after Value Char Occurrences Fraction 0 35432 0.003937 1 35127 0.003903 2 35424 0.003936 ... 253 � 35201 0.003911 254 � 34835 0.003871 255 � 35368 0.003930 Total: 9000000 1.000000 Entropy = 7.999979 bits per byte. Optimum compression would reduce the size of this 9000000 byte file by 0 percent. Chi square distribution for 9000000 samples is 258.77, and randomly would exceed this value 42.24 percent of the times. Arithmetic mean value of data bytes is 127.5006 (127.5 = random). Monte Carlo value for Pi is 3.141277333 (error 0.01 percent). Serial correlation coefficient is 0.000468 (totally uncorrelated = 0.0). This change was tested on a Nexus 5 phone (msm8974 SoC). Signed-off-by: Brian Masney <[email protected]> Fixes: ceec5f5b5988 ("crypto: qcom-rng - Add Qcom prng driver") Cc: [email protected] # 4.19+ Reviewed-by: Bjorn Andersson <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2022-03-13Linux 5.17-rc8Linus Torvalds1-1/+1
2022-03-13drm/mgag200: Fix PLL setup for g200wb and g200ewJocelyn Falempe1-3/+3
commit f86c3ed55920 ("drm/mgag200: Split PLL setup into compute and update functions") introduced a regression for g200wb and g200ew. The PLLs are not set up properly, and VGA screen stays black, or displays "out of range" message. MGA1064_WB_PIX_PLLC_N/M/P was mistakenly replaced with MGA1064_PIX_PLLC_N/M/P which have different addresses. Patch tested on a Dell T310 with g200wb Fixes: f86c3ed55920 ("drm/mgag200: Split PLL setup into compute and update functions") Cc: [email protected] Signed-off-by: Jocelyn Falempe <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-03-13Merge tag 'x86_urgent_for_v5.17_rc8' of ↵Linus Torvalds8-63/+254
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Free shmem backing storage for SGX enclave pages when those are swapped back into EPC memory - Prevent do_int3() from being kprobed, to avoid recursion - Remap setup_data and setup_indirect structures properly when accessing their members - Correct the alternatives patching order for modules too * tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sgx: Free backing memory after faulting the enclave page x86/traps: Mark do_int3() NOKPROBE_SYMBOL x86/boot: Add setup_indirect support in early_memremap_is_setup_data() x86/boot: Fix memremap of setup_indirect structures x86/module: Fix the paravirt vs alternative order
2022-03-12random: check for signal and try earlier when generating entropyJason A. Donenfeld1-2/+3
Rather than waiting a full second in an interruptable waiter before trying to generate entropy, try to generate entropy first and wait second. While waiting one second might give an extra second for getting entropy from elsewhere, we're already pretty late in the init process here, and whatever else is generating entropy will still continue to contribute. This has implications on signal handling: we call try_to_generate_entropy() from wait_for_random_bytes(), and wait_for_random_bytes() always uses wait_event_interruptible_timeout() when waiting, since it's called by userspace code in restartable contexts, where signals can pend. Since try_to_generate_entropy() now runs first, if a signal is pending, it's necessary for try_to_generate_entropy() to check for signals, since it won't hit the wait until after try_to_generate_entropy() has returned. And even before this change, when entering a busy loop in try_to_generate_entropy(), we should have been checking to see if any signals are pending, so that a process doesn't get stuck in that loop longer than expected. Cc: Theodore Ts'o <[email protected]> Reviewed-by: Dominik Brodowski <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-03-12random: reseed more often immediately after bootingJason A. Donenfeld1-3/+25
In order to chip away at the "premature first" problem, we augment our existing entropy accounting with more frequent reseedings at boot. The idea is that at boot, we're getting entropy from various places, and we're not very sure which of early boot entropy is good and which isn't. Even when we're crediting the entropy, we're still not totally certain that it's any good. Since boot is the one time (aside from a compromise) that we have zero entropy, it's important that we shepherd entropy into the crng fairly often. At the same time, we don't want a "premature next" problem, whereby an attacker can brute force individual bits of added entropy. In lieu of going full-on Fortuna (for now), we can pick a simpler strategy of just reseeding more often during the first 5 minutes after boot. This is still bounded by the 256-bit entropy credit requirement, so we'll skip a reseeding if we haven't reached that, but in case entropy /is/ coming in, this ensures that it makes its way into the crng rather rapidly during these early stages. Ordinarily we reseed if the previous reseeding is 300 seconds old. This commit changes things so that for the first 600 seconds of boot time, we reseed if the previous reseeding is uptime / 2 seconds old. That means that we'll reseed at the very least double the uptime of the previous reseeding. Cc: Theodore Ts'o <[email protected]> Reviewed-by: Eric Biggers <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-03-12random: make consistent usage of crng_ready()Jason A. Donenfeld1-12/+7
Rather than sometimes checking `crng_init < 2`, we should always use the crng_ready() macro, so that should we change anything later, it's consistent. Additionally, that macro already has a likely() around it, which means we don't need to open code our own likely() and unlikely() annotations. Cc: Theodore Ts'o <[email protected]> Reviewed-by: Dominik Brodowski <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-03-12random: use SipHash as interrupt entropy accumulatorJason A. Donenfeld1-39/+55
The current fast_mix() function is a piece of classic mailing list crypto, where it just sort of sprung up by an anonymous author without a lot of real analysis of what precisely it was accomplishing. As an ARX permutation alone, there are some easily searchable differential trails in it, and as a means of preventing malicious interrupts, it completely fails, since it xors new data into the entire state every time. It can't really be analyzed as a random permutation, because it clearly isn't, and it can't be analyzed as an interesting linear algebraic structure either, because it's also not that. There really is very little one can say about it in terms of entropy accumulation. It might diffuse bits, some of the time, maybe, we hope, I guess. But for the most part, it fails to accomplish anything concrete. As a reminder, the simple goal of add_interrupt_randomness() is to simply accumulate entropy until ~64 interrupts have elapsed, and then dump it into the main input pool, which uses a cryptographic hash. It would be nice to have something cryptographically strong in the interrupt handler itself, in case a malicious interrupt compromises a per-cpu fast pool within the 64 interrupts / 1 second window, and then inside of that same window somehow can control its return address and cycle counter, even if that's a bit far fetched. However, with a very CPU-limited budget, actually doing that remains an active research project (and perhaps there'll be something useful for Linux to come out of it). And while the abundance of caution would be nice, this isn't *currently* the security model, and we don't yet have a fast enough solution to make it our security model. Plus there's not exactly a pressing need to do that. (And for the avoidance of doubt, the actual cluster of 64 accumulated interrupts still gets dumped into our cryptographically secure input pool.) So, for now we are going to stick with the existing interrupt security model, which assumes that each cluster of 64 interrupt data samples is mostly non-malicious and not colluding with an infoleaker. With this as our goal, we have a few more choices, simply aiming to accumulate entropy, while discarding the least amount of it. We know from <https://eprint.iacr.org/2019/198> that random oracles, instantiated as computational hash functions, make good entropy accumulators and extractors, which is the justification for using BLAKE2s in the main input pool. As mentioned, we don't have that luxury here, but we also don't have the same security model requirements, because we're assuming that there aren't malicious inputs. A pseudorandom function instance can approximately behave like a random oracle, provided that the key is uniformly random. But since we're not concerned with malicious inputs, we can pick a fixed key, which is not secret, knowing that "nature" won't interact with a sufficiently chosen fixed key by accident. So we pick a PRF with a fixed initial key, and accumulate into it continuously, dumping the result every 64 interrupts into our cryptographically secure input pool. For this, we make use of SipHash-1-x on 64-bit and HalfSipHash-1-x on 32-bit, which are already in use in the kernel's hsiphash family of functions and achieve the same performance as the function they replace. It would be nice to do two rounds, but we don't exactly have the CPU budget handy for that, and one round alone is already sufficient. As mentioned, we start with a fixed initial key (zeros is fine), and allow SipHash's symmetry breaking constants to turn that into a useful starting point. Also, since we're dumping the result (or half of it on 64-bit so as to tax our hash function the same amount on all platforms) into the cryptographically secure input pool, there's no point in finalizing SipHash's output, since it'll wind up being finalized by something much stronger. This means that all we need to do is use the ordinary round function word-by-word, as normal SipHash does. Simplified, the flow is as follows: Initialize: siphash_state_t state; siphash_init(&state, key={0, 0, 0, 0}); Update (accumulate) on interrupt: siphash_update(&state, interrupt_data_and_timing); Dump into input pool after 64 interrupts: blake2s_update(&input_pool, &state, sizeof(state) / 2); The result of all of this is that the security model is unchanged from before -- we assume non-malicious inputs -- yet we now implement that model with a stronger argument. I would like to emphasize, again, that the purpose of this commit is to improve the existing design, by making it analyzable, without changing any fundamental assumptions. There may well be value down the road in changing up the existing design, using something cryptographically strong, or simply using a ring buffer of samples rather than having a fast_mix() at all, or changing which and how much data we collect each interrupt so that we can use something linear, or a variety of other ideas. This commit does not invalidate the potential for those in the future. For example, in the future, if we're able to characterize the data we're collecting on each interrupt, we may be able to inch toward information theoretic accumulators. <https://eprint.iacr.org/2021/523> shows that `s = ror32(s, 7) ^ x` and `s = ror64(s, 19) ^ x` make very good accumulators for 2-monotone distributions, which would apply to timestamp counters, like random_get_entropy() or jiffies, but would not apply to our current combination of the two values, or to the various function addresses and register values we mix in. Alternatively, <https://eprint.iacr.org/2021/1002> shows that max-period linear functions with no non-trivial invariant subspace make good extractors, used in the form `s = f(s) ^ x`. However, this only works if the input data is both identical and independent, and obviously a collection of address values and counters fails; so it goes with theoretical papers. Future directions here may involve trying to characterize more precisely what we actually need to collect in the interrupt handler, and building something specific around that. However, as mentioned, the morass of data we're gathering at the interrupt handler presently defies characterization, and so we use SipHash for now, which works well and performs well. Cc: Theodore Ts'o <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Reviewed-by: Jean-Philippe Aumasson <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-03-12wireguard: device: clear keys on VM forkJason A. Donenfeld1-11/+27
When a virtual machine forks, it's important that WireGuard clear existing sessions so that different plaintexts are not transmitted using the same key+nonce, which can result in catastrophic cryptographic failure. To accomplish this, we simply hook into the newly added vmfork notifier. As a bonus, it turns out that, like the vmfork registration function, the PM registration function is stubbed out when CONFIG_PM_SLEEP is not set, so we can actually just remove the maze of ifdefs, which makes it really quite clean to support both notifiers at once. Cc: Dominik Brodowski <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Theodore Ts'o <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-03-12random: provide notifier for VM forkJason A. Donenfeld2-0/+20
Drivers such as WireGuard need to learn when VMs fork in order to clear sessions. This commit provides a simple notifier_block for that, with a register and unregister function. When no VM fork detection is compiled in, this turns into a no-op, similar to how the power notifier works. Cc: Dominik Brodowski <[email protected]> Cc: Theodore Ts'o <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>