aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-10-24fpga: Fix memory leak for fpga_region_test_class_find()Jinjie Ruan1-0/+2
fpga_region_class_find() in fpga_region_test_class_find() will call get_device() if the data is matched, which will increment refcount for dev->kobj, so it should call put_device() to decrement refcount for dev->kobj to free the region, because fpga_region_unregister() will call fpga_region_dev_release() only when the refcount for dev->kobj is zero but fpga_region_test_init() call device_register() in fpga_region_register_full(), which also increment refcount. So call put_device() after calling fpga_region_class_find() in fpga_region_test_class_find(). After applying this patch, the following memory leak is never detected. unreferenced object 0xffff88810c8ef000 (size 1024): comm "kunit_try_catch", pid 1875, jiffies 4294715298 (age 836.836s) hex dump (first 32 bytes): b8 d1 fb 05 81 88 ff ff 08 f0 8e 0c 81 88 ff ff ................ 08 f0 8e 0c 81 88 ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff817ebad7>] kmalloc_trace+0x27/0xa0 [<ffffffffa02385e1>] fpga_region_register_full+0x51/0x430 [fpga_region] [<ffffffffa0228e47>] 0xffffffffa0228e47 [<ffffffff829c479d>] kunit_try_run_case+0xdd/0x250 [<ffffffff829c9f2a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81238b85>] kthread+0x2b5/0x380 [<ffffffff81097ded>] ret_from_fork+0x2d/0x70 [<ffffffff810034d1>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff888105fbd1b8 (size 8): comm "kunit_try_catch", pid 1875, jiffies 4294715298 (age 836.836s) hex dump (first 8 bytes): 72 65 67 69 6f 6e 30 00 region0. backtrace: [<ffffffff817ec023>] __kmalloc_node_track_caller+0x53/0x150 [<ffffffff82995590>] kvasprintf+0xb0/0x130 [<ffffffff83f713b1>] kobject_set_name_vargs+0x41/0x110 [<ffffffff8304ac1b>] dev_set_name+0xab/0xe0 [<ffffffffa02388a2>] fpga_region_register_full+0x312/0x430 [fpga_region] [<ffffffffa0228e47>] 0xffffffffa0228e47 [<ffffffff829c479d>] kunit_try_run_case+0xdd/0x250 [<ffffffff829c9f2a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81238b85>] kthread+0x2b5/0x380 [<ffffffff81097ded>] ret_from_fork+0x2d/0x70 [<ffffffff810034d1>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff88810b3b8a00 (size 256): comm "kunit_try_catch", pid 1875, jiffies 4294715298 (age 836.836s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 08 8a 3b 0b 81 88 ff ff ..........;..... 08 8a 3b 0b 81 88 ff ff e0 ac 04 83 ff ff ff ff ..;............. backtrace: [<ffffffff817ebad7>] kmalloc_trace+0x27/0xa0 [<ffffffff83056d7a>] device_add+0xa2a/0x15e0 [<ffffffffa02388b1>] fpga_region_register_full+0x321/0x430 [fpga_region] [<ffffffffa0228e47>] 0xffffffffa0228e47 [<ffffffff829c479d>] kunit_try_run_case+0xdd/0x250 [<ffffffff829c9f2a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81238b85>] kthread+0x2b5/0x380 [<ffffffff81097ded>] ret_from_fork+0x2d/0x70 [<ffffffff810034d1>] ret_from_fork_asm+0x11/0x20 Fixes: 64a5f972c93d ("fpga: add an initial KUnit suite for the FPGA Region") Signed-off-by: Jinjie Ruan <[email protected]> Reviewed-by: Marco Pagani <[email protected]> Acked-by: Xu Yilun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [[email protected]: slightly changes the commit message] Signed-off-by: Xu Yilun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-24fpga: m10bmc-sec: Change contact for secure update driverRuss Weight2-8/+8
Change the maintainer for the Intel MAX10 BMC Secure Update driver from Russ Weight to Peter Colberg. Update the ABI documentation contact information as well. Signed-off-by: Russ Weight <[email protected]> Acked-by: Peter Colberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xu Yilun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-24x86/defconfig: Enable CONFIG_DEBUG_ENTRY=yIngo Molnar2-0/+2
A bug was recently found via CONFIG_DEBUG_ENTRY=y, and the x86 tree kinda is the main source of changes to the x86 entry code, so enable this debug option by default in our defconfigs. Signed-off-by: Ingo Molnar <[email protected]> Cc: [email protected]
2023-10-24drm/i915/perf: Determine context valid in OA reportsUmesh Nerlige Ramappa1-2/+2
When supporting OA for TGL, it was seen that the context valid bit in the report ID was not defined, however revisiting the spec seems to have this bit defined. The bit is used to determine if a context is valid on a context switch and is essential to determine active and idle periods for a context. Re-enable the context valid bit for gen12 platforms. BSpec: 52196 (description of report_id) v2: Include BSpec reference (Ashutosh) Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL") Signed-off-by: Umesh Nerlige Ramappa <[email protected]> Reviewed-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 7eeaedf79989a8f131939782832e21e9218ed2a0) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-10-24perf/core: Fix potential NULL derefPeter Zijlstra1-1/+2
Smatch is awesome. Fixes: 32671e3799ca ("perf: Disallow mis-matched inherited group reads") Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2023-10-24Merge branch 'gtp-tunnel-driver-fixes'Paolo Abeni2-3/+4
Pablo Neira Ayuso says: ==================== GTP tunnel driver fixes The following patchset contains two fixes for the GTP tunnel driver: 1) Incorrect GTPA_MAX definition in UAPI headers. This is updating an existing UAPI definition but for a good reason, this is certainly broken. Similar fixes for incorrect _MAX definition in netlink headers were applied in the past too. 2) Fix GTP driver PMTU with GRO packets, add missing call to skb_gso_validate_network_len() to handle GRO packets. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-10-24gtp: fix fragmentation needed check with gsoPablo Neira Ayuso1-2/+3
Call skb_gso_validate_network_len() to check if packet is over PMTU. Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-10-24gtp: uapi: fix GTPA_MAXPablo Neira Ayuso1-1/+1
Subtract one to __GTPA_MAX, otherwise GTPA_MAX is off by 2. Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-10-24autofs: fix add autofs_parse_fd()Ian Kent2-4/+11
We are seeing systemd hang on its autofs direct mount at /proc/sys/fs/binfmt_misc. Historically this was due to a mismatch in the communication structure size between a 64 bit kernel and a 32 bit user space and was fixed by making the pipe communication record oriented. During autofs v5 development I decided to stay with the existing usage instead of changing to a packed structure for autofs <=> user space communications which turned out to be a mistake on my part. Problems arose and they were fixed by allowing for the 64 bit to 32 bit size difference in the automount(8) code. Along the way systemd started to use autofs and eventually encountered this problem too. systemd refused to compensate for the length difference insisting it be fixed in the kernel. Fortunately Linus implemented the packetized pipe which resolved the problem in a straight forward and simple way. In the autofs mount api conversion series I inadvertatly dropped the packet pipe flag settings when adding the autofs_parse_fd() function. This patch fixes that omission. Fixes: 546694b8f658 ("autofs: add autofs_parse_fd()") Signed-off-by: Ian Kent <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: Anders Roxell <[email protected]> Cc: Bill O'Donnell <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Anders Roxell <[email protected]> Cc: Naresh Kamboju <[email protected]> Cc: Stephen Rothwell <[email protected]> Reported-by: Linux Kernel Functional Testing <[email protected]> Reported-by: Anders Roxell <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2023-10-24Fix NULL pointer dereference in cn_filter()Anjali Kulkarni1-1/+1
Check that sk_user_data is not NULL, else return from cn_filter(). Could not reproduce this issue, but Oliver Sang verified it has fixed the "Closes" problem below. Fixes: 2aa1f7a1f47c ("connector/cn_proc: Add filtering to fix some bugs") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected]/ Signed-off-by: Anjali Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-10-24vfs: Convert BUG_ON to WARN_ON_ONCE in open_last_lookupsBernd Schubert1-1/+2
The calling code actually handles -ECHILD, so this BUG_ON can be converted to WARN_ON_ONCE. Signed-off-by: Bernd Schubert <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christian Brauner <[email protected]> Cc: Al Viro <[email protected]> Cc: Amir Goldstein <[email protected]> Cc: Dharmendra Singh <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-24sched/fair: Remove SIS_PROPPeter Zijlstra5-59/+0
SIS_UTIL seems to work well, lets remove the old thing. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2023-10-24sched/fair: Use candidate prev/recent_used CPU if scanning failed for ↵Yicong Yang1-1/+16
cluster wakeup Chen Yu reports a hackbench regression of cluster wakeup when hackbench threads equal to the CPU number [1]. Analysis shows it's because we wake up more on the target CPU even if the prev_cpu is a good wakeup candidate and leads to the decrease of the CPU utilization. Generally if the task's prev_cpu is idle we'll wake up the task on it without scanning. On cluster machines we'll try to wake up the task in the same cluster of the target for better cache affinity, so if the prev_cpu is idle but not sharing the same cluster with the target we'll still try to find an idle CPU within the cluster. This will improve the performance at low loads on cluster machines. But in the issue above, if the prev_cpu is idle but not in the cluster with the target CPU, we'll try to scan an idle one in the cluster. But since the system is busy, we're likely to fail the scanning and use target instead, even if the prev_cpu is idle. Then leads to the regression. This patch solves this in 2 steps: o record the prev_cpu/recent_used_cpu if they're good wakeup candidates but not sharing the cluster with the target. o on scanning failure use the prev_cpu/recent_used_cpu if they're recorded as idle [1] https://lore.kernel.org/all/ZGzDLuVaHR1PAYDt@chenyu5-mobl1/ Closes: https://lore.kernel.org/all/ZGsLy83wPIpamy6x@chenyu5-mobl1/ Reported-by: Chen Yu <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Tested-and-reviewed-by: Chen Yu <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2023-10-24sched/fair: Scan cluster before scanning LLC in wake-up pathBarry Song3-4/+49
For platforms having clusters like Kunpeng920, CPUs within the same cluster have lower latency when synchronizing and accessing shared resources like cache. Thus, this patch tries to find an idle cpu within the cluster of the target CPU before scanning the whole LLC to gain lower latency. This will be implemented in 2 steps in select_idle_sibling(): 1. When the prev_cpu/recent_used_cpu are good wakeup candidates, use them if they're sharing cluster with the target CPU. Otherwise trying to scan for an idle CPU in the target's cluster. 2. Scanning the cluster prior to the LLC of the target CPU for an idle CPU to wakeup. Testing has been done on Kunpeng920 by pinning tasks to one numa and two numa. On Kunpeng920, Each numa has 8 clusters and each cluster has 4 CPUs. With this patch, We noticed enhancement on tbench and netperf within one numa or cross two numa on top of tip-sched-core commit 9b46f1abc6d4 ("sched/debug: Print 'tgid' in sched_show_task()") tbench results (node 0): baseline patched 1: 327.2833 372.4623 ( 13.80%) 4: 1320.5933 1479.8833 ( 12.06%) 8: 2638.4867 2921.5267 ( 10.73%) 16: 5282.7133 5891.5633 ( 11.53%) 32: 9810.6733 9877.3400 ( 0.68%) 64: 7408.9367 7447.9900 ( 0.53%) 128: 6203.2600 6191.6500 ( -0.19%) tbench results (node 0-1): baseline patched 1: 332.0433 372.7223 ( 12.25%) 4: 1325.4667 1477.6733 ( 11.48%) 8: 2622.9433 2897.9967 ( 10.49%) 16: 5218.6100 5878.2967 ( 12.64%) 32: 10211.7000 11494.4000 ( 12.56%) 64: 13313.7333 16740.0333 ( 25.74%) 128: 13959.1000 14533.9000 ( 4.12%) netperf results TCP_RR (node 0): baseline patched 1: 76546.5033 90649.9867 ( 18.42%) 4: 77292.4450 90932.7175 ( 17.65%) 8: 77367.7254 90882.3467 ( 17.47%) 16: 78519.9048 90938.8344 ( 15.82%) 32: 72169.5035 72851.6730 ( 0.95%) 64: 25911.2457 25882.2315 ( -0.11%) 128: 10752.6572 10768.6038 ( 0.15%) netperf results TCP_RR (node 0-1): baseline patched 1: 76857.6667 90892.2767 ( 18.26%) 4: 78236.6475 90767.3017 ( 16.02%) 8: 77929.6096 90684.1633 ( 16.37%) 16: 77438.5873 90502.5787 ( 16.87%) 32: 74205.6635 88301.5612 ( 19.00%) 64: 69827.8535 71787.6706 ( 2.81%) 128: 25281.4366 25771.3023 ( 1.94%) netperf results UDP_RR (node 0): baseline patched 1: 96869.8400 110800.8467 ( 14.38%) 4: 97744.9750 109680.5425 ( 12.21%) 8: 98783.9863 110409.9637 ( 11.77%) 16: 99575.0235 110636.2435 ( 11.11%) 32: 95044.7250 97622.8887 ( 2.71%) 64: 32925.2146 32644.4991 ( -0.85%) 128: 12859.2343 12824.0051 ( -0.27%) netperf results UDP_RR (node 0-1): baseline patched 1: 97202.4733 110190.1200 ( 13.36%) 4: 95954.0558 106245.7258 ( 10.73%) 8: 96277.1958 105206.5304 ( 9.27%) 16: 97692.7810 107927.2125 ( 10.48%) 32: 79999.6702 103550.2999 ( 29.44%) 64: 80592.7413 87284.0856 ( 8.30%) 128: 27701.5770 29914.5820 ( 7.99%) Note neither Kunpeng920 nor x86 Jacobsville supports SMT, so the SMT branch in the code has not been tested but it supposed to work. Chen Yu also noticed this will improve the performance of tbench and netperf on a 24 CPUs Jacobsville machine, there are 4 CPUs in one cluster sharing L2 Cache. [https://lore.kernel.org/lkml/[email protected]] Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Barry Song <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Tim Chen <[email protected]> Reviewed-by: Chen Yu <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Tested-and-reviewed-by: Chen Yu <[email protected]> Tested-by: Yicong Yang <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2023-10-24sched: Add cpus_share_resources APIBarry Song5-1/+40
Add cpus_share_resources() API. This is the preparation for the optimization of select_idle_cpu() on platforms with cluster scheduler level. On a machine with clusters cpus_share_resources() will test whether two cpus are within the same cluster. On a non-cluster machine it will behaves the same as cpus_share_cache(). So we use "resources" here for cache resources. Signed-off-by: Barry Song <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Reviewed-by: Tim Chen <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Tested-and-reviewed-by: Chen Yu <[email protected]> Tested-by: K Prateek Nayak <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2023-10-24sched/core: Fix RQCF_ACT_SKIP leakHao Jia1-4/+1
Igor Raits and Bagas Sanjaya report a RQCF_ACT_SKIP leak warning. This warning may be triggered in the following situations: CPU0 CPU1 __schedule() *rq->clock_update_flags <<= 1;* unregister_fair_sched_group() pick_next_task_fair+0x4a/0x410 destroy_cfs_bandwidth() newidle_balance+0x115/0x3e0 for_each_possible_cpu(i) *i=0* rq_unpin_lock(this_rq, rf) __cfsb_csd_unthrottle() raw_spin_rq_unlock(this_rq) rq_lock(*CPU0_rq*, &rf) rq_clock_start_loop_update() rq->clock_update_flags & RQCF_ACT_SKIP <-- raw_spin_rq_lock(this_rq) The purpose of RQCF_ACT_SKIP is to skip the update rq clock, but the update is very early in __schedule(), but we clear RQCF_*_SKIP very late, causing it to span that gap above and triggering this warning. In __schedule() we can clear the RQCF_*_SKIP flag immediately after update_rq_clock() to avoid this RQCF_ACT_SKIP leak warning. And set rq->clock_update_flags to RQCF_UPDATED to avoid rq->clock_update_flags < RQCF_ACT_SKIP warning that may be triggered later. Fixes: ebb83d84e49b ("sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()") Closes: https://lore.kernel.org/all/[email protected] Reported-by: Igor Raits <[email protected]> Reported-by: Bagas Sanjaya <[email protected]> Suggested-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Hao Jia <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/[email protected]
2023-10-23Merge tag 'pull-nfsd-fix' of ↵Linus Torvalds1-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull nfsd fix from Al Viro: "Catch from lock_rename() audit; nfsd_rename() checked that both directories belonged to the same filesystem, but only after having done lock_rename(). Trivial fix, tested and acked by nfs folks" * tag 'pull-nfsd-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nfsd: lock_rename() needs both directories to live on the same fs
2023-10-23Merge tag 'urgent/nolibc.2023.10.16a' of ↵Linus Torvalds3-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull nolibc fixes from Paul McKenney: - tools/nolibc: i386: Fix a stack misalign bug on _start - MAINTAINERS: nolibc: update tree location - tools/nolibc: mark start_c as weak to avoid linker errors * tag 'urgent/nolibc.2023.10.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: tools/nolibc: mark start_c as weak MAINTAINERS: nolibc: update tree location tools/nolibc: i386: Fix a stack misalign bug on _start
2023-10-23sfc: cleanup and reduce netlink error messagesPieter Jansen van Vuuren1-19/+19
Reduce the length of netlink error messages as they are likely to be truncated anyway. Additionally, reword netlink error messages so they are more consistent with previous messages. Fixes: 9dbc8d2b9a02 ("sfc: add decrement ipv6 hop limit by offloading set hop limit actions") Fixes: 3c9561c0a5b9 ("sfc: support TC decap rules matching on enc_ip_tos") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Edward Cree <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-23Merge tag 'mvebu-fixes-6.6-1' of ↵Arnd Bergmann1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes mvebu fixes for 6.6 (part 1) Update MAINTAINERS for eDPU board * tag 'mvebu-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: MAINTAINERS: uDPU: add remaining Methode boards MAINTAINERS: uDPU: make myself maintainer of it Link: https://lore.kernel.org/r/875y32abqe.fsf@BL-laptop Signed-off-by: Arnd Bergmann <[email protected]>
2023-10-23Merge tag 'for-6.6-rc7-tag' of ↵Linus Torvalds5-15/+33
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix for a problem with snapshot of a newly created subvolume that can lead to inconsistent data under some circumstances. Kernel 6.5 added a performance optimization to skip transaction commit for subvolume creation but this could end up with newer data on disk but not linked to other structures. The fix itself is an added condition, the rest of the patch is a parameter added to several functions" * tag 'for-6.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix unwritten extent buffer after snapshotting a new subvolume
2023-10-23Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds11-30/+121
Pull virtio fixes from Michael Tsirkin: "A collection of small fixes that look like worth having in this release" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_pci: fix the common cfg map size virtio-crypto: handle config changed by work queue vhost: Allow null msg.size on VHOST_IOTLB_INVALIDATE vdpa/mlx5: Fix firmware error on creation of 1k VQs virtio_balloon: Fix endless deflation and inflation on arm64 vdpa/mlx5: Fix double release of debugfs entry virtio-mmio: fix memory leak of vm_dev vdpa_sim_blk: Fix the potential leak of mgmt_dev tools/virtio: Add dma sync api for virtio test
2023-10-23EDAC/versal: Add a Xilinx Versal memory controller driverShubhrajyoti Datta5-0/+1101
Add a EDAC driver for the RAS capabilities on the Xilinx integrated DDR Memory Controllers (DDRMCs) which support both DDR4 and LPDDR4/4X memory interfaces. It has four programmable Network-on-Chip (NoC) interface ports and is designed to handle multiple streams of traffic. The driver reports correctable and uncorrectable errors, and also creates debugfs entries for testing through error injection. [ bp: - Add a pointer to the documentation about the register unlock code. - Squash in a fix for a Smatch static checker issue as reported by Dan Carpenter: https://lore.kernel.org/r/[email protected] ] Co-developed-by: Sai Krishna Potthuri <[email protected]> Signed-off-by: Sai Krishna Potthuri <[email protected]> Signed-off-by: Shubhrajyoti Datta <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-10-23net/handshake: fix file ref count in handshake_nl_accept_doit()Moritz Wanzenböck1-25/+5
If req->hr_proto->hp_accept() fail, we call fput() twice: Once in the error path, but also a second time because sock->file is at that point already associated with the file descriptor. Once the task exits, as it would probably do after receiving an error reading from netlink, the fd is closed, calling fput() a second time. To fix, we move installing the file after the error path for the hp_accept() call. In the case of errors we simply put the unused fd. In case of success we can use fd_install() to link the sock->file to the reserved fd. Fixes: 7ea9c1ec66bc ("net/handshake: Fix handshake_dup() ref counting") Signed-off-by: Moritz Wanzenböck <[email protected]> Reviewed-by: Chuck Lever <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-23scripts/faddr2line: Skip over mapping symbols in output from readelfWill Deacon1-0/+5
Mapping symbols emitted in the readelf output can confuse the 'faddr2line' symbol size calculation, resulting in the erroneous rejection of valid offsets. This is especially prevalent when building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are prefixed with a 32-bit data value in a '$d.n' section. For example: 447538: ffff800080014b80 548 FUNC GLOBAL DEFAULT 2 do_one_initcall 104: ffff800080014c74 0 NOTYPE LOCAL DEFAULT 2 $x.73 106: ffff800080014d30 0 NOTYPE LOCAL DEFAULT 2 $x.75 111: ffff800080014da4 0 NOTYPE LOCAL DEFAULT 2 $d.78 112: ffff800080014da8 0 NOTYPE LOCAL DEFAULT 2 $x.79 36: ffff800080014de0 200 FUNC LOCAL DEFAULT 2 run_init_process Adding a warning to do_one_initcall() results in: | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260 Which 'faddr2line' refuses to accept: $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260 skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224) no match for do_one_initcall+0xf4/0x260 Filter out these entries from readelf using a shell reimplementation of is_mapping_symbol(), so that the size of a symbol is calculated as a delta to the next symbol present in ksymtab. Suggested-by: Masahiro Yamada <[email protected]> Signed-off-by: Will Deacon <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Josh Poimboeuf <[email protected]>
2023-10-23scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1Will Deacon1-2/+15
GNU utilities cannot necessarily parse objects built by LLVM, which can result in confusing errors when using 'faddr2line': $ CROSS_COMPILE=aarch64-linux-gnu- ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260 aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn' aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25 do_one_initcall+0xf4/0x260: aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn' aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25 $x.73 at main.c:? Although this can be worked around by setting CROSS_COMPILE to "llvm=-", it's cleaner to follow the same syntax as the top-level Makefile and accept LLVM= as an indication to use the llvm- tools, optionally specifying their location or specific version number. Suggested-by: Masahiro Yamada <[email protected]> Signed-off-by: Will Deacon <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Josh Poimboeuf <[email protected]>
2023-10-23scripts/faddr2line: Don't filter out non-function symbols from readelfWill Deacon1-1/+1
As Josh points out in 20230724234734.zy67gm674vl3p3wv@treble: > Problem is, I think the kernel's symbol printing code prints the > nearest kallsyms symbol, and there are some valid non-FUNC code > symbols. For example, syscall_return_via_sysret. so we shouldn't be considering only 'FUNC'-type symbols in the output from readelf. Drop the function symbol type filtering from the faddr2line outer loop. Suggested-by: Josh Poimboeuf <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Link: https://lore.kernel.org/r/20230724234734.zy67gm674vl3p3wv@treble Signed-off-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Josh Poimboeuf <[email protected]>
2023-10-23btrfs: fix unwritten extent buffer after snapshotting a new subvolumeFilipe Manana5-15/+33
When creating a snapshot of a subvolume that was created in the current transaction, we can end up not persisting a dirty extent buffer that is referenced by the snapshot, resulting in IO errors due to checksum failures when trying to read the extent buffer later from disk. A sequence of steps that leads to this is the following: 1) At ioctl.c:create_subvol() we allocate an extent buffer, with logical address 36007936, for the leaf/root of a new subvolume that has an ID of 291. We mark the extent buffer as dirty, and at this point the subvolume tree has a single node/leaf which is also its root (level 0); 2) We no longer commit the transaction used to create the subvolume at create_subvol(). We used to, but that was recently removed in commit 1b53e51a4a8f ("btrfs: don't commit transaction for every subvol create"); 3) The transaction used to create the subvolume has an ID of 33, so the extent buffer 36007936 has a generation of 33; 4) Several updates happen to subvolume 291 during transaction 33, several files created and its tree height changes from 0 to 1, so we end up with a new root at level 1 and the extent buffer 36007936 is now a leaf of that new root node, which is extent buffer 36048896. The commit root remains as 36007936, since we are still at transaction 33; 5) Creation of a snapshot of subvolume 291, with an ID of 292, starts at ioctl.c:create_snapshot(). This triggers a commit of transaction 33 and we end up at transaction.c:create_pending_snapshot(), in the critical section of a transaction commit. There we COW the root of subvolume 291, which is extent buffer 36048896. The COW operation returns extent buffer 36048896, since there's no need to COW because the extent buffer was created in this transaction and it was not written yet. The we call btrfs_copy_root() against the root node 36048896. During this operation we allocate a new extent buffer to turn into the root node of the snapshot, copy the contents of the root node 36048896 into this snapshot root extent buffer, set the owner to 292 (the ID of the snapshot), etc, and then we call btrfs_inc_ref(). This will create a delayed reference for each leaf pointed by the root node with a reference root of 292 - this includes a reference for the leaf 36007936. After that we set the bit BTRFS_ROOT_FORCE_COW in the root's state. Then we call btrfs_insert_dir_item(), to create the directory entry in in the tree of subvolume 291 that points to the snapshot. This ends up needing to modify leaf 36007936 to insert the respective directory items. Because the bit BTRFS_ROOT_FORCE_COW is set for the root's state, we need to COW the leaf. We end up at btrfs_force_cow_block() and then at update_ref_for_cow(). At update_ref_for_cow() we call btrfs_block_can_be_shared() which returns false, despite the fact the leaf 36007936 is shared - the subvolume's root and the snapshot's root point to that leaf. The reason that it incorrectly returns false is because the commit root of the subvolume is extent buffer 36007936 - it was the initial root of the subvolume when we created it. So btrfs_block_can_be_shared() which has the following logic: int btrfs_block_can_be_shared(struct btrfs_root *root, struct extent_buffer *buf) { if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) && buf != root->node && buf != root->commit_root && (btrfs_header_generation(buf) <= btrfs_root_last_snapshot(&root->root_item) || btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC))) return 1; return 0; } Returns false (0) since 'buf' (extent buffer 36007936) matches the root's commit root. As a result, at update_ref_for_cow(), we don't check for the number of references for extent buffer 36007936, we just assume it's not shared and therefore that it has only 1 reference, so we set the local variable 'refs' to 1. Later on, in the final if-else statement at update_ref_for_cow(): static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct extent_buffer *buf, struct extent_buffer *cow, int *last_ref) { (...) if (refs > 1) { (...) } else { (...) btrfs_clear_buffer_dirty(trans, buf); *last_ref = 1; } } So we mark the extent buffer 36007936 as not dirty, and as a result we don't write it to disk later in the transaction commit, despite the fact that the snapshot's root points to it. Attempting to access the leaf or dumping the tree for example shows that the extent buffer was not written: $ btrfs inspect-internal dump-tree -t 292 /dev/sdb btrfs-progs v6.2.2 file tree key (292 ROOT_ITEM 33) node 36110336 level 1 items 2 free space 119 generation 33 owner 292 node 36110336 flags 0x1(WRITTEN) backref revision 1 checksum stored a8103e3e checksum calced a8103e3e fs uuid 90c9a46f-ae9f-4626-9aff-0cbf3e2e3a79 chunk uuid e8c9c885-78f4-4d31-85fe-89e5f5fd4a07 key (256 INODE_ITEM 0) block 36007936 gen 33 key (257 EXTENT_DATA 0) block 36052992 gen 33 checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 total bytes 107374182400 bytes used 38572032 uuid 90c9a46f-ae9f-4626-9aff-0cbf3e2e3a79 The respective on disk region is full of zeroes as the device was trimmed at mkfs time. Obviously 'btrfs check' also detects and complains about this: $ btrfs check /dev/sdb Opening filesystem to check... Checking filesystem on /dev/sdb UUID: 90c9a46f-ae9f-4626-9aff-0cbf3e2e3a79 generation: 33 (33) [1/7] checking root items [2/7] checking extents checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 bad tree block 36007936, bytenr mismatch, want=36007936, have=0 owner ref check failed [36007936 4096] ERROR: errors found in extent allocation tree or chunk allocation [3/7] checking free space tree [4/7] checking fs roots checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29 bad tree block 36007936, bytenr mismatch, want=36007936, have=0 The following tree block(s) is corrupted in tree 292: tree block bytenr: 36110336, level: 1, node key: (256, 1, 0) root 292 root dir 256 not found ERROR: errors found in fs roots found 38572032 bytes used, error(s) found total csum bytes: 16048 total tree bytes: 1265664 total fs tree bytes: 1118208 total extent tree bytes: 65536 btree space waste bytes: 562598 file data blocks allocated: 65978368 referenced 36569088 Fix this by updating btrfs_block_can_be_shared() to consider that an extent buffer may be shared if it matches the commit root and if its generation matches the current transaction's generation. This can be reproduced with the following script: $ cat test.sh #!/bin/bash MNT=/mnt/sdi DEV=/dev/sdi # Use a filesystem with a 64K node size so that we have the same node # size on every machine regardless of its page size (on x86_64 default # node size is 16K due to the 4K page size, while on PPC it's 64K by # default). This way we can make sure we are able to create a btree for # the subvolume with a height of 2. mkfs.btrfs -f -n 64K $DEV mount $DEV $MNT btrfs subvolume create $MNT/subvol # Create a few empty files on the subvolume, this bumps its btree # height to 2 (root node at level 1 and 2 leaves). for ((i = 1; i <= 300; i++)); do echo -n > $MNT/subvol/file_$i done btrfs subvolume snapshot -r $MNT/subvol $MNT/subvol/snap umount $DEV btrfs check $DEV Running it on a 6.5 kernel (or any 6.6-rc kernel at the moment): $ ./test.sh Create subvolume '/mnt/sdi/subvol' Create a readonly snapshot of '/mnt/sdi/subvol' in '/mnt/sdi/subvol/snap' Opening filesystem to check... Checking filesystem on /dev/sdi UUID: bbdde2ff-7d02-45ca-8a73-3c36f23755a1 [1/7] checking root items [2/7] checking extents parent transid verify failed on 30539776 wanted 7 found 5 parent transid verify failed on 30539776 wanted 7 found 5 parent transid verify failed on 30539776 wanted 7 found 5 Ignoring transid failure owner ref check failed [30539776 65536] ERROR: errors found in extent allocation tree or chunk allocation [3/7] checking free space tree [4/7] checking fs roots parent transid verify failed on 30539776 wanted 7 found 5 Ignoring transid failure Wrong key of child node/leaf, wanted: (256, 1, 0), have: (2, 132, 0) Wrong generation of child node/leaf, wanted: 5, have: 7 root 257 root dir 256 not found ERROR: errors found in fs roots found 917504 bytes used, error(s) found total csum bytes: 0 total tree bytes: 851968 total fs tree bytes: 393216 total extent tree bytes: 65536 btree space waste bytes: 736550 file data blocks allocated: 0 referenced 0 A test case for fstests will follow soon. Fixes: 1b53e51a4a8f ("btrfs: don't commit transaction for every subvol create") CC: [email protected] # 6.5+ Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2023-10-23Merge branches 'rcu/torture', 'rcu/fixes', 'rcu/docs', 'rcu/refscale', ↵Frederic Weisbecker35-229/+488
'rcu/tasks' and 'rcu/stall' into rcu/next rcu/torture: RCU torture, locktorture and generic torture infrastructure rcu/fixes: Generic and misc fixes rcu/docs: RCU documentation updates rcu/refscale: RCU reference scalability test updates rcu/tasks: RCU tasks updates rcu/stall: Stall detection updates
2023-10-23drm/amdkfd: reserve a fence slot while locking the BOChristian König1-1/+1
Looks like the KFD still needs this. Signed-off-by: Christian König <[email protected]> Fixes: 8abc1eb2987a ("drm/amdkfd: switch over to using drm_exec v3") Acked-by: Alex Deucher <[email protected]> Acked-by: Felix Kuehling <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-23powerpc/mm: Fix boot crash with FLATMEMMichael Ellerman2-1/+2
Erhard reported that his G5 was crashing with v6.6-rc kernels: mpic: Setting up HT PICs workarounds for U3/U4 BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe Faulting instruction address: 0xc00000000005dc40 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G T 6.6.0-rc3-PMacGS #1 Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac NIP: c00000000005dc40 LR: c000000000066660 CTR: c000000000007730 REGS: c0000000022bf510 TRAP: 0380 Tainted: G T (6.6.0-rc3-PMacGS) MSR: 9000000000001032 <SF,HV,ME,IR,DR,RI> CR: 44004242 XER: 00000000 IRQMASK: 3 GPR00: 0000000000000000 c0000000022bf7b0 c0000000010c0b00 00000000000001ac GPR04: 0000000003c80000 0000000000000300 c0000000f20001ae 0000000000000300 GPR08: 0000000000000006 feffbb62ffec65ff 0000000000000001 0000000000000000 GPR12: 9000000000001032 c000000002362000 c000000000f76b80 000000000349ecd8 GPR16: 0000000002367ba8 0000000002367f08 0000000000000006 0000000000000000 GPR20: 00000000000001ac c000000000f6f920 c0000000022cd985 000000000000000c GPR24: 0000000000000300 00000003b0a3691d c0003e008030000e 0000000000000000 GPR28: c00000000000000c c0000000f20001ee feffbb62ffec65fe 00000000000001ac NIP hash_page_do_lazy_icache+0x50/0x100 LR __hash_page_4K+0x420/0x590 Call Trace: hash_page_mm+0x364/0x6f0 do_hash_fault+0x114/0x2b0 data_access_common_virt+0x198/0x1f0 --- interrupt: 300 at mpic_init+0x4bc/0x10c4 NIP: c000000002020a5c LR: c000000002020a04 CTR: 0000000000000000 REGS: c0000000022bf9f0 TRAP: 0300 Tainted: G T (6.6.0-rc3-PMacGS) MSR: 9000000000001032 <SF,HV,ME,IR,DR,RI> CR: 24004248 XER: 00000000 DAR: c0003e008030000e DSISR: 40000000 IRQMASK: 1 ... NIP mpic_init+0x4bc/0x10c4 LR mpic_init+0x464/0x10c4 --- interrupt: 300 pmac_setup_one_mpic+0x258/0x2dc pmac_pic_init+0x28c/0x3d8 init_IRQ+0x90/0x140 start_kernel+0x57c/0x78c start_here_common+0x1c/0x20 A bisect pointed to the breakage beginning with commit 9fee28baa601 ("powerpc: implement the new page table range API"). Analysis of the oops pointed to a struct page with a corrupted compound_head being loaded via page_folio() -> _compound_head() in hash_page_do_lazy_icache(). The access by the mpic code is to an MMIO address, so the expectation is that the struct page for that address would be initialised by init_unavailable_range(), as pointed out by Aneesh. Instrumentation showed that was not the case, which eventually lead to the realisation that pfn_valid() was returning false for that address, causing the struct page to not be initialised. Because the system is using FLATMEM, the version of pfn_valid() in memory_model.h is used: static inline int pfn_valid(unsigned long pfn) { ... return pfn >= pfn_offset && (pfn - pfn_offset) < max_mapnr; } Which relies on max_mapnr being initialised. Early in boot max_mapnr is zero meaning no PFNs are valid. max_mapnr is initialised in mem_init() called via: start_kernel() mm_core_init() # init/main.c:928 mem_init() But that is too late for the usage in init_unavailable_range() called via: start_kernel() setup_arch() # init/main.c:893 paging_init() free_area_init() init_unavailable_range() Although max_mapnr is currently set in mem_init(), the value is actually already available much earlier, as soon as mem_topology_setup() has completed, which is also before paging_init() is called. So move the initialisation there, which causes paging_init() to correctly initialise the struct page and fixes the bug. This bug seems to have been lurking for years, but went unnoticed because the pre-folio code was inspecting the uninitialised page->flags but not dereferencing it. Thanks to Erhard and Aneesh for help debugging. Reported-by: Erhard Furtner <[email protected]> Closes: https://lore.kernel.org/all/20230929132750.3cd98452@yea/ Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
2023-10-23wifi: mac80211: don't drop all unprotected public action framesAvraham Stern2-2/+30
Not all public action frames have a protected variant. When MFP is enabled drop only public action frames that have a dual protected variant. Fixes: 76a3059cf124 ("wifi: mac80211: drop some unprotected action frames") Signed-off-by: Avraham Stern <[email protected]> Signed-off-by: Gregory Greenman <[email protected]> Link: https://lore.kernel.org/r/20231016145213.2973e3c8d3bb.I6198b8d3b04cf4a97b06660d346caec3032f232a@changeid Signed-off-by: Johannes Berg <[email protected]>
2023-10-23wifi: cfg80211: fix assoc response warning on failed linksJohannes Berg1-1/+2
The warning here shouldn't be done before we even set the bss field (or should've used the input data). Move the assignment before the warning to fix it. We noticed this now because of Wen's bugfix, where the bug fixed there had previously hidden this other bug. Fixes: 53ad07e9823b ("wifi: cfg80211: support reporting failed links") Signed-off-by: Johannes Berg <[email protected]>
2023-10-23wifi: cfg80211: pass correct pointer to rdev_inform_bss()Ben Greear1-1/+1
Confusing struct member names here resulted in passing the wrong pointer, causing crashes. Pass the correct one. Fixes: eb142608e2c4 ("wifi: cfg80211: use a struct for inform_single_bss data") Signed-off-by: Ben Greear <[email protected]> Link: https://lore.kernel.org/r/[email protected] [rewrite commit message, add fixes] Signed-off-by: Johannes Berg <[email protected]>
2023-10-23Merge tag 'v6.6-rc7' into sched/core, to pick up fixesIngo Molnar831-3953/+8728
Pick up recent sched/urgent fixes merged upstream. Signed-off-by: Ingo Molnar <[email protected]>
2023-10-23isdn: mISDN: hfcsusb: Spelling fix in commentKunwu Chan1-1/+1
protocoll -> protocol Signed-off-by: Kunwu Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-22Linux 6.6-rc7Linus Torvalds1-1/+1
2023-10-22bcachefs: Refactor memcpy into direct assignmentKees Cook1-6/+4
The memcpy() in bch2_bkey_append_ptr() is operating on an embedded fake flexible array which looks to the compiler like it has 0 size. This causes W=1 builds to emit warnings due to -Wstringop-overflow: In file included from include/linux/string.h:254, from include/linux/bitmap.h:11, from include/linux/cpumask.h:12, from include/linux/smp.h:13, from include/linux/lockdep.h:14, from include/linux/radix-tree.h:14, from include/linux/backing-dev-defs.h:6, from fs/bcachefs/bcachefs.h:182: fs/bcachefs/extents.c: In function 'bch2_bkey_append_ptr': include/linux/fortify-string.h:57:33: warning: writing 8 bytes into a region of size 0 [-Wstringop-overflow=] 57 | #define __underlying_memcpy __builtin_memcpy | ^ include/linux/fortify-string.h:648:9: note: in expansion of macro '__underlying_memcpy' 648 | __underlying_##op(p, q, __fortify_size); \ | ^~~~~~~~~~~~~ include/linux/fortify-string.h:693:26: note: in expansion of macro '__fortify_memcpy_chk' 693 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ | ^~~~~~~~~~~~~~~~~~~~ fs/bcachefs/extents.c:235:17: note: in expansion of macro 'memcpy' 235 | memcpy((void *) &k->v + bkey_val_bytes(&k->k), | ^~~~~~ fs/bcachefs/bcachefs_format.h:287:33: note: destination object 'v' of size 0 287 | struct bch_val v; | ^ Avoid making any structure changes and just replace the u64 copy into a direct assignment, side-stepping the entire problem. Cc: Kent Overstreet <[email protected]> Cc: Brian Foster <[email protected]> Cc: [email protected] Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Fix drop_alloc_keys()Kent Overstreet1-15/+15
For consistency with the rest of the reconstruct_alloc option, we should be skipping all alloc keys. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: snapshot_create_lockKent Overstreet5-7/+44
Add a new lock for snapshot creation - this addresses a few races with logged operations and snapshot deletion. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Fix snapshot skiplists during snapshot deletionKent Overstreet1-0/+3
In snapshot deleion, we have to pick new skiplist nodes for entries that point to nodes being deleted. The function that finds a new skiplist node, skipping over entries being deleted, was incorrect: if n = 0, but the parent node is being deleted, we also need to skip over that node. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: bch2_sb_field_get() refactoringKent Overstreet13-82/+71
Instead of using token pasting to generate methods for each superblock section, just make the type a parameter to bch2_sb_field_get(). Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: KEY_TYPE_error now counts towards i_sectorsKent Overstreet1-0/+1
KEY_TYPE_error is used when all replicas in an extent are marked as failed; it indicates that data was present, but has been lost. So that i_sectors doesn't change when replacing extents with KEY_TYPE_error, we now have to count error keys as allocations - this fixes fsck errors later. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Fix handling of unknown bkey typesKent Overstreet1-1/+0
min_val_size was U8_MAX for unknown key types, causing us to flag any known key as invalid - it should have been 0. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Switch to unsafe_memcpy() in a few placesKent Overstreet2-5/+8
The new fortify checking doesn't work for us in all places; this switches to unsafe_memcpy() where appropriate to silence a few warnings/errors. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Use struct_size()Christophe JAILLET2-3/+2
Use struct_size() instead of hand writing it. This is less verbose and more robust. While at it, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Correctly initialize new buckets on device resizeKent Overstreet3-9/+27
bch2_dev_resize() was never updated for the allocator rewrite with persistent freelists, and it wasn't noticed because the tests weren't running fsck - oops. Fix this by running bch2_dev_freespace_init() for the new buckets. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Fix another smatch complaintKent Overstreet1-1/+1
This should be harmless, but initialize last_seq anyways. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Use strsep() in split_devs()Kent Overstreet1-4/+2
Minor refactoring to fix a smatch complaint. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-22bcachefs: Add iops fields to bch_memberHunter Shaffer4-7/+35
Signed-off-by: Hunter Shaffer <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>