Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix a memory leak in cachefiles
- Restrict aio cancellations to I/O submitted through the aio
interfaces as this is otherwise causing issues for I/O submitted
via io_uring
- Increase buffer for afs volume status to avoid overflow
- Fix a missing zero-length check in unbuffered writes in the
netfs library. If generic_write_checks() returns zero make
netfs_unbuffered_write_iter() return right away
- Prevent a leak in i_dio_count caused by netfs_begin_read() operating
past i_size. It will return early and leave i_dio_count incremented
- Account for ipv4 addresses as well as ipv6 addresses when processing
incoming callbacks in afs
* tag 'vfs-6.8-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio
afs: Increase buffer size in afs_update_volume_status()
afs: Fix ignored callbacks over ipv4
cachefiles: fix memory leak in cachefiles_add_cache()
netfs: Fix missing zero-length check in unbuffered write
netfs: Fix i_dio_count leak on DIO read past i_size
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - regressions:
- af_unix: fix another unix GC hangup
Previous releases - regressions:
- core: fix a possible AF_UNIX deadlock
- bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()
- netfilter: nft_flow_offload: release dst in case direct xmit path
is used
- bridge: switchdev: ensure MDB events are delivered exactly once
- l2tp: pass correct message length to ip6_append_data
- dccp/tcp: unhash sk from ehash for tb2 alloc failure after
check_estalblished()
- tls: fixes for record type handling with PEEK
- devlink: fix possible use-after-free and memory leaks in
devlink_init()
Previous releases - always broken:
- bpf: fix an oops when attempting to read the vsyscall page through
bpf_probe_read_kernel
- sched: act_mirred: use the backlog for mirred ingress
- netfilter: nft_flow_offload: fix dst refcount underflow
- ipv6: sr: fix possible use-after-free and null-ptr-deref
- mptcp: fix several data races
- phonet: take correct lock to peek at the RX queue
Misc:
- handful of fixes and reliability improvements for selftests"
* tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
l2tp: pass correct message length to ip6_append_data
net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY
selftests: ioam: refactoring to align with the fix
Fix write to cloned skb in ipv6_hop_ioam()
phonet/pep: fix racy skb_queue_empty() use
phonet: take correct lock to peek at the RX queue
net: sparx5: Add spinlock for frame transmission from CPU
net/sched: flower: Add lock protection when remove filter handle
devlink: fix port dump cmd type
net: stmmac: Fix EST offset for dwmac 5.10
tools: ynl: don't leak mcast_groups on init error
tools: ynl: make sure we always pass yarg to mnl_cb_run
net: mctp: put sock on tag allocation failure
netfilter: nf_tables: use kzalloc for hook allocation
netfilter: nf_tables: register hooks last when adding new chain/flowtable
netfilter: nft_flow_offload: release dst in case direct xmit path is used
netfilter: nft_flow_offload: reset dst in route object after setting up flow
netfilter: nf_tables: set dormant flag on hook register failure
selftests: tls: add test for peeking past a record of a different type
selftests: tls: add test for merging of same-type control messages
...
|
|
On Tegra186, secure world applications may need to access host1x
during suspend/resume, and rely on the kernel to keep Host1x out
of reset during the suspend cycle. As such, as a quirk,
skip asserting Host1x's reset on Tegra186.
We don't need to keep the clocks enabled, as BPMP ensures the clock
stays on while Host1x is being used. On newer SoC's, the reset line
is inaccessible, so there is no need for the quirk.
Fixes: b7c00cdf6df5 ("gpu: host1x: Enable system suspend callbacks")
Signed-off-by: Mikko Perttunen <[email protected]>
Reviewed-by: Jon Hunter <[email protected]>
Tested-by: Jon Hunter <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Don't set power state flag when system enter runtime suspend,
or it may cause runtime resume failure issue.
Fixes: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend")
Signed-off-by: Ma Jun <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Use i2c adapter when there isn't aux_mode in dc_link to fix a
null-pointer derefence that happens when running
igt@kms_force_connector_basic in a system with DCN2.1 and HDMI connector
detected as below:
[ +0.178146] BUG: kernel NULL pointer dereference, address: 00000000000004c0
[ +0.000010] #PF: supervisor read access in kernel mode
[ +0.000005] #PF: error_code(0x0000) - not-present page
[ +0.000004] PGD 0 P4D 0
[ +0.000006] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000006] CPU: 15 PID: 2368 Comm: kms_force_conne Not tainted 6.5.0-asdn+ #152
[ +0.000005] Hardware name: HP HP ENVY x360 Convertible 13-ay1xxx/8929, BIOS F.01 07/14/2021
[ +0.000004] RIP: 0010:i2c_transfer+0xd/0x100
[ +0.000011] Code: ea fc ff ff 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 55 53 <48> 8b 47 10 48 89 fb 48 83 38 00 0f 84 b3 00 00 00 83 3d 2f 80 16
[ +0.000004] RSP: 0018:ffff9c4f89c0fad0 EFLAGS: 00010246
[ +0.000005] RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000080
[ +0.000003] RDX: 0000000000000002 RSI: ffff9c4f89c0fb20 RDI: 00000000000004b0
[ +0.000003] RBP: ffff9c4f89c0fb80 R08: 0000000000000080 R09: ffff8d8e0b15b980
[ +0.000003] R10: 00000000000380e0 R11: 0000000000000000 R12: 0000000000000080
[ +0.000002] R13: 0000000000000002 R14: ffff9c4f89c0fb0e R15: ffff9c4f89c0fb0f
[ +0.000004] FS: 00007f9ad2176c40(0000) GS:ffff8d90fe9c0000(0000) knlGS:0000000000000000
[ +0.000003] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000004] CR2: 00000000000004c0 CR3: 0000000121bc4000 CR4: 0000000000750ee0
[ +0.000003] PKRU: 55555554
[ +0.000003] Call Trace:
[ +0.000006] <TASK>
[ +0.000006] ? __die+0x23/0x70
[ +0.000011] ? page_fault_oops+0x17d/0x4c0
[ +0.000008] ? preempt_count_add+0x6e/0xa0
[ +0.000008] ? srso_alias_return_thunk+0x5/0x7f
[ +0.000011] ? exc_page_fault+0x7f/0x180
[ +0.000009] ? asm_exc_page_fault+0x26/0x30
[ +0.000013] ? i2c_transfer+0xd/0x100
[ +0.000010] drm_do_probe_ddc_edid+0xc2/0x140 [drm]
[ +0.000067] ? srso_alias_return_thunk+0x5/0x7f
[ +0.000006] ? _drm_do_get_edid+0x97/0x3c0 [drm]
[ +0.000043] ? __pfx_drm_do_probe_ddc_edid+0x10/0x10 [drm]
[ +0.000042] edid_block_read+0x3b/0xd0 [drm]
[ +0.000043] _drm_do_get_edid+0xb6/0x3c0 [drm]
[ +0.000041] ? __pfx_drm_do_probe_ddc_edid+0x10/0x10 [drm]
[ +0.000043] drm_edid_read_custom+0x37/0xd0 [drm]
[ +0.000044] amdgpu_dm_connector_mode_valid+0x129/0x1d0 [amdgpu]
[ +0.000153] drm_connector_mode_valid+0x3b/0x60 [drm_kms_helper]
[ +0.000000] __drm_helper_update_and_validate+0xfe/0x3c0 [drm_kms_helper]
[ +0.000000] ? amdgpu_dm_connector_get_modes+0xb6/0x520 [amdgpu]
[ +0.000000] ? srso_alias_return_thunk+0x5/0x7f
[ +0.000000] drm_helper_probe_single_connector_modes+0x2ab/0x540 [drm_kms_helper]
[ +0.000000] status_store+0xb2/0x1f0 [drm]
[ +0.000000] kernfs_fop_write_iter+0x136/0x1d0
[ +0.000000] vfs_write+0x24d/0x440
[ +0.000000] ksys_write+0x6f/0xf0
[ +0.000000] do_syscall_64+0x60/0xc0
[ +0.000000] ? srso_alias_return_thunk+0x5/0x7f
[ +0.000000] ? syscall_exit_to_user_mode+0x2b/0x40
[ +0.000000] ? srso_alias_return_thunk+0x5/0x7f
[ +0.000000] ? do_syscall_64+0x6c/0xc0
[ +0.000000] ? do_syscall_64+0x6c/0xc0
[ +0.000000] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ +0.000000] RIP: 0033:0x7f9ad46b4b00
[ +0.000000] Code: 40 00 48 8b 15 19 b3 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d e1 3a 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
[ +0.000000] RSP: 002b:00007ffcbd3bd6d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ +0.000000] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9ad46b4b00
[ +0.000000] RDX: 0000000000000002 RSI: 00007f9ad48a7417 RDI: 0000000000000009
[ +0.000000] RBP: 0000000000000002 R08: 0000000000000064 R09: 0000000000000000
[ +0.000000] R10: 0000000000000000 R11: 0000000000000202 R12: 00007f9ad48a7417
[ +0.000000] R13: 0000000000000009 R14: 00007ffcbd3bd760 R15: 0000000000000001
[ +0.000000] </TASK>
[ +0.000000] Modules linked in: ctr ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel btmtk bluetooth uvcvideo videobuf2_vmalloc sha3_generic videobuf2_memops uvc jitterentropy_rng videobuf2_v4l2 videodev drbg videobuf2_common ansi_cprng mc ecdh_generic ecc qrtr binfmt_misc hid_sensor_accel_3d hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf industrialio snd_ctl_led joydev hid_sensor_iio_common rtw89_8852ae rtw89_8852a rtw89_pci snd_hda_codec_realtek rtw89_core snd_hda_codec_generic intel_rapl_msr ledtrig_audio intel_rapl_common snd_hda_codec_hdmi mac80211 snd_hda_intel snd_intel_dspcfg kvm_amd snd_hda_codec snd_soc_dmic snd_acp3x_rn snd_acp3x_pdm_dma libarc4 snd_hwdep snd_soc_core kvm snd_hda_core cfg80211 snd_pci_acp6x snd_pcm nls_ascii snd_timer hp_wmi snd_pci_acp5x nls_cp437 snd_rn_pci_acp3x ucsi_acpi sparse_keymap ccp snd platform_profile snd_acp_config typec_ucsi irqbypass vfat sp5100_tco
[ +0.000000] snd_soc_acpi fat rapl pcspkr wmi_bmof roles rfkill rng_core snd_pci_acp3x soundcore k10temp watchdog typec battery ac amd_pmc acpi_tad button hid_sensor_hub hid_multitouch evdev serio_raw msr parport_pc ppdev lp parport fuse loop efi_pstore configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic dm_crypt dm_mod efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c crc32c_generic xor raid6_pq raid1 raid0 multipath linear md_mod amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm crc32_pclmul crc32c_intel drm_exec gpu_sched drm_suballoc_helper nvme ghash_clmulni_intel drm_buddy drm_display_helper sha512_ssse3 nvme_core ahci xhci_pci sha512_generic hid_generic xhci_hcd libahci rtsx_pci_sdmmc t10_pi i2c_hid_acpi drm_kms_helper i2c_hid mmc_core libata aesni_intel crc64_rocksoft_generic crypto_simd amd_sfh crc64_rocksoft scsi_mod usbcore cryptd crc_t10dif cec drm crct10dif_generic hid rtsx_pci crct10dif_pclmul scsi_common rc_core crc64 i2c_piix4
[ +0.000000] usb_common crct10dif_common video wmi
[ +0.000000] CR2: 00000000000004c0
[ +0.000000] ---[ end trace 0000000000000000 ]---
Fixes: 0e859faf8670 ("drm/amd/display: Remove unwanted drm edid references")
Signed-off-by: Melissa Wen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
After destroying dmub_srv, the memory associated with it is
not freed, causing a memory leak:
unreferenced object 0xffff896302b45800 (size 1024):
comm "(udev-worker)", pid 222, jiffies 4294894636
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 6265fd77):
[<ffffffff993495ed>] kmalloc_trace+0x29d/0x340
[<ffffffffc0ea4a94>] dm_dmub_sw_init+0xb4/0x450 [amdgpu]
[<ffffffffc0ea4e55>] dm_sw_init+0x15/0x2b0 [amdgpu]
[<ffffffffc0ba8557>] amdgpu_device_init+0x1417/0x24e0 [amdgpu]
[<ffffffffc0bab285>] amdgpu_driver_load_kms+0x15/0x190 [amdgpu]
[<ffffffffc0ba09c7>] amdgpu_pci_probe+0x187/0x4e0 [amdgpu]
[<ffffffff9968fd1e>] local_pci_probe+0x3e/0x90
[<ffffffff996918a3>] pci_device_probe+0xc3/0x230
[<ffffffff99805872>] really_probe+0xe2/0x480
[<ffffffff99805c98>] __driver_probe_device+0x78/0x160
[<ffffffff99805daf>] driver_probe_device+0x1f/0x90
[<ffffffff9980601e>] __driver_attach+0xce/0x1c0
[<ffffffff99803170>] bus_for_each_dev+0x70/0xc0
[<ffffffff99804822>] bus_add_driver+0x112/0x210
[<ffffffff99807245>] driver_register+0x55/0x100
[<ffffffff990012d1>] do_one_initcall+0x41/0x300
Fix this by freeing dmub_srv after destroying it.
Fixes: 743b9786b14a ("drm/amd/display: Hook up the DMUB service in DM")
Signed-off-by: Armin Wolf <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Currently there is an error while translating input clock sates into
output clock states. The highest fclk setting from output sates is
being dropped because of this error.
[How]
For dcn35 and dcn351, make output_states equal to input states.
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Swapnil Patel <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fixes potential null pointer dereference warnings in the
dc_dmub_srv_cmd_list_queue_execute() and dc_dmub_srv_is_hw_pwr_up()
functions.
In both functions, the 'dc_dmub_srv' variable was being dereferenced
before it was checked for null. This could lead to a null pointer
dereference if 'dc_dmub_srv' is null. The fix is to check if
'dc_dmub_srv' is null before dereferencing it.
Thus moving the null checks for 'dc_dmub_srv' to the beginning of the
functions to ensure that 'dc_dmub_srv' is not null when it is
dereferenced.
Found by smatch & thus fixing the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:133 dc_dmub_srv_cmd_list_queue_execute() warn: variable dereferenced before check 'dc_dmub_srv' (see line 128)
drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:1167 dc_dmub_srv_is_hw_pwr_up() warn: variable dereferenced before check 'dc_dmub_srv' (see line 1164)
Fixes: 028bac583449 ("drm/amd/display: decouple dmcub execution to reduce lock granularity")
Fixes: 65138eb72e1f ("drm/amd/display: Add DCN35 DMUB")
Cc: JinZe.Xu <[email protected]>
Cc: Hersen Wu <[email protected]>
Cc: Josip Pavic <[email protected]>
Cc: Roman Li <[email protected]>
Cc: Qingqing Zhuo <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Rodrigo Siqueira <[email protected]>
Cc: Aurabindo Pillai <[email protected]>
Cc: Tom Chung <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Tom Chung <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
- While working on the ring buffer I noticed that the counter used for
knowing where the end of the data is on a sub-buffer was not a full
"int" but just 20 bits. It was masked out to 0xfffff.
With the new code that allows the user to change the size of the
sub-buffer, it is theoretically possible to ask for a size bigger
than 2^20. If that happens, unexpected results may occur as there's
no code checking if the counter overflowed the 20 bits of the write
mask. There are other checks to make sure events fit in the
sub-buffer, but if the sub-buffer itself is too big, that is not
checked.
Add a check in the resize of the sub-buffer to make sure that it
never goes beyond the size of the counter that holds how much data is
on it.
* tag 'trace-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Do not let subbuf be bigger than write mask
|
|
[Why]
The old asic only have 1 pwrseq hw.
We don't need to map the diginst to pwrseq inst in old asic.
[How]
1. Only mapping dig to pwrseq for new asic.
2. Move mapping function into dcn specific panel control component
Cc: Stable <[email protected]> # v6.6+
Cc: Mario Limonciello <[email protected]>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3122
Reviewed-by: Anthony Koo <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Lewis Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Observe error message "Can't retrieve aconnector in hpd_rx_irq_offload_work"
when boot up with a mst tbt4 dock connected. After analyzing, there are few
parts needed to be adjusted:
1. hpd_rx_offload_wq[].aconnector is not initialzed before the dmub outbox
hpd_irq handler get registered which causes the error message.
2. registeration of hpd and hpd_rx_irq event for usb4 dp tunneling is not
aligned with legacy interface sequence
[How]
Put DMUB_NOTIFICATION_HPD and DMUB_NOTIFICATION_HPD_IRQ handler
registration into register_hpd_handlers() to align other interfaces and
get hpd_rx_offload_wq[].aconnector initialized earlier than that.
Leave DMUB_NOTIFICATION_AUX_REPLY registered as it was since we need that
while calling dc_link_detect(). USB4 connection status will be proactively
detected by dc_link_detect_connection_type() in amdgpu_dm_initialize_drm_device()
Cc: Stable <[email protected]>
Reviewed-by: Aurabindo Pillai <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Wayne Lin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The s390 common I/O layer (CIO) returns an unexpected -EBUSY return code
when drivers try to start I/O while a path-verification (PV) process is
pending. This can lead to failed device initialization attempts with
symptoms like broken network connectivity after boot.
Fix this by replacing the -EBUSY return code with a deferred condition
code 1 reply to make path-verification handling consistent from a
driver's point of view.
The problem can be reproduced semi-regularly using the following process,
while repeating steps 2-3 as necessary (example assumes an OSA device
with bus-IDs 0.0.a000-0.0.a002 on CHPID 0.02):
1. echo 0.0.a000,0.0.a001,0.0.a002 >/sys/bus/ccwgroup/drivers/qeth/group
2. echo 0 > /sys/bus/ccwgroup/devices/0.0.a000/online
3. echo 1 > /sys/bus/ccwgroup/devices/0.0.a000/online ; \
echo on > /sys/devices/css0/chp0.02/status
Background information:
The common I/O layer starts path-verification I/Os when it receives
indications about changes in a device path's availability. This occurs
for example when hardware events indicate a change in channel-path
status, or when a manual operation such as a CHPID vary or configure
operation is performed.
If a driver attempts to start I/O while a PV is running, CIO reports a
successful I/O start (ccw_device_start() return code 0). Then, after
completion of PV, CIO synthesizes an interrupt response that indicates
an asynchronous status condition that prevented the start of the I/O
(deferred condition code 1).
If a PV indication arrives while a device is busy with driver-owned I/O,
PV is delayed until after I/O completion was reported to the driver's
interrupt handler. To ensure that PV can be started eventually, CIO
reports a device busy condition (ccw_device_start() return code -EBUSY)
if a driver tries to start another I/O while PV is pending.
In some cases this -EBUSY return code causes device drivers to consider
a device not operational, resulting in failed device initialization.
Note: The code that introduced the problem was added in 2003. Symptoms
started appearing with the following CIO commit that causes a PV
indication when a device is removed from the cio_ignore list after the
associated parent subchannel device was probed, but before online
processing of the CCW device has started:
2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")
During boot, the cio_ignore list is modified by the cio_ignore dracut
module [1] as well as Linux vendor-specific systemd service scripts[2].
When combined, this commit and boot scripts cause a frequent occurrence
of the problem during boot.
[1] https://github.com/dracutdevs/dracut/tree/master/modules.d/81cio_ignore
[2] https://github.com/SUSE/s390-tools/blob/master/cio_ignore.service
Cc: [email protected] # v5.15+
Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")
Tested-By: Thorsten Winkler <[email protected]>
Reviewed-by: Thorsten Winkler <[email protected]>
Signed-off-by: Peter Oberparleiter <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
event log cache to user triggers a kernel bug.
[ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
[ 1987.170845] ------------[ cut here ]------------
[ 1987.176086] kernel BUG at mm/usercopy.c:102!
[ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
[ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
[ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
[ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
[ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
[ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
[ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
[ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
[ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
[ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
[ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
[ 1987.284710] FS: 0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
[ 1987.293850] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
[ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1987.324527] PKRU: 55555554
[ 1987.327622] Call Trace:
[ 1987.330424] <TASK>
[ 1987.332826] ? show_regs+0x6e/0x80
[ 1987.336703] ? die+0x3c/0xa0
[ 1987.339988] ? do_trap+0xd4/0xf0
[ 1987.343662] ? do_error_trap+0x75/0xa0
[ 1987.347922] ? usercopy_abort+0x72/0x90
[ 1987.352277] ? exc_invalid_op+0x57/0x80
[ 1987.356634] ? usercopy_abort+0x72/0x90
[ 1987.360988] ? asm_exc_invalid_op+0x1f/0x30
[ 1987.365734] ? usercopy_abort+0x72/0x90
[ 1987.370088] __check_heap_object+0xb7/0xd0
[ 1987.374739] __check_object_size+0x175/0x2d0
[ 1987.379588] idxd_copy_cr+0xa9/0x130 [idxd]
[ 1987.384341] idxd_evl_fault_work+0x127/0x390 [idxd]
[ 1987.389878] process_one_work+0x13e/0x300
[ 1987.394435] ? __pfx_worker_thread+0x10/0x10
[ 1987.399284] worker_thread+0x2f7/0x420
[ 1987.403544] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 1987.409171] ? __pfx_worker_thread+0x10/0x10
[ 1987.414019] kthread+0x107/0x140
[ 1987.417693] ? __pfx_kthread+0x10/0x10
[ 1987.421954] ret_from_fork+0x3d/0x60
[ 1987.426019] ? __pfx_kthread+0x10/0x10
[ 1987.430281] ret_from_fork_asm+0x1b/0x30
[ 1987.434744] </TASK>
The issue arises because event log cache is created using
kmem_cache_create() which is not suitable for user copy.
Fix the issue by creating event log cache with
kmem_cache_create_usercopy(), ensuring safe user copy.
Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items")
Reported-by: Tony Zhu <[email protected]>
Tested-by: Tony Zhu <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Reviewed-by: Lijun Pan <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
The config fragment doesn't follow the correct format to enable those
config options which make the config options getting missed while
merging with other configs.
➜ merge_config.sh -m .config tools/testing/selftests/iommu/config
Using .config as base
Merging tools/testing/selftests/iommu/config
➜ make olddefconfig
.config:5295:warning: unexpected data: CONFIG_IOMMUFD
.config:5296:warning: unexpected data: CONFIG_IOMMUFD_TEST
While at it, add CONFIG_FAULT_INJECTION as well which is needed for
CONFIG_IOMMUFD_TEST. If CONFIG_FAULT_INJECTION isn't present in base
config (such as x86 defconfig), CONFIG_IOMMUFD_TEST doesn't get enabled.
Fixes: 57f0988706fe ("iommufd: Add a selftest")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
During syncobj_eventfd_entry_func, dma_fence_chain_find_seqno may set
the fence to NULL if the given seqno is signaled and a later seqno has
already been submitted. In that case, the eventfd should be signaled
immediately which currently does not happen.
This is a similar issue to the one addressed by commit b19926d4f3a6
("drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence.").
As a fix, if the return value of dma_fence_chain_find_seqno indicates
success but it sets the fence to NULL, we will assign a stub fence to
ensure the following code still signals the eventfd.
v1 -> v2: assign a stub fence instead of signaling the eventfd
Signed-off-by: Erik Kurzinger <[email protected]>
Fixes: c7a472297169 ("drm/syncobj: add IOCTL to register an eventfd")
Signed-off-by: Simon Ser <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
If the SMMU is configured to use a two level CD table then
arm_smmu_write_ctx_desc() allocates a CD table leaf internally using
GFP_KERNEL. Due to recent changes this is being done under a spinlock to
iterate over the device list - thus it will trigger a sleeping while
atomic warning:
arm_smmu_sva_set_dev_pasid()
mutex_lock(&sva_lock);
__arm_smmu_sva_bind()
arm_smmu_mmu_notifier_get()
spin_lock_irqsave()
arm_smmu_write_ctx_desc()
arm_smmu_get_cd_ptr()
arm_smmu_alloc_cd_leaf_table()
dmam_alloc_coherent(GFP_KERNEL)
This is a 64K high order allocation and really should not be done
atomically.
At the moment the rework of the SVA to follow the new API is half
finished. Recently the CD table memory was moved from the domain to the
master, however we have the confusing situation where the SVA code is
wrongly using the RID domains device's list to track which CD tables the
SVA is installed in.
Remove the logic to replicate the CD across all the domain's masters
during attach. We know which master and which CD table the PASID should be
installed in.
Right now SVA only works when dma-iommu.c is in control of the RID
translation, which means we have a single iommu_domain shared across the
entire group and that iommu_domain is not shared outside the group.
Critically this means that the iommu_group->devices list and RID's
smmu_domain->devices list describe the same set of masters.
For PCI cases the core code also insists on singleton groups so there is
only one entry in the smmu_domain->devices list that is equal to the
master being passed in to arm_smmu_sva_set_dev_pasid().
Only non-PCI cases may have multi-device groups. However, the core code
will repeat the calls to arm_smmu_sva_set_dev_pasid() across the entire
iommu_group->devices list.
Instead of having arm_smmu_mmu_notifier_get() indirectly loop over all the
devices in the group via the RID's smmu_domain, rely on
__arm_smmu_sva_bind() to be called for each device in the group and
install the repeated CD entry that way.
This avoids taking the spinlock to access the devices list and permits the
arm_smmu_write_ctx_desc() to use a sleeping allocation. Leave the
arm_smmu_mm_release() as a confusing situation, this requires tracking
attached masters inside the SVA domain.
Removing the loop allows arm_smmu_write_ctx_desc() to be called outside
the spinlock and thus is safe to use GFP_KERNEL.
Move the clearing of the CD into arm_smmu_sva_remove_dev_pasid() so that
arm_smmu_mmu_notifier_get/put() remain paired functions.
Fixes: 24503148c545 ("iommu/arm-smmu-v3: Refactor write_ctx_desc")
Reported-by: Dan Carpenter <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Michael Shavit <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Each SPI controller is expected to call the spi_controller_suspend() and
spi_controller_resume() callbacks at system-wide suspend and resume.
It (1) handles the kthread worker for queued controllers and (2) marks
the controller as suspended to have spi_sync() fail while the
controller is unavailable.
Those two operations do not require the controller to be active, we do
not need to increment the runtime PM usage counter.
Signed-off-by: Théo Lebrun <[email protected]>
Link: https://msgid.link/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
Follow kernel naming convention with regards to power-management
callback function names.
The convention in the kernel is:
- prefix_suspend means the system-wide suspend callback;
- prefix_runtime_suspend means the runtime PM suspend callback.
The same applies to resume callbacks.
Signed-off-by: Théo Lebrun <[email protected]>
Reviewed-by: Dhruva Gole <[email protected]>
Link: https://msgid.link/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
The ->runtime_suspend() and ->runtime_resume() callbacks are not
expected to call spi_controller_suspend() and spi_controller_resume().
Remove calls to those in the cadence-qspi driver.
Those helpers have two roles currently:
- They stop/start the queue, including dealing with the kworker.
- They toggle the SPI controller SPI_CONTROLLER_SUSPENDED flag. It
requires acquiring ctlr->bus_lock_mutex.
Step one is irrelevant because cadence-qspi is not queued. Step two
however has two implications:
- A deadlock occurs, because ->runtime_resume() is called in a context
where the lock is already taken (in the ->exec_op() callback, where
the usage count is incremented).
- It would disallow all operations once the device is auto-suspended.
Here is a brief call tree highlighting the mutex deadlock:
spi_mem_exec_op()
...
spi_mem_access_start()
mutex_lock(&ctlr->bus_lock_mutex)
cqspi_exec_mem_op()
pm_runtime_resume_and_get()
cqspi_resume()
spi_controller_resume()
mutex_lock(&ctlr->bus_lock_mutex)
...
spi_mem_access_end()
mutex_unlock(&ctlr->bus_lock_mutex)
...
Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
Signed-off-by: Théo Lebrun <[email protected]>
Link: https://msgid.link/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
dev_get_drvdata() gets used to acquire the pointer to cqspi and the SPI
controller. Neither embed the other; this lead to memory corruption.
On a given platform (Mobileye EyeQ5) the memory corruption is hidden
inside cqspi->f_pdata. Also, this uninitialised memory is used as a
mutex (ctlr->bus_lock_mutex) by spi_controller_suspend().
Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations")
Reviewed-by: Dhruva Gole <[email protected]>
Signed-off-by: Théo Lebrun <[email protected]>
Link: https://msgid.link/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
When waiting for a syncobj timeline point whose fence has not yet been
submitted with the WAIT_FOR_SUBMIT flag, a callback is registered using
drm_syncobj_fence_add_wait and the thread is put to sleep until the
timeout expires. If the fence is submitted before then,
drm_syncobj_add_point will wake up the sleeping thread immediately which
will proceed to wait for the fence to be signaled.
However, if the WAIT_AVAILABLE flag is used instead,
drm_syncobj_fence_add_wait won't get called, meaning the waiting thread
will always sleep for the full timeout duration, even if the fence gets
submitted earlier. If it turns out that the fence *has* been submitted
by the time it eventually wakes up, it will still indicate to userspace
that the wait completed successfully (it won't return -ETIME), but it
will have taken much longer than it should have.
To fix this, we must call drm_syncobj_fence_add_wait if *either* the
WAIT_FOR_SUBMIT flag or the WAIT_AVAILABLE flag is set. The only
difference being that with WAIT_FOR_SUBMIT we will also wait for the
fence to be signaled after it has been submitted while with
WAIT_AVAILABLE we will return immediately.
IGT test patch: https://lists.freedesktop.org/archives/igt-dev/2024-January/067537.html
v1 -> v2: adjust lockdep_assert_none_held_once condition
(cherry picked from commit 8c44ea81634a4a337df70a32621a5f3791be23df)
Fixes: 01d6c3578379 ("drm/syncobj: add support for timeline point wait v8")
Signed-off-by: Erik Kurzinger <[email protected]>
Signed-off-by: Simon Ser <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
At btrfs_use_block_rsv() we read the size of a block reserve without
locking its spinlock, which makes KCSAN complain because the size of a
block reserve is always updated while holding its spinlock. The report
from KCSAN is the following:
[653.313148] BUG: KCSAN: data-race in btrfs_update_delayed_refs_rsv [btrfs] / btrfs_use_block_rsv [btrfs]
[653.314755] read to 0x000000017f5871b8 of 8 bytes by task 7519 on cpu 0:
[653.314779] btrfs_use_block_rsv+0xe4/0x2f8 [btrfs]
[653.315606] btrfs_alloc_tree_block+0xdc/0x998 [btrfs]
[653.316421] btrfs_force_cow_block+0x220/0xe38 [btrfs]
[653.317242] btrfs_cow_block+0x1ac/0x568 [btrfs]
[653.318060] btrfs_search_slot+0xda2/0x19b8 [btrfs]
[653.318879] btrfs_del_csums+0x1dc/0x798 [btrfs]
[653.319702] __btrfs_free_extent.isra.0+0xc24/0x2028 [btrfs]
[653.320538] __btrfs_run_delayed_refs+0xd3c/0x2390 [btrfs]
[653.321340] btrfs_run_delayed_refs+0xae/0x290 [btrfs]
[653.322140] flush_space+0x5e4/0x718 [btrfs]
[653.322958] btrfs_preempt_reclaim_metadata_space+0x102/0x2f8 [btrfs]
[653.323781] process_one_work+0x3b6/0x838
[653.323800] worker_thread+0x75e/0xb10
[653.323817] kthread+0x21a/0x230
[653.323836] __ret_from_fork+0x6c/0xb8
[653.323855] ret_from_fork+0xa/0x30
[653.323887] write to 0x000000017f5871b8 of 8 bytes by task 576 on cpu 3:
[653.323906] btrfs_update_delayed_refs_rsv+0x1a4/0x250 [btrfs]
[653.324699] btrfs_add_delayed_data_ref+0x468/0x6d8 [btrfs]
[653.325494] btrfs_free_extent+0x76/0x120 [btrfs]
[653.326280] __btrfs_mod_ref+0x6a8/0x6b8 [btrfs]
[653.327064] btrfs_dec_ref+0x50/0x70 [btrfs]
[653.327849] walk_up_proc+0x236/0xa50 [btrfs]
[653.328633] walk_up_tree+0x21c/0x448 [btrfs]
[653.329418] btrfs_drop_snapshot+0x802/0x1328 [btrfs]
[653.330205] btrfs_clean_one_deleted_snapshot+0x184/0x238 [btrfs]
[653.330995] cleaner_kthread+0x2b0/0x2f0 [btrfs]
[653.331781] kthread+0x21a/0x230
[653.331800] __ret_from_fork+0x6c/0xb8
[653.331818] ret_from_fork+0xa/0x30
So add a helper to get the size of a block reserve while holding the lock.
Reading the field while holding the lock instead of using the data_race()
annotation is used in order to prevent load tearing.
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
At space_info.c we have several places where we access the ->reserved
field of a block reserve without taking the block reserve's spinlock
first, which makes KCSAN warn about a data race since that field is
always updated while holding the spinlock.
The reports from KCSAN are like the following:
[117.193526] BUG: KCSAN: data-race in btrfs_block_rsv_release [btrfs] / need_preemptive_reclaim [btrfs]
[117.195148] read to 0x000000017f587190 of 8 bytes by task 6303 on cpu 3:
[117.195172] need_preemptive_reclaim+0x222/0x2f0 [btrfs]
[117.195992] __reserve_bytes+0xbb0/0xdc8 [btrfs]
[117.196807] btrfs_reserve_metadata_bytes+0x4c/0x120 [btrfs]
[117.197620] btrfs_block_rsv_add+0x78/0xa8 [btrfs]
[117.198434] btrfs_delayed_update_inode+0x154/0x368 [btrfs]
[117.199300] btrfs_update_inode+0x108/0x1c8 [btrfs]
[117.200122] btrfs_dirty_inode+0xb4/0x140 [btrfs]
[117.200937] btrfs_update_time+0x8c/0xb0 [btrfs]
[117.201754] touch_atime+0x16c/0x1e0
[117.201789] filemap_read+0x674/0x728
[117.201823] btrfs_file_read_iter+0xf8/0x410 [btrfs]
[117.202653] vfs_read+0x2b6/0x498
[117.203454] ksys_read+0xa2/0x150
[117.203473] __s390x_sys_read+0x68/0x88
[117.203495] do_syscall+0x1c6/0x210
[117.203517] __do_syscall+0xc8/0xf0
[117.203539] system_call+0x70/0x98
[117.203579] write to 0x000000017f587190 of 8 bytes by task 11 on cpu 0:
[117.203604] btrfs_block_rsv_release+0x2e8/0x578 [btrfs]
[117.204432] btrfs_delayed_inode_release_metadata+0x7c/0x1d0 [btrfs]
[117.205259] __btrfs_update_delayed_inode+0x37c/0x5e0 [btrfs]
[117.206093] btrfs_async_run_delayed_root+0x356/0x498 [btrfs]
[117.206917] btrfs_work_helper+0x160/0x7a0 [btrfs]
[117.207738] process_one_work+0x3b6/0x838
[117.207768] worker_thread+0x75e/0xb10
[117.207797] kthread+0x21a/0x230
[117.207830] __ret_from_fork+0x6c/0xb8
[117.207861] ret_from_fork+0xa/0x30
So add a helper to get the reserved amount of a block reserve while
holding the lock. The value may be not be up to date anymore when used by
need_preemptive_reclaim() and btrfs_preempt_reclaim_metadata_space(), but
that's ok since the worst it can do is cause more reclaim work do be done
sooner rather than later. Reading the field while holding the lock instead
of using the data_race() annotation is used in order to prevent load
tearing.
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
If we have a sparse file with a trailing hole (from the last extent's end
to i_size) and then create an extent in the file that ends before the
file's i_size, then when doing an incremental send we will issue a write
full of zeroes for the range that starts immediately after the new extent
ends up to i_size. While this isn't incorrect because the file ends up
with exactly the same data, it unnecessarily results in using extra space
at the destination with one or more extents full of zeroes instead of
having a hole. In same cases this results in using megabytes or even
gigabytes of unnecessary space.
Example, reproducer:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdh
MNT=/mnt/sdh
mkfs.btrfs -f $DEV
mount $DEV $MNT
# Create 1G sparse file.
xfs_io -f -c "truncate 1G" $MNT/foobar
# Create base snapshot.
btrfs subvolume snapshot -r $MNT $MNT/mysnap1
# Create send stream (full send) for the base snapshot.
btrfs send -f /tmp/1.snap $MNT/mysnap1
# Now write one extent at the beginning of the file and one somewhere
# in the middle, leaving a gap between the end of this second extent
# and the file's size.
xfs_io -c "pwrite -S 0xab 0 128K" \
-c "pwrite -S 0xcd 512M 128K" \
$MNT/foobar
# Now create a second snapshot which is going to be used for an
# incremental send operation.
btrfs subvolume snapshot -r $MNT $MNT/mysnap2
# Create send stream (incremental send) for the second snapshot.
btrfs send -p $MNT/mysnap1 -f /tmp/2.snap $MNT/mysnap2
# Now recreate the filesystem by receiving both send streams and
# verify we get the same content that the original filesystem had
# and file foobar has only two extents with a size of 128K each.
umount $MNT
mkfs.btrfs -f $DEV
mount $DEV $MNT
btrfs receive -f /tmp/1.snap $MNT
btrfs receive -f /tmp/2.snap $MNT
echo -e "\nFile fiemap in the second snapshot:"
# Should have:
#
# 128K extent at file range [0, 128K[
# hole at file range [128K, 512M[
# 128K extent file range [512M, 512M + 128K[
# hole at file range [512M + 128K, 1G[
xfs_io -r -c "fiemap -v" $MNT/mysnap2/foobar
# File should be using 256K of data (two 128K extents).
echo -e "\nSpace used by the file: $(du -h $MNT/mysnap2/foobar | cut -f 1)"
umount $MNT
Running the test, we can see with fiemap that we get an extent for the
range [512M, 1G[, while in the source filesystem we have an extent for
the range [512M, 512M + 128K[ and a hole for the rest of the file (the
range [512M + 128K, 1G[):
$ ./test.sh
(...)
File fiemap in the second snapshot:
/mnt/sdh/mysnap2/foobar:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..255]: 26624..26879 256 0x0
1: [256..1048575]: hole 1048320
2: [1048576..2097151]: 2156544..3205119 1048576 0x1
Space used by the file: 513M
This happens because once we finish processing an inode, at
finish_inode_if_needed(), we always issue a hole (write operations full
of zeros) if there's a gap between the end of the last processed extent
and the file's size, even if that range is already a hole in the parent
snapshot. Fix this by issuing the hole only if the range is not already
a hole.
After this change, running the test above, we get the expected layout:
$ ./test.sh
(...)
File fiemap in the second snapshot:
/mnt/sdh/mysnap2/foobar:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..255]: 26624..26879 256 0x0
1: [256..1048575]: hole 1048320
2: [1048576..1048831]: 26880..27135 256 0x1
3: [1048832..2097151]: hole 1048320
Space used by the file: 256K
A test case for fstests will follow soon.
CC: [email protected] # 6.1+
Reported-by: Dorai Ashok S A <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
Reviewed-by: Sweet Tea Dorminy <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
There's a syzbot report that device name buffers passed to device
replace are not properly checked for string termination which could lead
to a read out of bounds in getname_kernel().
Add a helper that validates both source and target device name buffers.
For devid as the source initialize the buffer to empty string in case
something tries to read it later.
This was originally analyzed and fixed in a different way by Edward Adam
Davis (see links).
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
CC: [email protected] # 4.19+
CC: Edward Adam Davis <[email protected]>
Reported-and-tested-by: [email protected]
Reviewed-by: Boris Burkov <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
On a zoned filesystem with conventional zones, we're skipping the block
group profile checks for the conventional zones.
This allows converting a zoned filesystem's data block groups to RAID when
all of the zones backing the chunk are on conventional zones. But this
will lead to problems, once we're trying to allocate chunks backed by
sequential zones.
So also check for conventional zones when loading a block group's profile
on them.
Reported-by: HAN Yuwei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/#t
Reviewed-by: Boris Burkov <[email protected]>
Reviewed-by: Naohiro Aota <[email protected]>
Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
If caching mode change fails due to, for example, OOM we
free the allocated pages in a two-step process. First the pages
for which the caching change has already succeeded. Secondly
the pages for which a caching change did not succeed.
However the second step was incorrectly freeing the pages already
freed in the first step.
Fix.
Signed-off-by: Thomas Hellström <[email protected]>
Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path")
Cc: Christian König <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Christian Koenig <[email protected]>
Cc: Huang Rui <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # v6.4+
Reviewed-by: Matthew Auld <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
make dtbs_check W=2:
arch/arm/boot/dts/renesas/r8a7790-lager.dts:444.11-458.5: Warning (interrupt_provider): /i2c-mux4/pmic@58: Missing '#interrupt-cells' in interrupt provider
...
Fix this by adding the missing #interrupt-cells properties.
Reported-by: Rob Herring <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/a351e503ea97fb1af68395843f513925ff1bdf26.1707922460.git.geert+renesas@glider.be
|
|
l2tp_ip6_sendmsg needs to avoid accounting for the transport header
twice when splicing more data into an already partially-occupied skbuff.
To manage this, we check whether the skbuff contains data using
skb_queue_empty when deciding how much data to append using
ip6_append_data.
However, the code which performed the calculation was incorrect:
ulen = len + skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0;
...due to C operator precedence, this ends up setting ulen to
transhdrlen for messages with a non-zero length, which results in
corrupted packets on the wire.
Add parentheses to correct the calculation in line with the original
intent.
Fixes: 9d4c75800f61 ("ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()")
Cc: David Howells <[email protected]>
Cc: [email protected]
Signed-off-by: Tom Parkin <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Commit eb9299beadbd ("ACPI: EC: Use a spin lock without disabing
interrupts") introduced an unexpected user-visible change in
behavior, which is a significant CPU load increase when the EC
is in use.
This most likely happens due to increased spinlock contention
and so reducing this effect would require a major rework of the
EC driver locking. There is no time for this in the current
cycle, so revert commit eb9299beadbd.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218511
Reported-by: Dieter Mummenschanz <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) If user requests to wake up a table and hook fails, restore the
dormant flag from the error path, from Florian Westphal.
2) Reset dst after transferring it to the flow object, otherwise dst
gets released twice from the error path.
3) Release dst in case the flowtable selects a direct xmit path, eg.
transmission to bridge port. Otherwise, dst is memleaked.
4) Register basechain and flowtable hooks at the end of the command.
Error path releases these datastructure without waiting for the
rcu grace period.
5) Use kzalloc() to initialize struct nft_hook to fix a KMSAN report
on access to hook type, also from Florian Westphal.
netfilter pull request 24-02-22
* tag 'nf-24-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: use kzalloc for hook allocation
netfilter: nf_tables: register hooks last when adding new chain/flowtable
netfilter: nft_flow_offload: release dst in case direct xmit path is used
netfilter: nft_flow_offload: reset dst in route object after setting up flow
netfilter: nf_tables: set dormant flag on hook register failure
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-02-22
The following pull-request contains BPF updates for your *net* tree.
We've added 11 non-merge commits during the last 24 day(s) which contain
a total of 15 files changed, 217 insertions(+), 17 deletions(-).
The main changes are:
1) Fix a syzkaller-triggered oops when attempting to read the vsyscall
page through bpf_probe_read_kernel and friends, from Hou Tao.
2) Fix a kernel panic due to uninitialized iter position pointer in
bpf_iter_task, from Yafang Shao.
3) Fix a race between bpf_timer_cancel_and_free and bpf_timer_cancel,
from Martin KaFai Lau.
4) Fix a xsk warning in skb_add_rx_frag() (under CONFIG_DEBUG_NET)
due to incorrect truesize accounting, from Sebastian Andrzej Siewior.
5) Fix a NULL pointer dereference in sk_psock_verdict_data_ready,
from Shigeru Yoshida.
6) Fix a resolve_btfids warning when bpf_cpumask symbol cannot be
resolved, from Hari Bathini.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()
selftests/bpf: Add negtive test cases for task iter
bpf: Fix an issue due to uninitialized bpf_iter_task
selftests/bpf: Test racing between bpf_timer_cancel_and_free and bpf_timer_cancel
bpf: Fix racing between bpf_timer_cancel_and_free and bpf_timer_cancel
selftest/bpf: Test the read of vsyscall page under x86-64
x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h
bpf, scripts: Correct GPL license name
xsk: Add truesize to skb_add_rx_frag().
bpf: Fix warning for bpf_cpumask in verifier
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Commit bb726b753f75 ("net: phy: realtek: add support for
RTL8211F(D)(I)-VD-CG") extended support of the driver from the existing
support for RTL8211F(D)(I)-CG PHY to the newer RTL8211F(D)(I)-VD-CG PHY.
While that commit indicated that the RTL8211F_PHYCR2 register is not
supported by the "VD-CG" PHY model and therefore updated the corresponding
section in rtl8211f_config_init() to be invoked conditionally, the call to
"genphy_soft_reset()" was left as-is, when it should have also been invoked
conditionally. This is because the call to "genphy_soft_reset()" was first
introduced by the commit 0a4355c2b7f8 ("net: phy: realtek: add dt property
to disable CLKOUT clock") since the RTL8211F guide indicates that a PHY
reset should be issued after setting bits in the PHYCR2 register.
As the PHYCR2 register is not applicable to the "VD-CG" PHY model, fix the
rtl8211f_config_init() function by invoking "genphy_soft_reset()"
conditionally based on the presence of the "PHYCR2" register.
Fixes: bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG")
Signed-off-by: Siddharth Vadapalli <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Justin Iurman says:
====================
ioam6: fix write to cloned skb's
Make sure the IOAM data insertion is not applied on cloned skb's. As a
consequence, ioam selftests needed a refactoring.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
ioam6_parser uses a packet socket. After the fix to prevent writing to
cloned skb's, the receiver does not see its IOAM data anymore, which
makes input/forward ioam-selftests to fail. As a workaround,
ioam6_parser now uses an IPv6 raw socket and leverages ancillary data to
get hop-by-hop options. As a consequence, the hook is "after" the IOAM
data insertion by the receiver and all tests are working again.
Signed-off-by: Justin Iurman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
ioam6_fill_trace_data() writes inside the skb payload without ensuring
it's writeable (e.g., not cloned). This function is called both from the
input and output path. The output path (ioam6_iptunnel) already does the
check. This commit provides a fix for the input path, inside
ipv6_hop_ioam(). It also updates ip6_parse_tlv() to refresh the network
header pointer ("nh") when returning from ipv6_hop_ioam().
Fixes: 9ee11f0fff20 ("ipv6: ioam: Data plane support for Pre-allocated Trace")
Reported-by: Paolo Abeni <[email protected]>
Signed-off-by: Justin Iurman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The receive queues are protected by their respective spin-lock, not
the socket lock. This could lead to skb_peek() unexpectedly
returning NULL or a pointer to an already dequeued socket buffer.
Fixes: 9641458d3ec4 ("Phonet: Pipe End Point for Phonet Pipes protocol")
Signed-off-by: Rémi Denis-Courmont <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The receive queue is protected by its embedded spin-lock, not the
socket lock, so we need the former lock here (and only that one).
Fixes: 107d0d9b8d9a ("Phonet: Phonet datagram transport protocol")
Reported-by: Luosili <[email protected]>
Signed-off-by: Rémi Denis-Courmont <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
In erofs_find_target_block() when erofs_dirnamecmp() returns 0,
we do not assign the target metabuf. This causes the caller
erofs_namei()'s erofs_put_metabuf() at the end to be not effective
leaving the refcount on the page.
As the page from metabuf (buf->page) is never put, such page cannot be
migrated or reclaimed. Fix it now by putting the metabuf from
previous loop and assigning the current metabuf to target before
returning so caller erofs_namei() can do the final put as it was
intended.
Fixes: 500edd095648 ("erofs: use meta buffers for inode lookup")
Cc: <[email protected]> # 5.18+
Signed-off-by: Sandeep Dhavale <[email protected]>
Reviewed-by: Gao Xiang <[email protected]>
Reviewed-by: Jingbo Xu <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Gao Xiang <[email protected]>
|
|
Both registers used when doing manual injection or fdma injection are
shared between all the net devices of the switch. It was noticed that
when having two process which each of them trying to inject frames on
different ethernet ports, that the HW started to behave strange, by
sending out more frames then expected. When doing fdma injection it is
required to set the frame in the DCB and then make sure that the next
pointer of the last DCB is invalid. But because there is no locks for
this, then easily this pointer between the DCB can be broken and then it
would create a loop of DCBs. And that means that the HW will
continuously transmit these frames in a loop. Until the SW will break
this loop.
Therefore to fix this issue, add a spin lock for when accessing the
registers for manual or fdma injection.
Signed-off-by: Horatiu Vultur <[email protected]>
Reviewed-by: Daniel Machon <[email protected]>
Fixes: f3cad2611a77 ("net: sparx5: add hostmode with phylink support")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
As IDR can't protect itself from the concurrent modification, place
idr_remove() under the protection of tp->lock.
Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier")
Signed-off-by: Jianbo Liu <[email protected]>
Reviewed-by: Cosmin Ratiu <[email protected]>
Reviewed-by: Gal Pressman <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Unlike other commands, due to a c&p error, port dump fills-up cmd with
wrong value, different from port-get request cmd, port-get doit reply
and port notification.
Fix it by filling cmd with value DEVLINK_CMD_PORT_NEW.
Skimmed through devlink userspace implementations, none of them cares
about this cmd value. Only ynl, for which, this is actually a fix, as it
expects doit and dumpit ops rsp_value to be the same.
Omit the fixes tag, even thought this is fix, better to target this for
next release.
Fixes: bfcd3a466172 ("Introduce devlink infrastructure")
Signed-off-by: Jiri Pirko <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix EST offset for dwmac 5.10.
Currently configuring Qbv doesn't work as expected. The schedule is
configured, but never confirmed:
|[ 128.250219] imx-dwmac 428a0000.ethernet eth1: configured EST
The reason seems to be the refactoring of the EST code which set the wrong
EST offset for the dwmac 5.10. After fixing this it works as before:
|[ 106.359577] imx-dwmac 428a0000.ethernet eth1: configured EST
|[ 128.430715] imx-dwmac 428a0000.ethernet eth1: EST: SWOL has been switched
Tested on imx93.
Fixes: c3f3b97238f6 ("net: stmmac: Refactor EST implementation")
Signed-off-by: Kurt Kanzenbach <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Jakub Kicinski says:
====================
tools: ynl: fix impossible errors
Fix bugs discovered while I was hacking in low level stuff in YNL
and kept breaking the socket, exercising the "impossible" error paths.
v1: https://lore.kernel.org/all/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Make sure to free the already-parsed mcast_groups if
we don't get an ack from the kernel when reading family info.
This is part of the ynl_sock_create() error path, so we won't
get a call to ynl_sock_destroy() to free them later.
Fixes: 86878f14d71a ("tools: ynl: user space helpers")
Acked-by: Nicolas Dichtel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There is one common error handler in ynl - ynl_cb_error().
It expects priv to be a pointer to struct ynl_parse_arg AKA yarg.
To avoid potential crashes if we encounter a stray NLMSG_ERROR
always pass yarg as priv (or a struct which has it as the first
member).
ynl_cb_null() has a similar problem directly - it expects yarg
but priv passed by the caller is ys.
Found by code inspection.
Fixes: 86878f14d71a ("tools: ynl: user space helpers")
Acked-by: Nicolas Dichtel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We may hold an extra reference on a socket if a tag allocation fails: we
optimistically allocate the sk_key, and take a ref there, but do not
drop if we end up not using the allocated key.
Ensure we're dropping the sock on this failure by doing a proper unref
rather than directly kfree()ing.
Fixes: de8a6b15d965 ("net: mctp: add an explicit reference from a mctp_sk_key to sock")
Signed-off-by: Jeremy Kerr <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ce9b61e44d1cdae7797be0c5e3141baf582d23a0.1707983487.git.jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
KMSAN reports unitialized variable when registering the hook,
reg->hook_ops_type == NF_HOOK_OP_BPF)
~~~~~~~~~~~ undefined
This is a small structure, just use kzalloc to make sure this
won't happen again when new fields get added to nf_hook_ops.
Fixes: 7b4b2fa37587 ("netfilter: annotate nf_tables base hook ops")
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Register hooks last when adding chain/flowtable to ensure that packets do
not walk over datastructure that is being released in the error path
without waiting for the rcu grace period.
Fixes: 91c7b38dc9f0 ("netfilter: nf_tables: use new transaction infrastructure to handle chain")
Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Direct xmit does not use it since it calls dev_queue_xmit() to send
packets, hence it calls dst_release().
kmemleak reports:
unreferenced object 0xffff88814f440900 (size 184):
comm "softirq", pid 0, jiffies 4294951896
hex dump (first 32 bytes):
00 60 5b 04 81 88 ff ff 00 e6 e8 82 ff ff ff ff .`[.............
21 0b 50 82 ff ff ff ff 00 00 00 00 00 00 00 00 !.P.............
backtrace (crc cb2bf5d6):
[<000000003ee17107>] kmem_cache_alloc+0x286/0x340
[<0000000021a5de2c>] dst_alloc+0x43/0xb0
[<00000000f0671159>] rt_dst_alloc+0x2e/0x190
[<00000000fe5092c9>] __mkroute_output+0x244/0x980
[<000000005fb96fb0>] ip_route_output_flow+0xc0/0x160
[<0000000045367433>] nf_ip_route+0xf/0x30
[<0000000085da1d8e>] nf_route+0x2d/0x60
[<00000000d1ecd1cb>] nft_flow_route+0x171/0x6a0 [nft_flow_offload]
[<00000000d9b2fb60>] nft_flow_offload_eval+0x4e8/0x700 [nft_flow_offload]
[<000000009f447dbb>] expr_call_ops_eval+0x53/0x330 [nf_tables]
[<00000000072e1be6>] nft_do_chain+0x17c/0x840 [nf_tables]
[<00000000d0551029>] nft_do_chain_inet+0xa1/0x210 [nf_tables]
[<0000000097c9d5c6>] nf_hook_slow+0x5b/0x160
[<0000000005eccab1>] ip_forward+0x8b6/0x9b0
[<00000000553a269b>] ip_rcv+0x221/0x230
[<00000000412872e5>] __netif_receive_skb_one_core+0xfe/0x110
Fixes: fa502c865666 ("netfilter: flowtable: simplify route logic")
Reported-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|