Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull io_uring fixes from Jens Axboe:
- Fix an issue with discontig page checking for IORING_SETUP_NO_MMAP
- Fix an issue with not allowing IORING_SETUP_NO_MMAP also disallowing
mmap'ed buffer rings
- Fix an issue with deferred release of memory mapped pages
- Fix a lockdep issue with IORING_SETUP_NO_MMAP
- Use fget/fput consistently, even from our sync system calls. No real
issue here, but if we were ever to allow closing io_uring descriptors
it would be required. Let's play it safe and just use the full ref
counted versions upfront. Most uses of io_uring are threaded anyway,
and hence already doing the full version underneath.
* tag 'io_uring-6.7-2023-11-30' of git://git.kernel.dk/linux:
io_uring: use fget/fput consistently
io_uring: free io_buffer_list entries via RCU
io_uring/kbuf: prune deferred locked cache when tearing down
io_uring/kbuf: recycle freed mapped buffer ring entries
io_uring/kbuf: defer release of mapped buffer rings
io_uring: enable io_mem_alloc/free to be used in other parts
io_uring: don't guard IORING_OFF_PBUF_RING with SETUP_NO_MMAP
io_uring: don't allow discontig pages for IORING_SETUP_NO_MMAP
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Invalid namespace identification error handling (Marizio Ewan,
Keith)
- Fabrics keep-alive tuning (Mark)
- Fix for a bad error check regression in bcache (Markus)
- Fix for a performance regression with O_DIRECT (Ming)
- Fix for a flush related deadlock (Ming)
- Make the read-only warn on per-partition (Yu)
* tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linux:
nvme-core: check for too small lba shift
blk-mq: don't count completed flush data request as inflight in case of quiesce
block: Document the role of the two attribute groups
block: warn once for each partition in bio_check_ro()
block: move .bd_inode into 1st cacheline of block_device
nvme: check for valid nvme_identify_ns() before using it
nvme-core: fix a memory leak in nvme_ns_info_from_identify()
nvme: fine-tune sending of first keep-alive
bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM verity target's FEC support to always initialize IO before it
frees it. Also fix alignment of struct dm_verity_fec_io within the
per-bio-data
- Fix DM verity target to not FEC failed readahead IO
- Update DM flakey target to use MAX_ORDER rather than MAX_ORDER - 1
* tag 'dm-6.7/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm-flakey: start allocating with MAX_ORDER
dm-verity: align struct dm_verity_fec_io properly
dm verity: don't perform FEC for failed readahead IO
dm verity: initialize fec io before freeing it
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small fixes, one in drivers.
The core changes are to the internal representation of flags in
scsi_devices which removes space wasting bools in favour of single bit
flags and to add a flag to force a runtime resume which is used by ATA
devices"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Fix system start for ATA devices
scsi: Change SCSI device boolean fields to single bit flags
scsi: ufs: core: Clear cmd if abort succeeds in MCQ mode
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2 fix from Jan Kara:
"Fix an ext2 bug introduced by changes in ext2 & iomap stepping on each
other toes (apparently ext2 driver does not get much testing in
linux-next)"
* tag 'fs_for_v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: Fix ki_pos update for DIO buffered-io fallback case
|
|
Pull more bcachefs bugfixes from Kent Overstreet:
- bcache & bcachefs were broken with CFI enabled; patch for closures to
fix type punning
- mark erasure coding as extra-experimental; there are incompatible
disk space accounting changes coming for erasure coding, and I'm
still seeing checksum errors in some tests
- several fixes for durability-related issues (durability is a device
specific setting where we can tell bcachefs that data on a given
device should be counted as replicated x times)
- a fix for a rare livelock when a btree node merge then updates a
parent node that is almost full
- fix a race in the device removal path, where dropping a pointer in a
btree node to a device would be clobbered by an in flight btree write
updating the btree node key on completion
- fix one SRCU lock hold time warning in the btree gc code - ther's
still a bunch more of these to fix
- fix a rare race where we'd start copygc before initializing the "are
we rw" percpu refcount; copygc would think we were already ro and die
immediately
* tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs: (23 commits)
bcachefs: Extra kthread_should_stop() calls for copygc
bcachefs: Convert gc_alloc_start() to for_each_btree_key2()
bcachefs: Fix race between btree writes and metadata drop
bcachefs: move journal seq assertion
bcachefs: -EROFS doesn't count as move_extent_start_fail
bcachefs: trace_move_extent_start_fail() now includes errcode
bcachefs: Fix split_race livelock
bcachefs: Fix bucket data type for stripe buckets
bcachefs: Add missing validation for jset_entry_data_usage
bcachefs: Fix zstd compress workspace size
bcachefs: bpos is misaligned on big endian
bcachefs: Fix ec + durability calculation
bcachefs: Data update path won't accidentaly grow replicas
bcachefs: deallocate_extra_replicas()
bcachefs: Proper refcounting for journal_keys
bcachefs: preserve device path as device name
bcachefs: Fix an endianness conversion
bcachefs: Start gc, copygc, rebalance threads after initing writes ref
bcachefs: Don't stop copygc thread on device resize
bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops
...
|
|
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.7
- Invalid namespace identification error handling (Marizio Ewan, Keith)
- Fabrics keep-alive tuning (Mark)"
* tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme:
nvme-core: check for too small lba shift
nvme: check for valid nvme_identify_ns() before using it
nvme-core: fix a memory leak in nvme_ns_info_from_identify()
nvme: fine-tune sending of first keep-alive
|
|
The block layer doesn't support logical block sizes smaller than 512
bytes. The nvme spec doesn't support that small either, but the driver
isn't checking to make sure the device responded with usable data.
Failing to catch this will result in a kernel bug, either from a
division by zero when stacking, or a zero length bio.
Reviewed-by: Jens Axboe <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
|
|
Request queue quiesce may interrupt flush sequence, and the original request
may have been marked as COMPLETE, but can't get finished because of
queue quiesce.
This way is fine from driver viewpoint, because flush sequence is block
layer concept, and it isn't related with driver.
However, driver(such as dm-rq) can call blk_mq_queue_inflight() to count &
drain inflight requests, then the wait & drain never gets done because
the completed & not-finished flush request is counted as inflight.
Fix this issue by not counting completed flush data request as inflight in
case of quiesce.
Cc: Mike Snitzer <[email protected]>
Cc: David Jeffery <[email protected]>
Cc: John Pittman <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- struct_group: propagate attributes to top-level union (Dmitry
Antipov)
- gcc-plugins: randstruct: Update code comment in relayout_struct
(Gustavo A. R. Silva)
- MAINTAINERS: refresh LLVM support (Nick Desaulniers)
* tag 'hardening-v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: randstruct: Update code comment in relayout_struct()
uapi: propagate __struct_group() attributes to the container union
MAINTAINERS: refresh LLVM support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit fixes from Shuah Khan:
"Three fixes to warnings and run-time test behavior. With these fixes,
test suite counter will be reset correctly before running tests, kunit
will warn if tests are too slow, and eliminate warning when kfree() as
an action"
* tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: test: Avoid cast warning when adding kfree() as an action
kunit: Reset suite counter right before running tests
kunit: Warn if tests are slow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"Assorted build fixes including:
- fix compile errors in printf() with u64 on 32-bit systesm
- sync kernel headers to the tool copies
- update arm64 sysreg generation for tarballs
- disable compile warnings on __packed attribute"
* tag 'perf-tools-fixes-for-v6.7-1-2023-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools: Disable __packed attribute compiler warning due to -Werror=attributes
perf build: Ensure sysreg-defs Makefile respects output dir
tools perf: Add arm64 sysreg files to MANIFEST
tools/perf: Update tools's copy of mips syscall table
tools/perf: Update tools's copy of s390 syscall table
tools/perf: Update tools's copy of powerpc syscall table
tools/perf: Update tools's copy of x86 syscall table
tools headers: Update tools's copy of s390/asm headers
tools headers: Update tools's copy of arm64/asm headers
tools headers: Update tools's copy of x86/asm headers
tools headers: Update tools's copy of socket.h header
tools headers UAPI: Update tools's copy of unistd.h header
tools headers UAPI: Update tools's copy of vhost.h header
tools headers UAPI: Update tools's copy of mount.h header
tools headers UAPI: Update tools's copy of kvm.h header
tools headers UAPI: Update tools's copy of fscrypt.h header
tools headers UAPI: Update tools's copy of drm headers
perf lock contention: Fix a build error on 32-bit
perf kwork: Fix a build error on 32-bit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and wifi.
Current release - regressions:
- neighbour: fix __randomize_layout crash in struct neighbour
- r8169: fix deadlock on RTL8125 in jumbo mtu mode
Previous releases - regressions:
- wifi:
- mac80211: fix warning at station removal time
- cfg80211: fix CQM for non-range use
- tools: ynl-gen: fix unexpected response handling
- octeontx2-af: fix possible buffer overflow
- dpaa2: recycle the RX buffer only after all processing done
- rswitch: fix missing dev_kfree_skb_any() in error path
Previous releases - always broken:
- ipv4: fix uaf issue when receiving igmp query packet
- wifi: mac80211: fix debugfs deadlock at device removal time
- bpf:
- sockmap: af_unix stream sockets need to hold ref for pair sock
- netdevsim: don't accept device bound programs
- selftests: fix a char signedness issue
- dsa: mv88e6xxx: fix marvell 6350 probe crash
- octeontx2-pf: restore TC ingress police rules when interface is up
- wangxun: fix memory leak on msix entry
- ravb: keep reverse order of operations in ravb_remove()"
* tag 'net-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
net: ravb: Keep reverse order of operations in ravb_remove()
net: ravb: Stop DMA in case of failures on ravb_open()
net: ravb: Start TX queues after HW initialization succeeded
net: ravb: Make write access to CXR35 first before accessing other EMAC registers
net: ravb: Use pm_runtime_resume_and_get()
net: ravb: Check return value of reset_control_deassert()
net: libwx: fix memory leak on msix entry
ice: Fix VF Reset paths when interface in a failed over aggregate
bpf, sockmap: Add af_unix test with both sockets in map
bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
tools: ynl-gen: always construct struct ynl_req_state
ethtool: don't propagate EOPNOTSUPP from dumps
ravb: Fix races between ravb_tx_timeout_work() and net related ops
r8169: prevent potential deadlock in rtl8169_close
r8169: fix deadlock on RTL8125 in jumbo mtu mode
neighbour: Fix __randomize_layout crash in struct neighbour
octeontx2-pf: Restore TC ingress police rules when interface is up
octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64
net: stmmac: xgmac: Disable FPE MMC interrupts
octeontx2-af: Fix possible buffer overflow
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fix from Ulf Hansson:
- Avoid polling for the scmi_perf_domain on arm
* tag 'pmdomain-v6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: arm: Avoid polling for scmi_perf_domain
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix CQE error recovery path
MMC host:
- cqhci: Fix CQE error recovery path
- sdhci-pci-gli: Fix initialization of LPM
- sdhci-sprd: Fix enabling/disabling of the vqmmc regulator"
* tag 'mmc-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled
mmc: sdhci-pci-gli: Disable LPM during initialization
mmc: cqhci: Fix task clearing in CQE error recovery
mmc: cqhci: Warn of halt or task clear failure
mmc: block: Retry commands in CQE error recovery
mmc: block: Be sure to wait while busy in CQE error recovery
mmc: cqhci: Increase recovery halt timeout
mmc: block: Do not lose cache flush during CQE error recovery
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Pull LED fix from Lee Jones:
- Remove duplicate sysfs entry 'color' from LEDs class
* tag 'leds-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds:
leds: class: Don't expose color sysfs entry
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
- Fix for EFI unaccepted memory handling
* tag 'efi-urgent-for-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/unaccepted: Fix off-by-one when checking for overlapping ranges
|
|
Claudiu Beznea says:
====================
net: ravb: Fixes for the ravb driver
This series adds some fixes for ravb driver. Patches in this series
were initilly part of series at [1].
Changes in v2:
- in description of patch 1/6 documented the addition of
out_free_netdev goto label
- collected tags
- s/out_runtime_disable/out_rpm_disable in patch 2/6
- fixed typos in description of patch 6/6
Changes since [1]:
- addressed review comments
- added patch 6/6
[1] https://lore.kernel.org/all/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
On RZ/G3S SMARC Carrier II board having RGMII connections b/w Ethernet
MACs and PHYs it has been discovered that doing unbind/bind for ravb
driver in a loop leads to wrong speed and duplex for Ethernet links and
broken connectivity (the connectivity cannot be restored even with
bringing interface down/up). Before doing unbind/bind the Ethernet
interfaces were configured though systemd. The sh instructions used to
do unbind/bind were:
$ cd /sys/bus/platform/drivers/ravb/
$ while :; do echo 11c30000.ethernet > unbind ; \
echo 11c30000.ethernet > bind; done
It has been discovered that there is a race b/w IOCTLs initialized by
systemd at the response of success binding and the
"ravb_write(ndev, CCC_OPC_RESET, CCC)" call in ravb_remove() as
follows:
1/ as a result of bind success the user space open/configures the
interfaces tough an IOCTL; the following stack trace has been
identified on RZ/G3S:
Call trace:
dump_backtrace+0x9c/0x100
show_stack+0x20/0x38
dump_stack_lvl+0x48/0x60
dump_stack+0x18/0x28
ravb_open+0x70/0xa58
__dev_open+0xf4/0x1e8
__dev_change_flags+0x198/0x218
dev_change_flags+0x2c/0x80
devinet_ioctl+0x640/0x708
inet_ioctl+0x1e4/0x200
sock_do_ioctl+0x50/0x108
sock_ioctl+0x240/0x358
__arm64_sys_ioctl+0xb0/0x100
invoke_syscall+0x50/0x128
el0_svc_common.constprop.0+0xc8/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x34/0xb8
el0t_64_sync_handler+0xc0/0xc8
el0t_64_sync+0x190/0x198
2/ this call may execute concurrently with ravb_remove() as the
unbind/bind operation was executed in a loop
3/ if the operation mode is changed to RESET (through
ravb_write(ndev, CCC_OPC_RESET, CCC) call in ravb_remove())
while the above ravb_open() is in progress it may lead to MAC
(or PHY, or MAC-PHY connection, the right point hasn't been identified
at the moment) to be broken, thus the Ethernet connectivity fails to
restore.
The simple fix for this is to move ravb_write(ndev, CCC_OPC_RESET, CCC))
after unregister_netdev() to avoid resetting the controller while the
netdev interface is still registered.
To avoid future issues in ravb_remove(), the patch follows the proper order
of operations in ravb_remove(): reverse order compared with ravb_probe().
This avoids described races as the IOCTLs as well as unregister_netdev()
(called now at the beginning of ravb_remove()) calls rtnl_lock() before
continuing and IOCTLs check (though devinet_ioctl()) if device is still
registered just after taking the lock:
int devinet_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr)
{
// ...
rtnl_lock();
ret = -ENODEV;
dev = __dev_get_by_name(net, ifr->ifr_name);
if (!dev)
goto done;
// ...
done:
rtnl_unlock();
out:
return ret;
}
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Reviewed-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Claudiu Beznea <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
In case ravb_phy_start() returns with error the settings applied in
ravb_dmac_init() are not reverted (e.g. config mode). For this call
ravb_stop_dma() on failure path of ravb_open().
Fixes: a0d2f20650e8 ("Renesas Ethernet AVB PTP clock driver")
Reviewed-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Claudiu Beznea <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
ravb_phy_start() may fail. If that happens, the TX queues will remain
started. Thus, move the netif_tx_start_all_queues() after PHY is
successfully initialized.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Reviewed-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Claudiu Beznea <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
registers
Hardware manual of RZ/G3S (and RZ/G2L) specifies the following on the
description of CXR35 register (chapter "PHY interface select register
(CXR35)"): "After release reset, make write-access to this register before
making write-access to other registers (except MDIOMOD). Even if not need
to change the value of this register, make write-access to this register
at least one time. Because RGMII/MII MODE is recognized by accessing this
register".
The setup procedure for EMAC module (chapter "Setup procedure" of RZ/G3S,
RZ/G2L manuals) specifies the E-MAC.CXR35 register is the first EMAC
register that is to be configured.
Note [A] from chapter "PHY interface select register (CXR35)" specifies
the following:
[A] The case which CXR35 SEL_XMII is used for the selection of RGMII/MII
in APB Clock 100 MHz.
(1) To use RGMII interface, Set ‘H’03E8_0000’ to this register.
(2) To use MII interface, Set ‘H’03E8_0002’ to this register.
Take into account these indication.
Fixes: 1089877ada8d ("ravb: Add RZ/G2L MII interface support")
Reviewed-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Claudiu Beznea <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
pm_runtime_get_sync() may return an error. In case it returns with an error
dev->power.usage_count needs to be decremented. pm_runtime_resume_and_get()
takes care of this. Thus use it.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Reviewed-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Claudiu Beznea <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
reset_control_deassert() could return an error. Some devices cannot work
if reset signal de-assert operation fails. To avoid this check the return
code of reset_control_deassert() in ravb_probe() and take proper action.
Along with it, the free_netdev() call from the error path was moved after
reset_control_assert() on its own label (out_free_netdev) to free
netdev in case reset_control_deassert() fails.
Fixes: 0d13a1a464a0 ("ravb: Add reset support")
Reviewed-by: Sergey Shtylyov <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Signed-off-by: Claudiu Beznea <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Since pci_free_irq_vectors() set pdev->msix_enabled as 0 in the
calling of pci_msix_shutdown(), wx->msix_entries is never freed.
Reordering the lines to fix the memory leak.
Cc: [email protected]
Fixes: 3f703186113f ("net: libwx: Add irq flow functions")
Signed-off-by: Jiawen Wu <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There is an error when an interface has the following conditions:
- PF is in an aggregate (bond)
- PF has VFs created on it
- bond is in a state where it is failed-over to the secondary interface
- A VF reset is issued on one or more of those VFs
The issue is generated by the originating PF trying to rebuild or
reconfigure the VF resources. Since the bond is failed over to the
secondary interface the queue contexts are in a modified state.
To fix this issue, have the originating interface reclaim its resources
prior to the tear-down and rebuild or reconfigure. Then after the process
is complete, move the resources back to the currently active interface.
There are multiple paths that can be used depending on what triggered the
event, so create a helper function to move the queues and use paired calls
to the helper (back to origin, process, then move back to active interface)
under the same lag_mutex lock.
Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
Signed-off-by: Dave Ertman <[email protected]>
Tested-by: Sujai Buvaneswaran <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
wireless fixes:
- debugfs had a deadlock (removal vs. use of files),
fixes going through wireless ACKed by Greg
- support for HT STAs on 320 MHz channels, even if it's
not clear that should ever happen (that's 6 GHz), best
not to WARN()
- fix for the previous CQM fix that broke most cases
- various wiphy locking fixes
- various small driver fixes
* tag 'wireless-2023-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: use wiphy locked debugfs for sdata/link
wifi: mac80211: use wiphy locked debugfs helpers for agg_status
wifi: cfg80211: add locked debugfs wrappers
debugfs: add API to allow debugfs operations cancellation
debugfs: annotate debugfs handlers vs. removal with lockdep
debugfs: fix automount d_fsdata usage
wifi: mac80211: handle 320 MHz in ieee80211_ht_cap_ie_to_sta_ht_cap
wifi: avoid offset calculation on NULL pointer
wifi: cfg80211: hold wiphy mutex for send_interface
wifi: cfg80211: lock wiphy mutex for rfkill poll
wifi: cfg80211: fix CQM for non-range use
wifi: mac80211: do not pass AP_VLAN vif pointer to drivers during flush
wifi: iwlwifi: mvm: fix an error code in iwl_mvm_mld_add_sta()
wifi: mt76: mt7925: fix typo in mt7925_init_he_caps
wifi: mt76: mt7921: fix 6GHz disabled by the missing default CLC config
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-11-30
We've added 5 non-merge commits during the last 7 day(s) which contain
a total of 10 files changed, 66 insertions(+), 15 deletions(-).
The main changes are:
1) Fix AF_UNIX splat from use after free in BPF sockmap,
from John Fastabend.
2) Fix a syzkaller splat in netdevsim by properly handling offloaded
programs (and not device-bound ones), from Stanislav Fomichev.
3) Fix bpf_mem_cache_alloc_flags() to initialize the allocation hint,
from Hou Tao.
4) Fix netkit by rejecting IFLA_NETKIT_PEER_INFO in changelink,
from Daniel Borkmann.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: Add af_unix test with both sockets in map
bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
netkit: Reject IFLA_NETKIT_PEER_INFO in netkit_change_link
bpf: Add missed allocation hint for bpf_mem_cache_alloc_flags()
netdevsim: Don't accept device bound programs
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This adds a test where both pairs of a af_unix paired socket are put into a
BPF map. This ensures that when we tear down the af_unix pair we don't have
any issues on sockmap side with ordering and reference counting.
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
AF_UNIX stream sockets are a paired socket. So sending on one of the pairs
will lookup the paired socket as part of the send operation. It is possible
however to put just one of the pairs in a BPF map. This currently increments
the refcnt on the sock in the sockmap to ensure it is not free'd by the
stack before sockmap cleans up its state and stops any skbs being sent/recv'd
to that socket.
But we missed a case. If the peer socket is closed it will be free'd by the
stack. However, the paired socket can still be referenced from BPF sockmap
side because we hold a reference there. Then if we are sending traffic through
BPF sockmap to that socket it will try to dereference the free'd pair in its
send logic creating a use after free. And following splat:
[59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
[59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
[...]
[59.905468] Call Trace:
[59.905787] <TASK>
[59.906066] dump_stack_lvl+0x130/0x1d0
[59.908877] print_report+0x16f/0x740
[59.910629] kasan_report+0x118/0x160
[59.912576] sk_wake_async+0x31/0x1b0
[59.913554] sock_def_readable+0x156/0x2a0
[59.914060] unix_stream_sendmsg+0x3f9/0x12a0
[59.916398] sock_sendmsg+0x20e/0x250
[59.916854] skb_send_sock+0x236/0xac0
[59.920527] sk_psock_backlog+0x287/0xaa0
To fix let BPF sockmap hold a refcnt on both the socket in the sockmap and its
paired socket. It wasn't obvious how to contain the fix to bpf_unix logic. The
primarily problem with keeping this logic in bpf_unix was: In the sock close()
we could handle the deref by having a close handler. But, when we are destroying
the psock through a map delete operation we wouldn't have gotten any signal
thorugh the proto struct other than it being replaced. If we do the deref from
the proto replace its too early because we need to deref the sk_pair after the
backlog worker has been stopped.
Given all this it seems best to just cache it at the end of the psock and eat 8B
for the af_unix and vsock users. Notice dgram sockets are OK because they handle
locking already.
Fixes: 94531cfcbe79 ("af_unix: Add unix_stream_proto for sockmap")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely")
changed the meaning of MAX_ORDER from exclusive to inclusive. So, we
can allocate compound pages with up to 1 << MAX_ORDER pages.
Reflect this change in dm-flakey and start trying to allocate compound
pages with MAX_ORDER.
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
dm_verity_fec_io is placed after the end of two hash digests. If the hash
digest has unaligned length, struct dm_verity_fec_io could be unaligned.
This commit fixes the placement of struct dm_verity_fec_io, so that it's
aligned.
Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
Signed-off-by: Mike Snitzer <[email protected]>
|
|
We found an issue under Android OTA scenario that many BIOs have to do
FEC where the data under dm-verity is 100% complete and no corruption.
Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs
DM tables have to change 2 times during Android OTA merging process.
When doing table change, the dm-snapshot will be suspended for a while.
During this interval, many readahead IOs are submitted to dm_verity
from filesystem. Then the kverity works are busy doing FEC process
which cost too much time to finish dm-verity IO. This causes needless
delay which feels like system is hung.
After adding debugging it was found that each readahead IO needed
around 10s to finish when this situation occurred. This is due to IO
amplification:
dm-snapshot suspend
erofs_readahead // 300+ io is submitted
dm_submit_bio (dm_verity)
dm_submit_bio (dm_snapshot)
bio return EIO
bio got nothing, it's empty
verity_end_io
verity_verify_io
forloop range(0, io->n_blocks) // each io->nblocks ~= 20
verity_fec_decode
fec_decode_rsb
fec_read_bufs
forloop range(0, v->fec->rsn) // v->fec->rsn = 253
new_read
submit_bio (dm_snapshot)
end loop
end loop
dm-snapshot resume
Readahead BIOs get nothing while dm-snapshot is suspended, so all of
them will cause verity's FEC.
Each readahead BIO needs to verify ~20 (io->nblocks) blocks.
Each block needs to do FEC, and every block needs to do 253
(v->fec->rsn) reads.
So during the suspend interval(~200ms), 300 readahead BIOs trigger
~1518000 (300*20*253) IOs to dm-snapshot.
As readahead IO is not required by userspace, and to fix this issue,
it is best to pass readahead errors to upper layer to handle it.
Cc: [email protected]
Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
Signed-off-by: Wu Bo <[email protected]>
Reviewed-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
If BIO error, verity_end_io() can call verity_finish_io() before
verity_fec_init_io(). Therefore, fec_io->rs is not initialized and
may crash when doing memory freeing in verity_fec_finish_io().
Crash call stack:
die+0x90/0x2b8
__do_kernel_fault+0x260/0x298
do_bad_area+0x2c/0xdc
do_translation_fault+0x3c/0x54
do_mem_abort+0x54/0x118
el1_abort+0x38/0x5c
el1h_64_sync_handler+0x50/0x90
el1h_64_sync+0x64/0x6c
free_rs+0x18/0xac
fec_rs_free+0x10/0x24
mempool_free+0x58/0x148
verity_fec_finish_io+0x4c/0xb0
verity_end_io+0xb8/0x150
Cc: [email protected] # v6.0+
Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Signed-off-by: Wu Bo <[email protected]>
Reviewed-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
It is nontrivial to derive the role of the two attribute groups in source
file block/blk-sysfs.c. Hence add a comment that explains their roles. See
also commit 6d85ebf95c44 ("blk-sysfs: add a new attr_group for blk_mq").
Cc: Christoph Hellwig <[email protected]>
Cc: Yu Kuai <[email protected]>
Signed-off-by: Bart Van Assche <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
struct ynl_req_state carries reply-related info from generated code
into generic YNL code. While we don't need reply info to execute
a request without a reply, we still need to pass in the struct, because
it's also where we get the pointer to struct ynl_sock from. Passing NULL
results in crashes if kernel returns an error or an unexpected reply.
Fixes: dc0956c98f11 ("tools: ynl-gen: move the response reading logic into YNL")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The default dump handler needs to clear ret before returning.
Otherwise if the last interface returns an inconsequential
error this error will propagate to user space.
This may confuse user space (ethtool CLI seems to ignore it,
but YNL doesn't). It will also terminate the dump early
for mutli-skb dump, because netlink core treats EOPNOTSUPP
as a real error.
Fixes: 728480f12442 ("ethtool: default handlers for GET requests")
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Fix a really interesting potential core bug in the list iterator
requireing the use of READ_ONCE() discovered when testing kernel
compiles with clang.
- Check devm_kcalloc() return value and an array bounds in the STM32
driver.
- Fix an exotic string truncation issue in the s32cc driver, found by
the kernel test robot (impressive!)
- Fix an undocumented struct member in the cy8c95x0 driver.
- Fix a symbol overlap with MIPS in the Lochnagar driver, MIPS defines
a global symbol "RST" which is a bit too generic and collide with
stuff. OK this one should be renamed too, we will fix that as well.
- Fix erroneous branch taking in the Realtek driver.
- Fix the mail address in MAINTAINERS for the s32g2 driver.
* tag 'pinctrl-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
dt-bindings: pinctrl: s32g2: change a maintainer email address
pinctrl: realtek: Fix logical error when finding descriptor
pinctrl: lochnagar: Don't build on MIPS
pinctrl: avoid reload of p state in list iteration
pinctrl: cy8c95x0: Fix doc warning
pinctrl: s32cc: Avoid possible string truncation
pinctrl: stm32: fix array read out of bound
pinctrl: stm32: Add check for devm_kcalloc
|
|
This fixes a bug where going read-only was taking longer than it should
have due to copygc forgetting to check kthread_should_stop()
Additionally: fix a missing is_kthread check in bch2_move_ratelimit().
Signed-off-by: Kent Overstreet <[email protected]>
|
|
This eliminates some SRCU warnings: for_each_btree_key2() runs every
loop iteration in a distinct transaction context.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
btree writes update the btree node key after every write, in order to
update sectors_written, and they also might need to drop pointers if one
of the writes failed in a replicated btree node.
But the btree node might also have had a pointer dropped while the write
was in flight, by bch2_dev_metadata_drop(), and thus there was a bug
where the btree node write would ovewrite the btree node's key with what
it had at the start of the write.
Fix this by dropping pointers not currently in the btree node key.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
journal_cur_seq() can legitimately be used outside of the journal lock,
where this assert can race
Signed-off-by: Kent Overstreet <[email protected]>
|
|
The automated tests check if we've hit too many slowpath/error path
events and fail the test - if we're just shutting down, that naturally
shouldn't count.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Fix races between ravb_tx_timeout_work() and functions of net_device_ops
and ethtool_ops by using rtnl_trylock() and rtnl_unlock(). Note that
since ravb_close() is under the rtnl lock and calls cancel_work_sync(),
ravb_tx_timeout_work() should calls rtnl_trylock(). Otherwise, a deadlock
may happen in ravb_tx_timeout_work() like below:
CPU0 CPU1
ravb_tx_timeout()
schedule_work()
...
__dev_close_many()
// Under rtnl lock
ravb_close()
cancel_work_sync()
// Waiting
ravb_tx_timeout_work()
rtnl_lock()
// This is possible to cause a deadlock
If rtnl_trylock() fails, rescheduling the work with sleep for 1 msec.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Reviewed-by: Sergey Shtylyov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Renamed from trace_move_extent_alloc_mem_fail, because there are other
reasons we colud fail (disk space allocation failure).
Signed-off-by: Kent Overstreet <[email protected]>
|
|
bch2_btree_update_start() calculates which nodes are going to have to be
split/rewritten, so that we know how many nodes to reserve and how deep
in the tree we have to take locks.
But btree node merges require inserting two keys into the parent node,
not just splits.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Validation was completely missing for replicas entries in the journal
(not the superblock replicas section) - we can't have replicas entries
pointing to invalid devices.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
zstd apparently lies about the size of the compression workspace it
requires; if we double it compression succeeds.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few fixes and message updates:
- for simple quotas, handle the case when a snapshot is created and
the target qgroup already exists
- fix a warning when file descriptor given to send ioctl is not
writable
- fix off-by-one condition when checking chunk maps
- free pages when page array allocation fails during compression
read, other cases were handled
- fix memory leak on error handling path in ref-verify debugging
feature
- copy missing struct member 'version' in 64/32bit compat send ioctl
- tree-checker verifies inline backref ordering
- print messages to syslog on first mount and last unmount
- update error messages when reading chunk maps"
* tag 'for-6.7-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: send: ensure send_fd is writable
btrfs: free the allocated memory if btrfs_alloc_page_array() fails
btrfs: fix 64bit compat send ioctl arguments not initializing version member
btrfs: make error messages more clear when getting a chunk map
btrfs: fix off-by-one when checking chunk map includes logical address
btrfs: ref-verify: fix memory leaks in btrfs_ref_tree_mod()
btrfs: add dmesg output for first mount and last unmount of a filesystem
btrfs: do not abort transaction if there is already an existing qgroup
btrfs: tree-checker: add type and sequence check for inline backrefs
|