Age | Commit message (Collapse) | Author | Files | Lines |
|
When port partner responds "Not supported" to the DiscIdentity command,
VDM state machine can remain in NVDM_STATE_ERR_TMOUT and this causes
querying sink cap to be skipped indefinitely. Hence check for
vdm_sm_running instead of checking for VDM_STATE_DONE.
Fixes: 8dc4bd073663f ("usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS)")
Acked-by: Heikki Krogerus <[email protected]>
Signed-off-by: Badhri Jagan Sridharan <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When data digest is enabled we should unmap pdu iovec before handling
the data digest pdu.
Signed-off-by: Elad Grupi <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
From the base spec, Figure 78:
"Controller Configuration, these fields are defined as parameters to
configure an "I/O Controller (IOC)" and not to configure a "Discovery
Controller (DC).
...
If the controller does not support I/O queues, then this field shall
be read-only with a value of 0h
Just perform this check for I/O controllers.
Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
Reported-by: Belanger, Martin <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
We only setup io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues. If we
happen to fail setting the queue count for any reason, we should not allow
this to be a successful reconnect as I/O has no chance in going through.
Instead just fail and schedule another reconnect.
Reported-by: Chao Leng <[email protected]>
Fixes: 711023071960 ("nvme-rdma: add a NVMe over Fabrics RDMA host driver")
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chao Leng <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
We only setup io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues. If we
happen to fail setting the queue count for any reason, we should not
allow this to be a successful reconnect as I/O has no chance in going
through. Instead just fail and schedule another reconnect.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
For our pure advisory use-case, we only rely on this call as a hint, so
fix the warning complaints of using the smp_processor_id variants with
preemption enabled.
Fixes: db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context")
Fixes: ada831772188 ("nvme-tcp: Fix warning with CONFIG_DEBUG_PREEMPT")
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Tested-by: Yi Zhang <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
When the controller sends us a 0-length r2t PDU we should not attempt to
try to set up a h2cdata PDU but rather conclude that this is a buggy
controller (forward progress is not possible) and simply fail it
immediately.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Belanger, Martin <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
We voluntarily limit the Write Zeroes sizes to the MDTS value provided by
the hardware, but currently get the units wrong, so fix that.
Fixes: 6e02318eaea5 ("nvme: add support for the Write Zeroes command")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Tested-by: Klaus Jensen <[email protected]>
Reviewed-by: Klaus Jensen <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>
|
|
To avoid an error recovery deadlock where the keep alive work is waiting
for a request and thus can't be flushed to make progress for tearing down
the controller. Also print the error code returned from
blk_mq_alloc_request to help debugging any future issues in this code.
Based on an earlier patch from Hannes Reinecke <[email protected]>.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Daniel Wagner <[email protected]>
|
|
Merge nvme_keep_alive into its only caller to prepare for additional
changes to this code.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Daniel Wagner <[email protected]>
|
|
Fabrics drivers currently reserve two tags on the admin queue. But
given that the connect command is only run on a freshly created queue
or after all commands have been force aborted we only need to reserve
a single tag.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Daniel Wagner <[email protected]>
|
|
For NAND Ecc layout, there is a dependency from old kernel's nand driver
setting and current. if old kernel use 4 bit ecc , we should use 4 bit
in new kernel either. else will run into following error at filesystem
mounting.
So, enable fsl,use-minimum-ecc from device tree, to fix this mismatch
[ 9.449265] ubi0: scanning is finished
[ 9.463968] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading
22528 bytes from PEB 513:4096, read only 22528 bytes, retry
[ 9.486940] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading
22528 bytes from PEB 513:4096, read only 22528 bytes, retry
[ 9.509906] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading
22528 bytes from PEB 513:4096, read only 22528 bytes, retry
[ 9.532845] ubi0 error: ubi_io_read: error -74 (ECC error) while reading
22528 bytes from PEB 513:4096, read 22528 bytes
Fixes: f9ecf10cb88c ("ARM: dts: imx6ull: add MYiR MYS-6ULX SBC")
Signed-off-by: dillon min <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
|
|
[Why?]
Should only reroute gamut remap to mpc unless 3D LUT is not used and all
planes are using the same src->dest.
[How?]
Remove DCN30 specific logic for rerouting gamut remap to mpc.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1513
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Reviewed-by: Krunoslav Kovac <[email protected]>
Acked-by: Aric Cyr <[email protected]>
Acked-by: Solomon Chiu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
[Why]
DCN30 needs to correctly program reversed gamma curve, which DCN20
already has.
Also needs to fix a bug that 252-255 values are clipped.
[How]
Apply two fixes into DCN30.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1513
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Calvin Hou <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Reviewed-by: Krunoslav Kovac <[email protected]>
Acked-by: Solomon Chiu <[email protected]>
Acked-by: Vladimir Stempen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2021-03-18
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 4 day(s) which contain
a total of 14 files changed, 336 insertions(+), 94 deletions(-).
The main changes are:
1) Fix fexit/fmod_ret trampoline for sleepable programs, and also fix a ftrace
splat in modify_ftrace_direct() on address change, from Alexei Starovoitov.
2) Fix two oob speculation possibilities that allows unprivileged to leak mem
via side-channel, from Piotr Krysiuk and Daniel Borkmann.
3) Fix libbpf's netlink handling wrt SOCK_CLOEXEC, from Kumar Kartikeya Dwivedi.
4) Fix libbpf's error handling on failure in getting section names, from Namhyung Kim.
5) Fix tunnel collect_md BPF selftest wrt Geneve option handling, from Hangbin Liu.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
If the flowtable has been previously removed in this batch, skip the
hook overlap checks. This fixes spurious EEXIST errors when removing and
adding the flowtable in the same batch.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Otherwise, there exists a small window between the opening and closing
of the socket fd where it may leak into processes launched by some other
thread.
Fixes: 949abbe88436 ("libbpf: add function to setup XDP")
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
When it failed to get section names, it should call into
bpf_object__elf_finish() like others.
Fixes: 88a82120282b ("libbpf: Factor out common ELF operations and improve logging")
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Currently flowtable's GC work is initialized as deferrable, which
means GC cannot work on time when system is idle. So the hardware
offloaded flow may be deleted for timeout, since its used time is
not timely updated.
Resolve it by initializing the GC work as delayed work instead of
deferrable.
Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Yinjun Zhang <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Honor flowtable flags from the control update path. Disallow disabling
to toggle hardware offload support though.
Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Error was not set accordingly.
Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
This fix permits gre connections to be tracked within ip6tables rules
Signed-off-by: Ludovic Senecaux <[email protected]>
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
The fexit/fmod_ret programs can be attached to kernel functions that can sleep.
The synchronize_rcu_tasks() will not wait for such tasks to complete.
In such case the trampoline image will be freed and when the task
wakes up the return IP will point to freed memory causing the crash.
Solve this by adding percpu_ref_get/put for the duration of trampoline
and separate trampoline vs its image life times.
The "half page" optimization has to be removed, since
first_half->second_half->first_half transition cannot be guaranteed to
complete in deterministic time. Every trampoline update becomes a new image.
The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and
call_rcu_tasks. Together they will wait for the original function and
trampoline asm to complete. The trampoline is patched from nop to jmp to skip
fexit progs. They are freed independently from the trampoline. The image with
fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which
will wait for both sleepable and non-sleepable progs to complete.
Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline")
Reported-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Paul E. McKenney <[email protected]> # for RCU
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Currently, napi_thread_wait() checks for NAPI_STATE_SCHED bit to
determine if the kthread owns this napi and could call napi->poll() on
it. However, if socket busy poll is enabled, it is possible that the
busy poll thread grabs this SCHED bit (after the previous napi->poll()
invokes napi_complete_done() and clears SCHED bit) and tries to poll
on the same napi. napi_disable() could grab the SCHED bit as well.
This patch tries to fix this race by adding a new bit
NAPI_STATE_SCHED_THREADED in napi->state. This bit gets set in
____napi_schedule() if the threaded mode is enabled, and gets cleared
in napi_complete_done(), and we only poll the napi in kthread if this
bit is set. This helps distinguish the ownership of the napi between
kthread and other scenarios and fixes the race issue.
Fixes: 29863d41bb6e ("net: implement threaded-able napi poll loop support")
Reported-by: Martin Zaharinov <[email protected]>
Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Hannes Frederic Sowa <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We have seen a couple cases where low memory situations cause something
bad to happen, followed by a flood of these messages obscuring the root
cause. Lets ratelimit the dmesg spam so that next time it happens we
don't lose the kernel traces leading up to this.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
|
|
Fix up test_verifier error messages for the case where the original error
message changed, or for the case where pointer alu errors differ between
privileged and unprivileged tests. Also, add alternative tests for keeping
coverage of the original verifier rejection error message (fp alu), and
newly reject map_ptr += rX where rX == 0 given we now forbid alu on these
types for unprivileged. All test_verifier cases pass after the change. The
test case fixups were kept separate to ease backporting of core changes.
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
|
|
Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
|
|
In the situations where the DWC3 gadget stops active transfers, once
calling the dwc3_gadget_giveback(), there is a chance where a function
driver can queue a new USB request in between the time where the dwc3
lock has been released and re-aquired. This occurs after we've already
issued an ENDXFER command. When the stop active transfers continues
to remove USB requests from all dep lists, the newly added request will
also be removed, while controller still has an active TRB for it.
This can lead to the controller accessing an unmapped memory address.
Fix this by ensuring parameters to prevent EP queuing are set before
calling the stop active transfers API.
Fixes: ae7e86108b12 ("usb: dwc3: Stop active transfers before halting the controller")
Signed-off-by: Wesley Cheng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
tcpm-source-psy- does not invoke power_supply_changed API when
one of the published power supply properties is changed.
power_supply_changed needs to be called to notify
userspace clients(uevents) and kernel clients.
Fixes: f2a8aa053c176 ("typec: tcpm: Represent source supply through power_supply")
Reviewed-by: Guenter Roeck <[email protected]>
Reviewed-by: Heikki Krogerus <[email protected]>
Signed-off-by: Badhri Jagan Sridharan <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Remove the unused "u32 vdo[3]" part in the tps6598x_rx_identity_reg
struct. This helps avoid "failed to register partner" errors which
happen when tps6598x_read_partner_identity() fails because the
amount of data read is 12 bytes smaller than the struct size.
Note that vdo[3] is already in usb_pd_identity and hence
shouldn't be added to tps6598x_rx_identity_reg as well.
Fixes: f6c56ca91b92 ("usb: typec: Add the Product Type VDOs to struct usb_pd_identity")
Reviewed-by: Heikki Krogerus <[email protected]>
Reviewed-by: Guido Günther <[email protected]>
Signed-off-by: Elias Rudberg <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Matthias reports that the Amazon Kindle automatically removes its
emulated media if it doesn't receive another SCSI command within about
one second after a SYNCHRONIZE CACHE. It does so even when the host
has sent a PREVENT MEDIUM REMOVAL command. The reason for this
behavior isn't clear, although it's not hard to make some guesses.
At any rate, the results can be unexpected for anyone who tries to
access the Kindle in an unusual fashion, and in theory they can lead
to data loss (for example, if one file is closed and synchronized
while other files are still in the middle of being written).
To avoid such problems, this patch creates a new usb-storage quirks
flag telling the driver always to issue a REQUEST SENSE following a
SYNCHRONIZE CACHE command, and adds an unusual_devs entry for the
Kindle with the flag set. This is sufficient to prevent the Kindle
from doing its automatic unload, without interfering with proper
operation.
Another possible way to deal with this would be to increase the
frequency of TEST UNIT READY polling that the kernel normally carries
out for removable-media storage devices. However that would increase
the overall load on the system and it is not as reliable, because the
user can override the polling interval. Changing the driver's
behavior is safer and has minimal overhead.
CC: <[email protected]>
Reported-and-tested-by: Matthias Schwarzott <[email protected]>
Signed-off-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When gadget is disconnected, running sequence is like this.
. composite_disconnect
. Call trace:
usb_string_copy+0xd0/0x128
gadget_config_name_configuration_store+0x4
gadget_config_name_attr_store+0x40/0x50
configfs_write_file+0x198/0x1f4
vfs_write+0x100/0x220
SyS_write+0x58/0xa8
. configfs_composite_unbind
. configfs_composite_bind
In configfs_composite_bind, it has
"cn->strings.s = cn->configuration;"
When usb_string_copy is invoked. it would
allocate memory, copy input string, release previous pointed memory space,
and use new allocated memory.
When gadget is connected, host sends down request to get information.
Call trace:
usb_gadget_get_string+0xec/0x168
lookup_string+0x64/0x98
composite_setup+0xa34/0x1ee8
If gadget is disconnected and connected quickly, in the failed case,
cn->configuration memory has been released by usb_string_copy kfree but
configfs_composite_bind hasn't been run in time to assign new allocated
"cn->configuration" pointer to "cn->strings.s".
When "strlen(s->s) of usb_gadget_get_string is being executed, the dangling
memory is accessed, "BUG: KASAN: use-after-free" error occurs.
Cc: [email protected]
Signed-off-by: Jim Lin <[email protected]>
Signed-off-by: Macpaul Lin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Currently udc->ud.tcp_rx is being assigned twice, the second assignment
is incorrect, it should be to udc->ud.tcp_tx instead of rx. Fix this.
Fixes: 46613c9dfa96 ("usbip: fix vudc usbip_sockfd_store races leading to gpf")
Acked-by: Shuah Khan <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Cc: stable <[email protected]>
Addresses-Coverity: ("Unused value")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
MODULE_SUPPORTED_DEVICE was added in pre-git era and never was
implemented. We can safely remove it, because the kernel has grown
to have many more reliable mechanisms to determine if device is
supported or not.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fix from Thomas Bogendoerfer:
"Fix for fdt alignment when image is compressed"
* tag 'mips-fixes_5.12_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: vmlinux.lds.S: Fix appended dtb not properly aligned
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal framework fix from Daniel Lezcano:
"Fix NULL pointer access when the cooling device transition stats
table failed to allocate due to a big number of states (Manaf
Meethalavalappu Pallikunhi)"
* tag 'thermal-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/core: Add NULL pointer check before using cooling device stats
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
First round of fixes for 5.12-rc:
* HE (802.11ax) elements can be extended, handle that
* fix locking in network namespace changes that was
broken due to the RTNL-redux work
* various other small fixes
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The ct_state validate should not only check the mask bit and also
check mask_bit & key_bit..
For the +new+est case example, The 'new' and 'est' bits should be
set in both state_mask and state flags. Or the -new-est case also
will be reject by kernel.
When Openvswitch with two flows
ct_state=+trk+new,action=commit,forward
ct_state=+trk+est,action=forward
A packet go through the kernel and the contrack state is invalid,
The ct_state will be +trk-inv. Upcall to the ovs-vswitchd, the
finally dp action will be drop with -new-est+trk.
Fixes: 1bcc51ac0731 ("net/sched: cls_flower: Reject invalid ct_state flags rules")
Fixes: 3aed8b63336c ("net/sched: cls_flower: validate ct_state for invalid and reply flags")
Signed-off-by: wenxu <[email protected]>
Reviewed-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
During the mount procedure we are calling btrfs_orphan_cleanup() against
the root tree, which will find all orphans items in this tree. When an
orphan item corresponds to a deleted subvolume/snapshot (instead of an
inode space cache), it must not delete the orphan item, because that will
cause btrfs_find_orphan_roots() to not find the orphan item and therefore
not add the corresponding subvolume root to the list of dead roots, which
results in the subvolume's tree never being deleted by the cleanup thread.
The same applies to the remount from RO to RW path.
Fix this by making btrfs_find_orphan_roots() run before calling
btrfs_orphan_cleanup() against the root tree.
A test case for fstests will follow soon.
Reported-by: Robbie Ko <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
Fixes: 638331fa56caea ("btrfs: fix transaction leak and crash after cleaning up orphans on RO mount")
CC: [email protected] # 5.11+
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
There are people building the module with M= that's supposed to be used
for external modules. This got broken in e9aa7c285d20 ("btrfs: enable
W=1 checks for btrfs").
$ make M=fs/btrfs
scripts/Makefile.lib:10: *** Recursive variable 'KBUILD_CFLAGS' references itself (eventually). Stop.
make: *** [Makefile:1755: modules] Error 2
There's a difference compared to 'make fs/btrfs/btrfs.ko' which needs
to rebuild a few more things and also the dependency modules need to be
available. It could fail with eg.
WARNING: Symbol version dump "Module.symvers" is missing.
Modules may not have dependencies or modversions.
In some environments it's more convenient to rebuild just the btrfs
module by M= so let's make it work.
The problem is with recursive variable evaluation in += so the
conditional C options are stored in a temporary variable to avoid the
recursion.
Signed-off-by: David Sterba <[email protected]>
|
|
While helping Neal fix his broken file system I added a debug patch to
catch if we were calling btrfs_search_slot with a NULL root, and this
stack trace popped:
we tried to search with a NULL root
CPU: 0 PID: 1760 Comm: mount Not tainted 5.11.0-155.nealbtrfstest.1.fc34.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
Call Trace:
dump_stack+0x6b/0x83
btrfs_search_slot.cold+0x11/0x1b
? btrfs_init_dev_replace+0x36/0x450
btrfs_init_dev_replace+0x71/0x450
open_ctree+0x1054/0x1610
btrfs_mount_root.cold+0x13/0xfa
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x131/0x3d0
? legacy_get_tree+0x27/0x40
? btrfs_show_options+0x640/0x640
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
path_mount+0x441/0xa80
__x64_sys_mount+0xf4/0x130
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f644730352e
Fix this by not starting the device replace stuff if we do not have a
NULL dev root.
Reported-by: Neal Gompa <[email protected]>
CC: [email protected] # 5.11+
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Neal reported a panic trying to use -o rescue=all
BUG: kernel NULL pointer dereference, address: 0000000000000030
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 696 Comm: mount Tainted: G W 5.12.0-rc2+ #296
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
FS: 00007f600d122c40(0000) GS:ffff9a577bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
Call Trace:
? btrfs_init_dev_stats+0x1f/0xf0
? kmem_cache_alloc+0xef/0x1f0
btrfs_init_dev_stats+0x5f/0xf0
open_ctree+0x10cb/0x1720
btrfs_mount_root.cold+0x12/0xea
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x10d/0x380
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
path_mount+0x433/0xa00
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
This happens because when we call btrfs_init_dev_stats we do
device->fs_info->dev_root. However device->fs_info isn't initialized
because we were only calling btrfs_init_devices_late() if we properly
read the device root. However we don't actually need the device root to
init the devices, this function simply assigns the devices their
->fs_info pointer properly, so this needs to be done unconditionally
always so that we can properly dereference device->fs_info in rescue
cases.
Reported-by: Neal Gompa <[email protected]>
CC: [email protected] # 5.11+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Neal reported a panic trying to use -o rescue=all
BUG: kernel NULL pointer dereference, address: 0000000000000030
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 4095 Comm: mount Not tainted 5.11.0-0.rc7.149.fc34.x86_64 #1
RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000
RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000
R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00
R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10
FS: 00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
Call Trace:
? btrfs_init_dev_stats+0x1f/0xf0
btrfs_init_dev_stats+0x62/0xf0
open_ctree+0x1019/0x15ff
btrfs_mount_root.cold+0x13/0xfa
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x131/0x3d0
? legacy_get_tree+0x27/0x40
? btrfs_show_options+0x640/0x640
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
path_mount+0x441/0xa80
__x64_sys_mount+0xf4/0x130
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f698c04e52e
This happens because we unconditionally attempt to initialize device
stats on mount, but we may not have been able to read the device root.
Fix this by skipping initializing the device stats if we do not have a
device root.
Reported-by: Neal Gompa <[email protected]>
CC: [email protected] # 5.11+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
In btrfs_submit_direct() there's a WAN_ON_ONCE() that will trigger if
we're submitting a DIO write on a zoned filesystem but are not using
REQ_OP_ZONE_APPEND to submit the IO to the block device.
This is a left over from a previous version where btrfs_dio_iomap_begin()
didn't use btrfs_use_zone_append() to check for sequential write only
zones.
It is an oversight from the development phase. In v11 (I think) I've
added 08f455593fff ("btrfs: zoned: cache if block group is on a
sequential zone") and forgot to remove the WARN_ON_ONCE() for
544d24f9de73 ("btrfs: zoned: enable zone append writing for direct IO").
When developing auto relocation I got hit by the WARN as a block groups
where relocated to conventional zone and the dio code calls
btrfs_use_zone_append() introduced by 08f455593fff to check if it can
use zone append (a.k.a. if it's a sequential zone) or not and sets the
appropriate flags for iomap.
I've never hit it in testing before, as I was relying on emulation to
test the conventional zones code but this one case wasn't hit, because
on emulation fs_info->max_zone_append_size is 0 and the WARN doesn't
trigger either.
Fixes: 544d24f9de73 ("btrfs: zoned: enable zone append writing for direct IO")
Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We were linearizing non-TSO skbs that had too many frags, but
we weren't checking number of frags on TSO skbs. This could
lead to a bad page reference when we received a TSO skb with
more frags than the Tx descriptor could support.
v2: use gso_segs rather than yet another division
don't rework the check on the nr_frags
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Instead of having the mov32 with aux->alu_limit - 1 immediate, move this
operation to retrieve_ptr_limit() instead to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids in future that at the time of the verifier masking rewrite
we'd run into an underflow which would not sign extend due to the nature
of mov32 instruction.
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
|
|
retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the point of the current position and direction
of the operation (add / sub). This size will later be used for masking
the operation such that attempting out-of-bounds access in the speculative
domain is redirected to remain within the bounds of the current map value.
When masking to the right the size is correct, however, when masking to
the left, the size is off-by-one which would lead to an incorrect mask
and thus incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, then the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading
0xffffffff into AX instead of sign-extending to the full 64 bit range,
and as a result, this allows abuse for executing speculatively out-of-
bounds loads against 4GB window of address space and thus extracting the
contents of kernel memory via side-channel.
Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
|
|
The purpose of this patch is to streamline error propagation and in particular
to propagate retrieve_ptr_limit() errors for pointer types that are not defining
a ptr_limit such that register-based alu ops against these types can be rejected.
The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.
Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access to a modified ctx' pointer is rejected at a
later point in time in the verifier, and 7c6967326267 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") only relevant for root-only use cases.
Risk of unprivileged program breakage is considered very low.
Fixes: 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
|
|
While passing the A530-specific lm_setup func to A530 and A540
to !A530 was fine back when only these two were supported, it
certainly is not a good idea to send A540 specifics to smaller
GPUs like A508 and friends.
Signed-off-by: Konrad Dybcio <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
In commit 9fc418430c65 ("drm/msm/dp: unplug interrupt missed after
irq_hpd handler") we dropped a reset of the aux phy during aux transfers
because resetting the phy during active communication caused us to miss
an hpd irq in some cases. Unfortunately, we also dropped the part of the
code that changes the aux phy tuning when an aux transfer fails due to a
timeout. That part of the code was calling into the phy driver to
reconfigure the aux TX swing controls, working around poor channel
quality. Let's restore this phy setting code so that aux channel
communication is more reliable.
Cc: Kuogee Hsieh <[email protected]>
Fixes: 9fc418430c65 ("drm/msm/dp: unplug interrupt missed after irq_hpd handler")
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|