Age | Commit message (Collapse) | Author | Files | Lines |
|
BPF_ALU64 div/mod operations are currently using signed division, unlike
BPF_ALU32 operations. Fix the same. DIV64 and MOD64 overflow tests pass
with this fix.
Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Cc: [email protected] # v4.8+
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
If the result of the division is LLONG_MIN, current tests do not detect
the error since the return value is truncated to a 32-bit value and ends
up being 0.
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Sync the changes to the flags made in "bpf: simplify definition of
BPF_FIB_LOOKUP related flags" with the BPF UAPI headers.
Doing in a separate commit to ease syncing of github/libbpf.
Signed-off-by: Martynas Pumputis <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Previously, the BPF_FIB_LOOKUP_{DIRECT,OUTPUT} flags in the BPF UAPI
were defined with the help of BIT macro. This had the following issues:
- In order to use any of the flags, a user was required to depend
on <linux/bits.h>.
- No other flag in bpf.h uses the macro, so it seems that an unwritten
convention is to use (1 << (nr)) to define BPF-related flags.
Fixes: 87f5fc7e48dd ("bpf: Provide helper to do forwarding lookups in kernel FIB table")
Signed-off-by: Martynas Pumputis <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
We can not depend on the tcon->open_file_lock here since in multiuser mode
we may have the same file/inode open via multiple different tcons.
The current code is race prone and will crash if one user deletes a file
at the same time a different user opens/create the file.
To avoid this we need to have a spinlock attached to the inode and not the tcon.
RHBZ: 1580165
CC: Stable <[email protected]>
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
|
|
RH Bugzilla: 1702264
We need to protect so that the call to smb2_reconnect() in
smb2_reconnect_server() does not end up freeing the session
because it can lead to a use after free and crash.
Reviewed-by: Aurelien Aptel <[email protected]>
Cc: <[email protected]>
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
|
|
current->mm can be non-NULL if a kthread calls use_mm(). Check for
PF_KTHREAD instead to decide when to store user mode FP state.
Fixes: 2722146eb784 ("x86/fpu: Remove fpu->initialized")
Reported-by: Peter Zijlstra <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Cc: Aubrey Li <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull timer fixes from Daniel Lezcano:
- Fix missing notrace leading to deadlock on arch_arm_timer (Julien Thierry)
- Fix compilation warning on timer-ti-dm (Philippe Mazenauer)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- regression fixes (reverts) for module loading changes that turned out
to be incompatible with some userspace, from Benjamin Tissoires
- regression fix for special Logitech unifiying receiver 0xc52f, from
Hans de Goede
- a few device ID additions to logitech driver, from Hans de Goede
- fix for Bluetooth support on 2nd-gen Wacom Intuos Pro, from Jason
Gerecke
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-dj: Fix 064d:c52f receiver support
Revert "HID: core: Call request_module before doing device_add"
Revert "HID: core: Do not call request_module() in async context"
Revert "HID: Increase maximum report size allowed by hid_field_extract()"
HID: a4tech: fix horizontal scrolling
HID: hyperv: Add a module description line
HID: logitech-hidpp: Add support for the S510 remote control
HID: multitouch: handle faulty Elo touch device
HID: wacom: Sync INTUOSP2_BT touch state after each frame if necessary
HID: wacom: Correct button numbering 2nd-gen Intuos Pro over Bluetooth
HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact
HID: wacom: Don't report anything prior to the tool entering range
HID: wacom: Don't set tool type until we're in range
HID: rmi: Use SET_REPORT request on control endpoint for Acer Switch 3 and 5
HID: logitech-hidpp: add support for the MX5500 keyboard
HID: logitech-dj: add support for the Logitech MX5500's Bluetooth Mini-Receiver
HID: i2c-hid: add iBall Aer3 to descriptor override
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.2
There's an awful lot of fixes here, almost all for the newly introduced
SoF DSP drivers (including a few things it turned up in shared code).
This is a large and complex piece of code so it's not surprising that
there have been quite a few issues here, fortunately things seem to have
mostly calmed down now. Otherwise there's just a smattering of small fixes.
|
|
Unfortunately, a couple of mistakes were made while implementing
Enlightened VMCS support, in particular, wrong clean fields were
used in copy_enlightened_to_vmcs12():
- exception_bitmap is covered by CONTROL_EXCPN;
- vm_exit_controls/pin_based_vm_exec_control/secondary_vm_exec_control
are covered by CONTROL_GRP1.
Fixes: 945679e301ea0 ("KVM: nVMX: add enlightened VMCS state")
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Free the vfio_ccw_cmd_region on module exit.
Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions")
Signed-off-by: Farhan Ali <[email protected]>
Message-Id: <c0f39039d28af39ea2939391bf005e3495d890fd.1559576250.git.alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.2-rc5:
- Fix DMC firmware input validation to avoid buffer overflow
- Fix perf register access whitelist for userspace
- Fix DSI panel on GPD MicroPC
- Fix per-pixel alpha with CCS
- Fix HDMI audio for SDVO
Signed-off-by: Daniel Vetter <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The removal of CONFIG_LBDAF changed the type of sector_t from "unsigned
long" to "u64" aka "unsigned long long" on 64-bit platforms, leading to
a compiler warning regression:
drivers/block/ps3vram.c: In function ‘ps3vram_probe’:
drivers/block/ps3vram.c:770:23: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘sector_t {aka long long unsigned int}’ [-Wformat=]
Fix this by using "%llu" instead.
Fixes: 72deb455b5ec619f ("block: remove CONFIG_LBDAF")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We've received a bugreport that using LPM with ST1000LM024 drives leads
to system lockups. So it seems that these models are buggy in more then
1 way. Add NOLPM quirk to the existing quirks entry for BROKEN_FPDMA_AA.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1571330
Cc: [email protected]
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
When people set a writeback percent via sysfs file,
/sys/block/bcache<N>/bcache/writeback_percent
current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags
and schedules kworker dc->writeback_rate_update.
If there is no cache set attached to, the writeback kernel thread is
not running indeed, running dc->writeback_rate_update does not make
sense and may cause NULL pointer deference when reference cache set
pointer inside update_writeback_rate().
This patch checks whether the cache set point (dc->disk.c) is NULL in
sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and
schedule dc->writeback_rate_update when dc->disk.c is not NULL (it
means the cache device is attached to a cache set).
This problem might be introduced from initial bcache commit, but
commit 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
changes part of the original code piece, so I add 'Fixes: 3fd47bfe55b0'
to indicate from which commit this patch can be applied.
Fixes: 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
Reported-by: Bjørn Forsman <[email protected]>
Signed-off-by: Coly Li <[email protected]>
Reviewed-by: Bjørn Forsman <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Recently people report bcache code compiled with gcc9 is broken, one of
the buggy behavior I observe is that two adjacent 4KB I/Os should merge
into one but they don't. Finally it turns out to be a stack corruption
caused by macro PRECEDING_KEY().
See how PRECEDING_KEY() is defined in bset.h,
437 #define PRECEDING_KEY(_k) \
438 ({ \
439 struct bkey *_ret = NULL; \
440 \
441 if (KEY_INODE(_k) || KEY_OFFSET(_k)) { \
442 _ret = &KEY(KEY_INODE(_k), KEY_OFFSET(_k), 0); \
443 \
444 if (!_ret->low) \
445 _ret->high--; \
446 _ret->low--; \
447 } \
448 \
449 _ret; \
450 })
At line 442, _ret points to address of a on-stack variable combined by
KEY(), the life range of this on-stack variable is in line 442-446,
once _ret is returned to bch_btree_insert_key(), the returned address
points to an invalid stack address and this address is overwritten in
the following called bch_btree_iter_init(). Then argument 'search' of
bch_btree_iter_init() points to some address inside stackframe of
bch_btree_iter_init(), exact address depends on how the compiler
allocates stack space. Now the stack is corrupted.
Fixes: 0eacac22034c ("bcache: PRECEDING_KEY()")
Signed-off-by: Coly Li <[email protected]>
Reviewed-by: Rolf Fokkens <[email protected]>
Reviewed-by: Pierre JUHEN <[email protected]>
Tested-by: Shenghui Wang <[email protected]>
Tested-by: Pierre JUHEN <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Nix <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The in-memory representation of SVE and FPSIMD registers is
different: the FPSIMD V-registers are stored as single 128-bit
host-endian values, whereas SVE registers are stored in an
endianness-invariant byte order.
This means that the two representations differ when running on a
big-endian host. But we blindly copy data from one representation
to another when converting between the two, resulting in the
register contents being unintentionally byteswapped in certain
situations. Currently this can be triggered by the first SVE
instruction after a syscall, for example (though the potential
trigger points may vary in future).
So, fix the conversion functions fpsimd_to_sve(), sve_to_fpsimd()
and sve_sync_from_fpsimd_zeropad() to swab where appropriate.
There is no common swahl128() or swab128() that we could use here.
Maybe it would be worth making this generic, but for now add a
simple local hack.
Since the byte order differences are exposed in ABI, also clarify
the documentation.
Cc: Alex Bennée <[email protected]>
Cc: Peter Maydell <[email protected]>
Cc: Alan Hayward <[email protected]>
Cc: Julien Grall <[email protected]>
Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support")
Signed-off-by: Dave Martin <[email protected]>
[will: Fix typos in comments and docs spotted by Julien]
Signed-off-by: Will Deacon <[email protected]>
|
|
blk_mq_sched_free_requests() may be called in failure path in which
q->elevator may not be setup yet, so remove WARN_ON(!q->elevator) from
blk_mq_sched_free_requests for avoiding the false positive.
This function is actually safe to call in case of !q->elevator because
hctx->sched_tags is checked.
Cc: Bart Van Assche <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Yi Zhang <[email protected]>
Fixes: c3e2219216c9 ("block: free sched's request pool in blk_cleanup_queue")
Reported-by: [email protected]
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
CFQ is gone. No need anymore to document its "proportional weight time
based division of disk policy".
Signed-off-by: Andreas Herrmann <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Remove references to CFQ and legacy block layer which are gone.
Update example with what's available under blk-mq.
Signed-off-by: Andreas Herrmann <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch removes the check in the null_blk_zoned for report zone
command, where it checks for the dev-,>zoned before executing the report
zone.
The null_zone_report() function is a block_device operation callback
which is initialized in the null_blk_main.c and gets called as a part
of blkdev for report zone IOCTL (BLKREPORTZONE).
blkdev_ioctl()
blkdev_report_zones_ioctl()
blkdev_report_zones()
blk_report_zones()
disk->fops->report_zones()
nullb_zone_report();
The null_zone_report() will never get executed on the non-zoned block
device, in the non zoned block device blk_queue_is_zoned() will always
be false which is first check the blkdev_report_zones_ioctl()
before actual low level driver report zone callback is executed.
Here is the detailed scenario:-
1. modprobe null_blk
null_init
null_alloc_dev
dev->zoned = 0
null_add_dev
dev->zoned == 0
so we don't set the q->limits.zoned = BLK_ZONED_HR
2. blkzone report /dev/nullb0
blkdev_ioctl()
blkdev_report_zones_ioctl()
blk_queue_is_zoned()
blk_queue_is_zoned
q->limits.zoned == 0
return false
if (!blk_queue_is_zoned(q)) <--- true
return -ENOTTY;
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Bob Liu <[email protected]>
Signed-off-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
When all of these checks are cleaned up, lots of the functions used in
the blk-mq-debugfs code can now return void, as no need to check the
return value of them either.
Overall, this ends up cleaning up the code and making it smaller, always
a nice win.
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Opening and closing an io_uring instance leaks a UNIX domain socket
inode. This is because the ->file of the io_uring instance's internal
UNIX domain socket is set to point to the io_uring file, but then
sock_release() sees the non-NULL ->file and assumes the inode reference
is held by the file so doesn't call iput(). That's not the case here,
since the reference is still meant to be held by the socket; the actual
inode of the io_uring file is different.
Fix this leak by NULL-ing out ->file before releasing the socket.
Reported-by: [email protected]
Fixes: 2b188cc1bb85 ("Add io_uring IO interface")
Cc: <[email protected]> # v5.1+
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
In most use cases of zoned block devices (aka SMR disks), the
mq-deadline scheduler is mandatory as it implements sequential write
command processing guarantees with zone write locking. So make sure that
this scheduler is always enabled if CONFIG_BLK_DEV_ZONED is selected.
Tested-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
There is a race between the binder driver cleaning
up a completed transaction via binder_free_transaction()
and a user calling binder_ioctl(BC_FREE_BUFFER) to
release a buffer. It doesn't matter which is first but
they need to be protected against running concurrently
which can result in a UAF.
Signed-off-by: Todd Kjos <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
"Three patches for v5.2.
One fixes a problem where we weren't correctly logging raw SELinux
labels, the other two fix problems where we weren't properly checking
calls to kmemdup()"
* tag 'selinux-pr-20190612' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts()
selinux: fix a missing-check bug in selinux_add_mnt_opt( )
selinux: log raw contexts as untrusted strings
|
|
Fixes SI cards running on amdgpu.
Fixes: 1929059893022 ("drm/amd/amdgpu: add RLC firmware to support raven1 refresh")
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110883
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The "block" variable can be set by the user through debugfs, so it can
be quite large which leads to shift wrapping here. This means we report
a "block" as supported when it's not, and that leads to array overflows
later on.
This bug is not really a security issue in real life, because debugfs is
generally root only.
Fixes: 36ea1bd2d084 ("drm/amdgpu: add debugfs ctrl node")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
They are capable of using intertouch and it works well with
psmouse.synaptics_intertouch=1, so add them to the list.
Without it, scrolling and gestures are jumpy, three-finger pinch gesture
doesn't work and three- or four-finger swipes sometimes get stuck.
Signed-off-by: Alexander Mikhaylenko <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
Maxime Chevallier says:
====================
net: mvpp2: prs: Fixes for VID filtering
This series fixes some issues with VID filtering offload, mainly due to
the wrong ranges being used in the TCAM header parser.
The first patch fixes a bug where removing a VLAN from a port's
whitelist would also remove it from other port's, if they are on the
same PPv2 instance.
The second patch makes so that we don't invalidate the wrong TCAM
entries when clearing the whole whitelist.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be
called with the TCAM id incorrectly used as a VID, causing the wrong
TCAM entries to be invalidated.
Fix this by directly invalidating entries in the VID range.
Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
VID filtering is implemented in the Header Parser, with one range of 11
vids being assigned for each no-loopback port.
Make sure we use the per-port range when looking for existing entries in
the Parser.
Since we used a global range instead of a per-port one, this causes VIDs
to be removed from the whitelist from all ports of the same PPv2
instance.
Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Ido Schimmel says:
====================
mlxsw: Various fixes
This patchset contains various fixes for mlxsw.
Patch #1 fixes an hash polarization problem when a nexthop device is a
LAG device. This is caused by the fact that the same seed is used for
the LAG and ECMP hash functions.
Patch #2 fixes an issue in which the driver fails to refresh a nexthop
neighbour after it becomes dead. This prevents the nexthop from ever
being written to the adjacency table and used to forward traffic. Patch
Patch #4 fixes a wrong extraction of TOS value in flower offload code.
Patch #5 is a test case.
Patch #6 works around a buffer issue in Spectrum-2 by reducing the
default sizes of the shared buffer pools.
Patch #7 prevents prio-tagged packets from entering the switch when PVID
is removed from the bridge port.
Please consider patches #2, #4 and #6 for 5.1.y
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When PVID is removed from a bridge port, the Linux bridge drops both
untagged and prio-tagged packets. Align mlxsw with this behavior.
Fixes: 148f472da5db ("mlxsw: reg: Add the Switch Port Acceptable Frame Types register")
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out
of 32 port buffers cannot be used. To work around this, the next FW release
will mark them as unused, and will report correspondingly lower total
shared buffer size. mlxsw will pick up the new value through a query to
cap_total_buffer_size resource. However the initial size for shared buffer
pool 0 is hard-coded and therefore needs to be updated.
Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the
total size of 42 MiB), and round down to the whole number of cells.
Fixes: fe099bf682ab ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration")
Signed-off-by: Petr Machata <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The TOS value was not extracted correctly. Fix it.
Fixes: 87996f91f739 ("mlxsw: spectrum_flower: Add support for ip tos")
Reported-by: Alexander Petrovskiy <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Test that IPv4 and IPv6 nexthops are correctly marked with offload
indication in response to neighbour events.
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The driver tries to periodically refresh neighbours that are used to
reach nexthops. This is done by periodically calling neigh_event_send().
However, if the neighbour becomes dead, there is nothing we can do to
return it to a connected state and the above function call is basically
a NOP.
This results in the nexthop never being written to the device's
adjacency table and therefore never used to forward packets.
Fix this by dropping our reference from the dead neighbour and
associating the nexthop with a new neigbhour which we will try to
refresh.
Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Alex Veber <[email protected]>
Tested-by: Alex Veber <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The same hash function and seed are used for both ECMP and LAG hash.
Therefore, when a LAG device is used as a nexthop device as part of an
ECMP group, hash polarization can occur and all the traffic will be
hashed to a single LAG slave.
Fix this by using a different seed for the LAG hash.
Fixes: fa73989f2697 ("mlxsw: spectrum: Use a stable ECMP/LAG seed")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Alex Veber <[email protected]>
Tested-by: Alex Veber <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
tls_sw_do_sendpage needs to return the total number of bytes sent
regardless of how many sk_msgs are allocated. Unfortunately, copied
(the value we return up the stack) is zero'd before each new sk_msg
is allocated so we only return the copied size of the last sk_msg used.
The caller (splice, etc.) of sendpage will then believe only part
of its data was sent and send the missing chunks again. However,
because the data actually was sent the receiver will get multiple
copies of the same data.
To reproduce this do multiple sendfile calls with a length close to
the max record size. This will in turn call splice/sendpage, sendpage
may use multiple sk_msg in this case and then returns the incorrect
number of bytes. This will cause splice to resend creating duplicate
data on the receiver. Andre created a C program that can easily
generate this case so we will push a similar selftest for this to
bpf-next shortly.
The fix is to _not_ zero the copied field so that the total sent
bytes is returned.
Reported-by: Steinar H. Gunderson <[email protected]>
Reported-by: Andre Tomt <[email protected]>
Tested-by: Andre Tomt <[email protected]>
Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Get the ingress interface and increment ICMP counters based on that
instead of skb->dev when the the dev is a VRF device.
This is a follow up on the following message:
https://www.spinics.net/lists/netdev/msg560268.html
v2: Avoid changing skb->dev since it has unintended effect for local
delivery (David Ahern).
Signed-off-by: Stephen Suryaputra <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the case that a process is constrained by taskset(1) (i.e.
sched_setaffinity(2)) to a subset of available cpus, and all of those are
subsequently offlined, the scheduler will set tsk->cpus_allowed to
the current value of task_cs(tsk)->effective_cpus.
This is done via a call to do_set_cpus_allowed() in the context of
cpuset_cpus_allowed_fallback() made by the scheduler when this case is
detected. This is the only call made to cpuset_cpus_allowed_fallback()
in the latest mainline kernel.
However, this is not sane behavior.
I will demonstrate this on a system running the latest upstream kernel
with the following initial configuration:
# grep -i cpu /proc/$$/status
Cpus_allowed: ffffffff,fffffff
Cpus_allowed_list: 0-63
(Where cpus 32-63 are provided via smt.)
If we limit our current shell process to cpu2 only and then offline it
and reonline it:
# taskset -p 4 $$
pid 2272's current affinity mask: ffffffffffffffff
pid 2272's new affinity mask: 4
# echo off > /sys/devices/system/cpu/cpu2/online
# dmesg | tail -3
[ 2195.866089] process 2272 (bash) no longer affine to cpu2
[ 2195.872700] IRQ 114: no longer affine to CPU2
[ 2195.879128] smpboot: CPU 2 is now offline
# echo on > /sys/devices/system/cpu/cpu2/online
# dmesg | tail -1
[ 2617.043572] smpboot: Booting Node 0 Processor 2 APIC 0x4
We see that our current process now has an affinity mask containing
every cpu available on the system _except_ the one we originally
constrained it to:
# grep -i cpu /proc/$$/status
Cpus_allowed: ffffffff,fffffffb
Cpus_allowed_list: 0-1,3-63
This is not sane behavior, as the scheduler can now not only place the
process on previously forbidden cpus, it can't even schedule it on
the cpu it was originally constrained to!
Other cases result in even more exotic affinity masks. Take for instance
a process with an affinity mask containing only cpus provided by smt at
the moment that smt is toggled, in a configuration such as the following:
# taskset -p f000000000 $$
# grep -i cpu /proc/$$/status
Cpus_allowed: 000000f0,00000000
Cpus_allowed_list: 36-39
A double toggle of smt results in the following behavior:
# echo off > /sys/devices/system/cpu/smt/control
# echo on > /sys/devices/system/cpu/smt/control
# grep -i cpus /proc/$$/status
Cpus_allowed: ffffff00,ffffffff
Cpus_allowed_list: 0-31,40-63
This is even less sane than the previous case, as the new affinity mask
excludes all smt-provided cpus with ids less than those that were
previously in the affinity mask, as well as those that were actually in
the mask.
With this patch applied, both of these cases end in the following state:
# grep -i cpu /proc/$$/status
Cpus_allowed: ffffffff,ffffffff
Cpus_allowed_list: 0-63
The original policy is discarded. Though not ideal, it is the simplest way
to restore sanity to this fallback case without reinventing the cpuset
wheel that rolls down the kernel just fine in cgroup v2. A user who wishes
for the previous affinity mask to be restored in this fallback case can use
that mechanism instead.
This patch modifies scheduler behavior by instead resetting the mask to
task_cs(tsk)->cpus_allowed by default, and cpu_possible mask in legacy
mode. I tested the cases above on both modes.
Note that the scheduler uses this fallback mechanism if and only if
_every_ other valid avenue has been traveled, and it is the last resort
before calling BUG().
Suggested-by: Waiman Long <[email protected]>
Suggested-by: Phil Auld <[email protected]>
Signed-off-by: Joel Savitz <[email protected]>
Acked-by: Phil Auld <[email protected]>
Acked-by: Waiman Long <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
Using ethtool, users can specify a classification action matching on the
full vlan tag, which includes the DEI bit (also previously called CFI).
However, when converting the ethool_flow_spec to a flow_rule, we use
dissector keys to represent the matching patterns.
Since the vlan dissector key doesn't include the DEI bit, this
information was silently discarded when translating the ethtool
flow spec in to a flow_rule.
This commit adds the DEI bit into the vlan dissector key, and allows
propagating the information to the driver when parsing the ethtool flow
spec.
Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
Reported-by: Michał Mirosław <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch fixes a spelling typo in rds.txt
Signed-off-by: Masanari Iida <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
err must be nonzero in order to reach text_poke(), which caused kgdb to
fail to set breakpoints:
(gdb) break __x64_sys_sync
Breakpoint 1 at 0xffffffff81288910: file ../fs/sync.c, line 124.
(gdb) c
Continuing.
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0xffffffff81288910
Command aborted.
Fixes: 86a22057127d ("x86/kgdb: Avoid redundant comparison of patched code")
Signed-off-by: Matt Mullins <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Nadav Amit <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Daniel Thompson <[email protected]>
Cc: Douglas Anderson <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Rick Edgecombe <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Randy reported that selecting MPLS_ROUTING without PROC_FS breaks
the build, because since commit c1a9d65954c6 ("mpls: fix af_mpls
dependencies"), MPLS_ROUTING selects PROC_SYSCTL, but Kconfig's select
doesn't recursively handle dependencies.
Change the select into a dependency.
Fixes: c1a9d65954c6 ("mpls: fix af_mpls dependencies")
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Matteo Croce <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In selinux_sb_eat_lsm_opts(), 'arg' is allocated by kmemdup_nul(). It
returns NULL when fails. So 'arg' should be checked. And 'mnt_opts'
should be freed when error.
Signed-off-by: Gen Zhang <[email protected]>
Fixes: 99dbbb593fe6 ("selinux: rewrite selinux_sb_eat_lsm_opts()")
Cc: <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes
CK writes:
This include unbind error fix, clock control flow refinement, and PRIME
mmap with page offset.
Signed-off-by: Daniel Vetter <[email protected]>
From: CK Hu <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/1560325868.3259.6.camel@mtksdaap41
|