|
Xen requires 64-bit machines today, and since Xen 4.14 it can be
built without 32-bit PV guest support. There is no need to carry the
burden of 32-bit PV guest support in the kernel any longer, as new
guests can be either HVM or PVH, or they can use a 64-bit kernel.
Remove the 32-bit Xen PV support from the kernel.
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-fixes-5.9-2020-08-07:
amdgpu:
- Re-add spelling typo fix
- Sienna Cichlid fixes
- Navy Flounder fixes
- DC fixes
- SMU i2c fix
- Power fixes
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
"New features:
- Allow controlling how 'perf stat' and 'perf record' work via a
control file descriptor, allowing starting with events configured
but disabled until commands are received via the control file
descriptor. This allows tools such as Intel VTune, for instance,
to make further use of perf as their Linux platform driver.
- Improve 'perf record' to register in a perf.data file header the
clockid used to help later correlate things like syslog files and
perf events recorded.
- Add basic syscall and find_next_bit benchmarks to 'perf bench'.
- Allow using computed metrics in calculating other metrics. For
instance:
{
.metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit",
.metric_name = "DCache_L2_All_Hits",
},
{
.metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss",
.metric_name = "DCache_L2_All_Miss",
},
{
.metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss",
.metric_name = "DCache_L2_All",
}
- Add support for 'd_ratio', '>' and '<' operators to the expression
resolver used in calculating metrics in 'perf stat'.
Support for new kernel features:
- Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope
with things like ftrace and trampolines, i.e. changes in the kernel
text that get in the way of properly decoding Intel PT hardware
traces, for instance.
Intel PT:
- Add various knobs to reduce the volume of Intel PT traces by
reducing the level of details such as decoding just some types of
packets (e.g., FUP/TIP, PSB+), also filtering by time range.
- Add new itrace options (log flags to the 'd' option, error flags to
the 'e' one, etc), controlling how Intel PT is transformed into
perf events, document some missing options (e.g., how to synthesize
callchains).
BPF:
- Properly report BPF errors when parsing events.
- Do not setup side-band events if LIBBPF is not linked, fixing a
segfault.
Libraries:
- Improvements to the libtraceevent plugin mechanism.
- Improve libtraceevent support for KVM trace events SVM exit reasons.
- Add libtraceevent plugins for decoding syscalls/sys_enter_futex
and for tlb_flush.
- Ensure sample_period is set for libpfm4 events in 'perf test'.
- Fixup libperf namespacing, to make sure what is in libperf has the
perf_ namespace while what is now only in tools/perf/ doesn't use
that prefix.
Arch specific:
- Improve the testing of vendor events and metrics in 'perf test'.
- Allow no ARM CoreSight hardware tracer sink to be specified on
command line.
- Fix arm_spe_x recording when mixed with other perf events.
- Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of
idle symbols.
- List kernel supplied event aliases for arm64 in 'perf list'.
- Add support for extended register capability in PowerPC 9 and 10.
- Added nest IMC power9 metric events.
Miscellaneous:
- No need to setup sample_regs_intr/sample_regs_user for dummy
events.
- Update various copies of kernel headers, some causing perf to
handle new syscalls, MSRs, etc.
- Improve usage of flex and yacc, enabling warnings and addressing
the fallout.
- Add missing '--output' option to 'perf kmem' so that it can pass it
along to 'perf record'.
- 'perf probe' fixes related to adding multiple probes on the same
address for the same event.
- Make 'perf probe' warn if the target function is a GNU indirect
function.
- Remove //anon mmap events from 'perf inject jit' to fix supporting
both using ELF files for generated functions and the perf-PID.map
approaches"
* tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (144 commits)
perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
perf tools powerpc: Add support for extended regs in power10
perf tools powerpc: Add support for extended register capability
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools arch x86: Sync asm/cpufeatures.h with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers UAPI: update linux/in.h copy
tools headers API: Update close_range affected files
perf script: Add 'tod' field to display time of day
perf script: Change the 'enum perf_output_field' enumerators to be 64 bits
perf data: Add support to store time of day in CTF data conversion
perf tools: Move clockid_res_ns under clock struct
perf header: Store clock references for -k/--clockid option
perf tools: Add clockid_name function
perf clockid: Move parse_clockid() to new clockid object
tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API
libtraceevent: Fixed description of tep_add_plugin_path() API
libtraceevent: Fixed type in PRINT_FMT_STING
libtraceevent: Fixed broken indentation in parse_ip4_print_args()
libtraceevent: Improve error handling of tep_plugin_add_option() API
...
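The 'd_ratio' operator named in the perf tools pull above can be read as a division that tolerates a zero denominator. A minimal sketch, assuming the result is defined as 0 when the divisor is 0; `toy_d_ratio` and `toy_l2_hit_ratio` are hypothetical names for illustration, not perf's own code, and the hit-ratio metric is an invented example of one metric computed from two others:

```c
#include <assert.h>

/* Hypothetical helper: divide, but yield 0 instead of dividing by zero. */
static double toy_d_ratio(double num, double den)
{
	if (den == 0.0)
		return 0.0;
	return num / den;
}

/*
 * Hypothetical derived metric: hits / (hits + misses), built from two
 * already-computed metric values. Safe when nothing was counted.
 */
static double toy_l2_hit_ratio(double all_hits, double all_miss)
{
	return toy_d_ratio(all_hits, all_hits + all_miss);
}
```

The guard matters for derived metrics because an idle counter would otherwise turn the whole expression into a division by zero.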
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
- Have config-bisect save the good/bad configs at each step.
- Show log file location even on success
- Add PRE_TEST_DIE to kill test if the PRE_TEST fails
- Add a NOT operator for conditionals in config file
- Add the log output of the last test when emailing on failure.
- Other minor clean ups and small fixes.
* tag 'ktest-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest.pl: Fix spelling mistake "Cant" -> "Can't"
ktest.pl: Change the logic to control the size of the log file emailed
ktest.pl: Add MAIL_MAX_SIZE to limit the amount of log emailed
ktest.pl: Add the log of last test in email on failure
ktest.pl: Turn off buffering to the log file
ktest.pl: Just open up the log file once
ktest.pl: Add a NOT operator
ktest.pl: Define PRE_TEST_DIE to kill the test if the PRE_TEST fails
ktest.pl: Always show log file location if defined even on success
ktest.pl: Have config-bisect save each config used in the bisect
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Thomas Gleixner:
"A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in
various situations caused by the lockdep additions to seqcount to
validate that the write side critical sections are non-preemptible.
- The seqcount associated lock debug addons which were blocked by the
above fallout.
seqcount writers contrary to seqlock writers must be externally
serialized, which usually happens via locking - except for strict
per CPU seqcounts. As the lock is not part of the seqcount, lockdep
cannot validate that the lock is held.
This new debug mechanism adds the concept of associated locks.
Sequence counts now have lock type variants and corresponding
initializers which take a pointer to the associated lock used for
writer serialization. If lockdep is enabled the pointer is stored
and write_seqcount_begin() has a lockdep assertion to validate that
the lock is held.
Aside from the type and the initializer no other code changes are
required at the seqcount usage sites. The rest of the seqcount API
is unchanged and determines the type at compile time with the help
of _Generic which is possible now that the minimal GCC version has
been moved up.
Adding this lockdep coverage unearthed a handful of seqcount bugs
which have been addressed already independent of this.
While generally useful this comes with a Trojan Horse twist: On RT
kernels the write side critical section can become preemptible if
the writers are serialized by an associated lock, which leads to
the well known reader preempts writer livelock. RT prevents this by
storing the associated lock pointer independent of lockdep in the
seqcount and changing the reader side to block on the lock when a
reader detects that a writer is in the write side critical section.
- Conversion of seqcount usage sites to associated types and
initializers"
* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
locking/seqlock, headers: Untangle the spaghetti monster
locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
x86/headers: Remove APIC headers from <asm/smp.h>
seqcount: More consistent seqprop names
seqcount: Compress SEQCNT_LOCKNAME_ZERO()
seqlock: Fold seqcount_LOCKNAME_init() definition
seqlock: Fold seqcount_LOCKNAME_t definition
seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
hrtimer: Use sequence counter with associated raw spinlock
kvm/eventfd: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
vfs: Use sequence counter with associated spinlock
timekeeping: Use sequence counter with associated raw spinlock
xfrm: policy: Use sequence counters with associated lock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
netfilter: conntrack: Use sequence counter with associated spinlock
sched: tasks: Use sequence counter with associated spinlock
...
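The associated-lock idea described in the locking pull above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: every `toy_` name is hypothetical, the lock is a trivial spin flag, and the "lockdep assertion" is modeled with a plain `assert()` that the associated lock is held when a write section begins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock. */
struct toy_lock {
	atomic_bool held;
};

/* Hypothetical sequence counter that remembers its writer lock. */
struct toy_seqcount {
	atomic_uint sequence;
	struct toy_lock *lock;	/* lock that must serialize writers */
};

static void toy_lock_init(struct toy_lock *l)
{
	atomic_init(&l->held, false);
}

static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_exchange(&l->held, true))
		;	/* spin until the previous holder releases */
}

static void toy_lock_release(struct toy_lock *l)
{
	atomic_store(&l->held, false);
}

static void toy_seqcount_init(struct toy_seqcount *s, struct toy_lock *l)
{
	atomic_init(&s->sequence, 0);
	s->lock = l;
}

static void toy_write_begin(struct toy_seqcount *s)
{
	/* Models the debug check: the associated lock must be held. */
	assert(atomic_load(&s->lock->held));
	atomic_fetch_add(&s->sequence, 1);	/* odd: write in progress */
}

static void toy_write_end(struct toy_seqcount *s)
{
	atomic_fetch_add(&s->sequence, 1);	/* even: write finished */
}

static unsigned int toy_read_begin(struct toy_seqcount *s)
{
	unsigned int seq;

	while ((seq = atomic_load(&s->sequence)) & 1)
		;	/* a writer is active, wait for an even count */
	return seq;
}

static bool toy_read_retry(struct toy_seqcount *s, unsigned int start)
{
	return atomic_load(&s->sequence) != start;
}
```

Only the initializer takes the lock pointer; readers and writers call the same functions as before, which mirrors the claim that usage sites need no other changes.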
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
* backmerge from drm-fixes at v5.8-rc7
* add orientation quirk for ASUS T103HAF
* drm/omap: force runtime PM suspend on system suspend
* drm/tidss: fix modeset init for DPI panels
* re-added docs for drm_gem_flink_ioctl()
* ttm: fix page-offset calculation within TTM
Signed-off-by: Dave Airlie <[email protected]>
From: Thomas Zimmermann <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804125510.GA29670@linux-uq9g
|
|
I need to backmerge 5.8 as I've got a bunch of fixes sitting
on an rc7 base that I want to land.
Signed-off-by: Dave Airlie <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added two small interfaces: (a) GC_URGENT_LOW
mode for performance and (b) F2FS_IOC_SEC_TRIM_FILE ioctl for
security.
The new GC mode allows Android to run some lower priority GCs in
background, while new ioctl discards user information without race
condition when the account is removed.
In addition, some patches were merged to address latency-related
issues. We've fixed some compression-related bugs as well as edge
race conditions.
Enhancements:
- add GC_URGENT_LOW mode in gc_urgent
- introduce F2FS_IOC_SEC_TRIM_FILE ioctl
- bypass racy readahead to improve read latencies
- shrink node_write lock coverage to avoid long latency
Bug fixes:
- fix missing compression flag control, i_size, and mount option
- fix deadlock between quota writes and checkpoint
- remove inode eviction path in synchronous path to avoid deadlock
- fix to wait GCed compressed page writeback
- fix a kernel panic in f2fs_is_compressed_page
- check page dirty status before writeback
- wait page writeback before update in node page write flow
- fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry
We've added some minor sanity checks and refactored trivial code
blocks for better readability and debugging information"
* tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits)
f2fs: prepare a waiter before entering io_schedule
f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive
f2fs: replace test_and_set/clear_bit() with set/clear_bit()
f2fs: make file immutable even if releasing zero compression block
f2fs: compress: disable compression mount option if compression is off
f2fs: compress: add sanity check during compressed cluster read
f2fs: use macro instead of f2fs verity version
f2fs: fix deadlock between quota writes and checkpoint
f2fs: correct comment of f2fs_exist_written_data
f2fs: compress: delay temp page allocation
f2fs: compress: fix to update isize when overwriting compressed file
f2fs: space related cleanup
f2fs: fix use-after-free issue
f2fs: Change the type of f2fs_flush_inline_data() to void
f2fs: add F2FS_IOC_SEC_TRIM_FILE ioctl
f2fs: should avoid inode eviction in synchronous path
f2fs: segment.h: delete a duplicated word
f2fs: compress: fix to avoid memory leak on cc->cpages
f2fs: use generic names for generic ioctls
f2fs: don't keep meta inode pages used for compressed block migration
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Make sure transactions won't be started recursively in
gfs2_block_zero_range (bug introduced in 5.4 when switching to
iomap_zero_range)
- Fix a glock holder refcount leak introduced in the iopen glock
locking scheme rework merged in 5.8.
- A few other small improvements (debugging, stack usage, comment
fixes).
* tag 'gfs2-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: When gfs2_dirty_inode gets a glock error, dump the glock
gfs2: Never call gfs2_block_zero_range with an open transaction
gfs2: print details on transactions that aren't properly ended
gfs2: Fix inaccurate comment
fs: Fix typo in comment
gfs2: Fix refcount leak in gfs2_glock_poke
gfs2: Pass glock holder to gfs2_file_direct_{read,write}
gfs2: Add some flags missing from glock output
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull JFFS2, UBI and UBIFS updates from Richard Weinberger:
"JFFS2:
- Fix for a corner case while mounting
- Fix for a use-after-free issue
UBI:
- Fix for a memory load while attaching
- Don't produce an anchor PEB with fastmap being disabled
UBIFS:
- Fix for orphan inode logic
- Spelling fixes
- New mount option to specify filesystem version"
* tag 'for-linus-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
jffs2: fix UAF problem
jffs2: fix jffs2 mounting failure
ubifs: Fix wrong orphan node deletion in ubifs_jnl_update|rename
ubi: fastmap: Free fastmap next anchor peb during detach
ubi: fastmap: Don't produce the initial next anchor PEB when fastmap is disabled
ubifs: misc.h: delete a duplicated word
ubifs: add option to specify version for new file systems
|
|
There is a spelling mistake in a DRM_ERROR message. Fix it.
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
|
|
There is a spelling mistake in a DRM_ERROR message. Fix it.
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
|
|
Same problem as in stdu, same fix.
Fixes: 51f644b40b4b ("drm/atomic-helper: reset vblank on crtc reset")
Acked-by: Charmaine Lee <[email protected]>
Reviewed-by: Zack Rusin <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
|
|
Same problem as in stdu, same fix.
Fixes: 51f644b40b4b ("drm/atomic-helper: reset vblank on crtc reset")
Acked-by: Charmaine Lee <[email protected]>
Reviewed-by: Zack Rusin <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
|
|
When converting to atomic the state reset was done by directly calling
the functions, and before the modeset object was fully initialized.
This means the various ->dev pointers weren't set up.
After
commit 51f644b40b4b794b28b982fdd5d0dd8ee63f9272
Author: Daniel Vetter <[email protected]>
Date: Fri Jun 12 18:00:49 2020 +0200
drm/atomic-helper: reset vblank on crtc reset
this started to oops because now we're trying to dereference
drm_crtc->dev. Fix this up by entirely switching over to
drm_mode_config_reset, called once everything is set up.
Fixes: 51f644b40b4b ("drm/atomic-helper: reset vblank on crtc reset")
Reported-by: Tetsuo Handa <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Tested-by: Roland Scheidegger <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
|
|
These if statements are supposed to be true if we ended the
list_for_each_entry() loops without hitting a break statement but they
don't work.
In the first loop, we increment "i" after the "if (i == unit)" condition
so we don't necessarily know that "i" is not equal to unit at the end of
the loop.
In the second loop we exit when mode is not pointing to a valid
drm_display_mode struct so it doesn't make sense to check "mode->type".
Fixes: a278724aa23c ("drm/vmwgfx: Implement fbdev on kms v2")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
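The loop-termination problem described above can be sketched in userspace. The list type and all `toy_` names below are hypothetical, not the vmwgfx code; the point shown is that after a full pass the cursor refers to the list head rather than to a real entry, so a separate "found" pointer is the reliable way to tell whether the loop hit its break.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular list with a sentinel head. */
struct toy_list {
	struct toy_list *next;
};

struct toy_unit {
	int id;
	struct toy_list node;
};

/* Map a list node back to the structure that embeds it. */
#define toy_entry(ptr) \
	((struct toy_unit *)((char *)(ptr) - offsetof(struct toy_unit, node)))

/* Return the unit at position index, or NULL if the list is shorter. */
static struct toy_unit *toy_find_unit(struct toy_list *head, int index)
{
	struct toy_unit *found = NULL;
	struct toy_list *pos;
	int i = 0;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (i == index) {
			found = toy_entry(pos);
			break;
		}
		i++;
	}

	/*
	 * If the loop ran to completion, pos == head and toy_entry(pos)
	 * would not be a valid struct toy_unit. Only found is trustworthy.
	 */
	return found;
}
```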
|
|
The "entry" pointer is an offset from the list head and it doesn't
point to a valid vmw_legacy_display_unit struct. Presumably the
intent was to point to the last entry.
Also the "i++" wasn't used so I have removed that as well.
Fixes: d7e1958dbe4a ("drm/vmwgfx: Support older hardware.")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
|
|
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
This code was detected with the help of Coccinelle, and audited and
fixed manually.
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
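To illustrate why a size helper is preferable to open-coded arithmetic, here is a minimal userspace sketch. `toy_struct_size` is a hypothetical function, not the kernel's struct_size() macro; it assumes the desired behavior is to saturate at SIZE_MAX on overflow so that a subsequent allocation fails instead of being undersized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a trailing flexible array. */
struct toy_msg {
	uint32_t count;
	uint32_t items[];	/* flexible array member */
};

/*
 * Hypothetical checked size computation: header + elem * count,
 * saturating at SIZE_MAX if the result would overflow.
 */
static size_t toy_struct_size(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}

/* Size in bytes of a struct toy_msg carrying count items. */
static size_t toy_msg_size(size_t count)
{
	return toy_struct_size(sizeof(struct toy_msg), sizeof(uint32_t), count);
}
```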
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next-fixes for v5.9-rc1:
- Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
- Fix a fbcon OOB read in fbdev, found by syzbot.
- Mark vga_tryget static as it's not used elsewhere.
- Small fixes to xlnx.
- Remove null check for kfree in drm_dev_release.
- Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
- Fix mode initialization in omap_connector_mode_valid().
Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- an update to Elan touchpad controller driver supporting newer ICs
with enhanced precision reports and a new firmware update process
- an update to EXC3000 touch controller supporting additional parts
- assorted driver fixups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (27 commits)
Input: exc3000 - add support to query model and fw_version
Input: exc3000 - add reset gpio support
Input: exc3000 - add EXC80H60 and EXC80H84 support
dt-bindings: touchscreen: Convert EETI EXC3000 touchscreen to json-schema
Input: sentelic - fix error return when fsp_reg_write fails
Input: alps - remove redundant assignment to variable ret
Input: ims-pcu - return error code rather than -ENOMEM
Input: elan_i2c - add ic type 0x15
Input: atmel_mxt_ts - only read messages in mxt_acquire_irq() when necessary
Input: uinput - fix typo in function name documentation
Input: ati_remote2 - add missing newlines when printing module parameters
Input: psmouse - add a newline when printing 'proto' by sysfs
Input: synaptics-rmi4 - drop a duplicated word
Input: elan_i2c - add support for high resolution reports
Input: elan_i2c - do not constantly re-query pattern ID
Input: elan_i2c - add firmware update info for ICs 0x11, 0x13, 0x14
Input: elan_i2c - handle firmware updated on newer ICs
Input: elan_i2c - add support for different firmware page sizes
Input: elan_i2c - fix detecting IAP version on older controllers
Input: elan_i2c - handle devices with patterns above 1
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- fix for some modern devices that return multi-byte battery report,
from Grant Likely
- fix for devices with Resolution Multiplier, from Peter Hutterer
- device probing speed increase, from Dmitry Torokhov
- ThinkPad 10 Ultrabook Keyboard support, from Hans de Goede
- other small assorted fixes and device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: quirks: add NOGET quirk for Logitech GROUP
HID: Replace HTTP links with HTTPS ones
HID: udraw-ps3: Replace HTTP links with HTTPS ones
HID: mcp2221: Replace HTTP links with HTTPS ones
HID: input: Fix devices that return multiple bytes in battery report
HID: lenovo: Fix spurious F23 key press report during resume from suspend
HID: lenovo: Add ThinkPad 10 Ultrabook Keyboard fn_lock support
HID: lenovo: Add ThinkPad 10 Ultrabook Keyboard support
HID: lenovo: Rename fn_lock sysfs attr handlers to make them generic
HID: lenovo: Factor out generic parts of the LED code
HID: lenovo: Merge tpkbd and cptkbd data structures
HID: intel-ish-hid: Replace PCI_DEV_FLAGS_NO_D3 with pci_save_state
HID: Wiimote: Treat the d-pad as an analogue stick
HID: input: do not run GET_REPORT unless there's a Resolution Multiplier
HID: usbhid: remove redundant assignment to variable retval
HID: usbhid: do not sleep when opening device
|
|
If we're in the error path failing links and we have a link that has
grabbed a reference to the fs_struct, then we cannot safely drop our
reference to the table if we already hold the completion lock. This
adds a hardirq dependency to the fs_struct->lock, which it currently
doesn't have.
Defer the final cleanup and free of such requests to avoid adding this
dependency.
Reported-by: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
When we traverse into failing links or timeouts, we need to ensure we
propagate the REQ_F_COMP_LOCKED flag to ensure that we correctly signal
to the completion side that we already hold the completion lock.
Signed-off-by: Jens Axboe <[email protected]>
|
|
syzbot reports a scenario where we recurse on the completion lock
when flushing an overflow:
1 lock held by syz-executor287/6816:
#0: ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: io_cqring_overflow_flush+0xc6/0xab0 fs/io_uring.c:1333
stack backtrace:
CPU: 1 PID: 6816 Comm: syz-executor287 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1f0/0x31e lib/dump_stack.c:118
print_deadlock_bug kernel/locking/lockdep.c:2391 [inline]
check_deadlock kernel/locking/lockdep.c:2432 [inline]
validate_chain+0x69a4/0x88a0 kernel/locking/lockdep.c:3202
__lock_acquire+0x1161/0x2ab0 kernel/locking/lockdep.c:4426
lock_acquire+0x160/0x730 kernel/locking/lockdep.c:5005
__raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
_raw_spin_lock_irq+0x67/0x80 kernel/locking/spinlock.c:167
spin_lock_irq include/linux/spinlock.h:379 [inline]
io_queue_linked_timeout fs/io_uring.c:5928 [inline]
__io_queue_async_work fs/io_uring.c:1192 [inline]
__io_queue_deferred+0x36a/0x790 fs/io_uring.c:1237
io_cqring_overflow_flush+0x774/0xab0 fs/io_uring.c:1359
io_ring_ctx_wait_and_kill+0x2a1/0x570 fs/io_uring.c:7808
io_uring_release+0x59/0x70 fs/io_uring.c:7829
__fput+0x34f/0x7b0 fs/file_table.c:281
task_work_run+0x137/0x1c0 kernel/task_work.c:135
exit_task_work include/linux/task_work.h:25 [inline]
do_exit+0x5f3/0x1f20 kernel/exit.c:806
do_group_exit+0x161/0x2d0 kernel/exit.c:903
__do_sys_exit_group+0x13/0x20 kernel/exit.c:914
__se_sys_exit_group+0x10/0x10 kernel/exit.c:912
__x64_sys_exit_group+0x37/0x40 kernel/exit.c:912
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by passing back the link from __io_queue_async_work(), and
then let the caller handle the queueing of the link. Take care to also
punt the submission reference put to the caller, as we're holding the
completion lock for the __io_queue_deferred() case. Hence we need to mark
the io_kiocb appropriately for that case.
Reported-by: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
An earlier commit:
b7db41c9e03b ("io_uring: fix regression with always ignoring signals in io_cqring_wait()")
ensured that we didn't get stuck waiting for eventfd reads when it's
registered with the io_uring ring for event notification, but we still
have cases where the task can be waiting on other events in the kernel and
need a bigger nudge to make forward progress. Or the task could be in the
kernel and running, but on its way to blocking.
This means that TWA_RESUME cannot reliably be used to ensure we make
progress. Use TWA_SIGNAL unconditionally.
Cc: [email protected] # v5.7+
Reported-by: Josef <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Drop repeated words in kernel/time/. {when, one, into}
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: John Stultz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
When ur_load_imm_any() is inlined into jeq_imm(), it's possible for the
compiler to deduce a case where _val can only have the value of -1 at
compile time. Specifically,
/* struct bpf_insn: __s32 imm */
u64 imm = insn->imm; /* sign extend */
if (imm >> 32) { /* non-zero only if insn->imm is negative */
/* inlined from ur_load_imm_any */
u32 __imm = imm >> 32; /* therefore, always 0xffffffff */
if (__builtin_constant_p(__imm) && __imm > 255)
compiletime_assert_XXX()
This can result in tripping a BUILD_BUG_ON() in __BF_FIELD_CHECK() that
checks that a given value is representable in one byte (interpreted as
unsigned).
FIELD_FIT() should return true or false at runtime for whether a value
can fit or not. Don't break the build over a value that's too large for
the mask. We'd prefer to keep the inlining and compiler optimizations
though we know this case will always return false.
Cc: [email protected]
Fixes: 1697599ee301a ("bitfield.h: add FIELD_FIT() helper")
Link: https://lore.kernel.org/kernel-hardening/CAK7LNASvb0UDJ0U5wkYYRzTAdnEs64HjXpEUL7d=V0CXiAXcNw@mail.gmail.com/
Reported-by: Masahiro Yamada <[email protected]>
Debugged-by: Sami Tolvanen <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
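The runtime behavior asked of FIELD_FIT() above can be sketched as a plain function. This is a hypothetical userspace model (`toy_field_fit` and `toy_mask_shift` are invented names), assuming a non-zero contiguous mask: shift the value to the field position and report whether any bit lands outside the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of zero bits below the lowest set bit of mask. */
static unsigned int toy_mask_shift(uint64_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/*
 * Runtime check only: true if val fits in the field described by mask,
 * false otherwise. Never a build-time failure.
 */
static bool toy_field_fit(uint64_t mask, uint64_t val)
{
	return ((val << toy_mask_shift(mask)) & ~mask) == 0;
}
```

With a one-byte mask, the 0xffffffff value from the example above simply yields false.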
|
|
When TFO keys are read back on big endian systems either via the global
sysctl interface or via getsockopt() using TCP_FASTOPEN_KEY, the values
don't match what was written.
For example, on s390x:
# echo "1-2-3-4" > /proc/sys/net/ipv4/tcp_fastopen_key
# cat /proc/sys/net/ipv4/tcp_fastopen_key
02000000-01000000-04000000-03000000
Instead of:
# cat /proc/sys/net/ipv4/tcp_fastopen_key
00000001-00000002-00000003-00000004
Fix this by converting to the correct endianness on read. This was
reported by Colin Ian King when running the 'tcp_fastopen_backup_key' net
selftest on s390x, which depends on the read value matching what was
written. I've confirmed that the test now passes on big and little endian
systems.
Signed-off-by: Jason Baron <[email protected]>
Fixes: 438ac88009bc ("net: fastopen: robustness and endianness fixes for SipHash")
Cc: Ard Biesheuvel <[email protected]>
Cc: Eric Dumazet <[email protected]>
Reported-and-tested-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
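The endianness conversion on read can be illustrated with a small sketch. It assumes, for illustration only, that the key is held as four little-endian 32-bit words in a byte array; `toy_le32_to_cpu` and `toy_format_key` are hypothetical names, not the kernel's helpers. Assembling each word from its bytes gives the same text on big- and little-endian hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build a host-order value from four little-endian bytes. */
static uint32_t toy_le32_to_cpu(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Format a 16-byte key as four dash-separated hex words. */
static void toy_format_key(const unsigned char key[16], char *buf, size_t len)
{
	snprintf(buf, len, "%08x-%08x-%08x-%08x",
		 (unsigned int)toy_le32_to_cpu(key),
		 (unsigned int)toy_le32_to_cpu(key + 4),
		 (unsigned int)toy_le32_to_cpu(key + 8),
		 (unsigned int)toy_le32_to_cpu(key + 12));
}
```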
|
|
I'm not doing much work on the NFP driver any more.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload
support") added support for encapsulation offload. However, while
calculating tcp hdr length, it does not take into account if the
packet is encapsulated or not.
This patch fixes this issue by using correct reference for inner
tcp header.
Fixes: dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support")
Signed-off-by: Ronak Doshi <[email protected]>
Acked-by: Guolin Yang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This reverts commits 6d04fe15f78acdf8e32329e208552e226f7a8ae6 and
a31edb2059ed4e498f9aa8230c734b59d0ad797a.
It turns out the idea to share a single pointer for both kernel and user
space addresses causes various kinds of problems. So use the slightly less
optimal version that uses an extra bit, but which is guaranteed to be safe
everywhere.
Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces")
Reported-by: Eric Dumazet <[email protected]>
Reported-by: John Stultz <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
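The "extra bit" variant mentioned above can be sketched as a tagged pointer. The type and helpers below are a hypothetical userspace model with `toy_` names, not the kernel's sockptr_t definition: the pointer storage is shared, and a separate flag records which address space it belongs to, so no assumption about address ranges is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tagged pointer: one slot plus an explicit kind flag. */
typedef struct {
	union {
		void *kernel;
		void *user;
	};
	bool is_kernel;
} toy_sockptr_t;

static toy_sockptr_t toy_kernel_sockptr(void *p)
{
	toy_sockptr_t sp;

	sp.kernel = p;
	sp.is_kernel = true;
	return sp;
}

static toy_sockptr_t toy_user_sockptr(void *p)
{
	toy_sockptr_t sp;

	sp.user = p;
	sp.is_kernel = false;
	return sp;
}

/* Callers branch on the flag, never on the pointer value itself. */
static bool toy_sockptr_is_kernel(toy_sockptr_t sp)
{
	return sp.is_kernel;
}
```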
|
|
There is a spelling mistake in an error message. Fix it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
If the log file for a given test is larger than the max size given then
set the seek from the end of the log file instead of from the start of the
test.
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
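The seek rule above amounts to a small piece of arithmetic. A sketch, with a hypothetical function name and byte offsets as plain integers (ktest itself is a Perl script, so this only models the logic):

```c
#include <assert.h>

/*
 * Hypothetical model: pick the offset at which to start reading the log
 * for the email. If the test's portion of the log exceeds max_size,
 * start max_size bytes before the end; otherwise start at the test.
 */
static long toy_mail_log_start(long test_start, long log_end, long max_size)
{
	if (log_end - test_start > max_size)
		return log_end - max_size;
	return test_start;
}
```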
|
|
The Intel uncore driver may claim some of the pci ids from ie31200 which
means that the ie31200 edac driver will not initialize them as part of
pci_register_driver().
Let's add a fallback for this case to 'pci_get_device()' to get a
reference on the device such that it can still be configured. This is
similar in approach to other edac drivers.
Signed-off-by: Jason Baron <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: linux-edac <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
[BUG]
Unmounting a btrfs filesystem with quota disabled will cause the
following NULL pointer dereference:
BTRFS info (device dm-5): has skinny extents
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
CPU: 7 PID: 637 Comm: umount Not tainted 5.8.0-rc7-next-20200731-custom #76
RIP: 0010:kobject_del+0x6/0x20
Call Trace:
btrfs_sysfs_del_qgroups+0xac/0xf0 [btrfs]
btrfs_free_qgroup_config+0x63/0x70 [btrfs]
close_ctree+0x1f5/0x323 [btrfs]
btrfs_put_super+0x15/0x17 [btrfs]
generic_shutdown_super+0x72/0x110
kill_anon_super+0x18/0x30
btrfs_kill_super+0x17/0x30 [btrfs]
deactivate_locked_super+0x3b/0xa0
deactivate_super+0x40/0x50
cleanup_mnt+0x135/0x190
__cleanup_mnt+0x12/0x20
task_work_run+0x64/0xb0
exit_to_user_mode_prepare+0x18a/0x190
syscall_exit_to_user_mode+0x4f/0x270
do_syscall_64+0x45/0x50
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 37b7adca5c1d5c5d ]---
[CAUSE]
Commit 079ad2fb4bf9 ("kobject: Avoid premature parent object freeing in
kobject_cleanup()") changed kobject_del() so that it no longer accepts a
NULL pointer.
Before that commit, kobject_del() and kobject_put() both accepted NULL
pointers and simply ignored them.
But that commit needs to access the parent node, killing the
old NULL pointer behavior.
Unfortunately btrfs relies on that hidden feature, so we
trigger such a NULL pointer dereference.
[FIX]
Instead of just saving several lines, do proper fs_info->qgroups_kobj
check before calling kobject_del() and kobject_put().
Reviewed-by: Nikolay Borisov <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
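The guard described in [FIX] can be illustrated with a small user-space sketch. Everything here is a hypothetical stand-in (struct fake_kobj, fake_kobject_del(), del_qgroups_kobj()), not the kernel's kobject API or the actual btrfs change; it only shows the "check the pointer before tearing it down" pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kobject; not the kernel type. */
struct fake_kobj {
	struct fake_kobj *parent;
	int registered;
};

/* Models a delete routine that touches kobj->parent and so rejects NULL. */
static void fake_kobject_del(struct fake_kobj *kobj)
{
	assert(kobj != NULL);	/* the kernel would oops here instead */
	kobj->parent = NULL;
	kobj->registered = 0;
}

/*
 * Hypothetical cleanup helper: returns 1 if an object was torn down,
 * 0 if there was nothing to do (for example, quota was never enabled).
 */
static int del_qgroups_kobj(struct fake_kobj **slot)
{
	if (*slot == NULL)
		return 0;
	fake_kobject_del(*slot);
	*slot = NULL;
	return 1;
}
```

Calling del_qgroups_kobj() with an empty slot is a no-op, which mirrors the quota-disabled unmount path from the report.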
|
|
The `if (!ret)` check is always false at that point, so a failed path
allocation goes unnoticed and ret->path may later be dereferenced while
it is a NULL pointer.
Fixes: a37f232b7b65 ("btrfs: backref: introduce the skeleton of btrfs_backref_iter")
CC: [email protected] # 5.8+
Reviewed-by: Nikolay Borisov <[email protected]>
Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Boleyn Su <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
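A minimal user-space sketch of the allocation pattern in question, assuming a two-step allocation where the second step can fail. The names (fake_iter, fake_path, fail_path_alloc) are invented for illustration and are not the btrfs definitions.

```c
#include <assert.h>
#include <stdlib.h>

struct fake_path { int slots[8]; };
struct fake_iter { struct fake_path *path; };

/* Hypothetical switch so a path allocation failure can be simulated. */
static int fail_path_alloc;

static struct fake_path *fake_alloc_path(void)
{
	if (fail_path_alloc)
		return NULL;
	return calloc(1, sizeof(struct fake_path));
}

static struct fake_iter *fake_iter_alloc(void)
{
	struct fake_iter *ret = malloc(sizeof(*ret));

	if (!ret)
		return NULL;
	ret->path = fake_alloc_path();
	/* Testing `!ret` here could never be true; test the member instead. */
	if (!ret->path) {
		free(ret);
		return NULL;
	}
	assert(ret->path != NULL);	/* postcondition for callers */
	return ret;
}

static void fake_iter_free(struct fake_iter *it)
{
	if (!it)
		return;
	free(it->path);
	free(it);
}
```

With the member check in place, a failed inner allocation makes the whole constructor return NULL rather than handing back an object with a NULL path.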
|
|
Based on an analysis of the HyperV firmwares (Gen1 and Gen2), it seems
that SCONTROL is not being set to the ENABLED state as we had thought.
Also, a test done by Vitaly Kuznetsov running a nested HyperV showed
that the first read of the SCONTROL MSR returned the value 0x1, aka
HV_SYNIC_CONTROL_ENABLE.
It's important to note that this diverges from the value stated in the
HyperV TLFS, which is 0.
Signed-off-by: Jon Doron <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Convert the uses of fallthrough comments to the fallthrough macro.
Signed-off-by: Hongxiang Lou <[email protected]>
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Steve French <[email protected]>
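For readers unfamiliar with the conversion, here is a small user-space sketch of what it looks like. The macro below is a local definition modelled on the kernel's `fallthrough` pseudo-keyword; count_flags() is a made-up function and is not taken from the converted code.

```c
#include <assert.h>

/* Local stand-in for the kernel's `fallthrough` pseudo-keyword. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Hypothetical example: each level also enables everything below it. */
static int count_flags(int level)
{
	int n = 0;

	assert(level >= 0);
	switch (level) {
	case 2:
		n++;
		fallthrough;	/* previously a fall-through comment */
	case 1:
		n++;
		fallthrough;
	case 0:
		n++;
		break;
	default:
		break;
	}
	return n;
}
```

Unlike a comment, the macro is visible to the compiler, so an accidental missing `break` elsewhere can still be diagnosed by -Wimplicit-fallthrough.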
|
|
There's some inconsistency around SB_I_VERSION handling with mount and
remount. Since we don't really want it to be off ever just work around
this by making sure we don't get the flag cleared on remount.
There's a tiny cpu cost of setting the bit, otherwise all changes to
i_version also change some of the times (ctime/mtime) so the inode needs
to be synced. We wouldn't save anything by disabling it.
Reported-by: Eric Sandeen <[email protected]>
CC: [email protected] # 5.4+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add perf impact analysis ]
Signed-off-by: David Sterba <[email protected]>
|
|
While logging an inode, at copy_items(), if we fail to lookup the checksums
for an extent we release the destination path, free the ins_data array and
then return immediately. However a previous iteration of the for loop may
have added checksums to the ordered_sums list, in which case we leak the
memory used by them.
So fix this by making sure we iterate the ordered_sums list and free all
its checksums before returning.
Fixes: 3650860b90cc2a ("Btrfs: remove almost all of the BUG()'s from tree-log.c")
CC: [email protected] # 4.4+
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
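The leak and its fix follow a common pattern: a loop accumulates heap entries on a list, and an early error return must release what earlier iterations queued. The sketch below is a user-space model with invented names (fake_sum, copy_items_sketch(), live_sums); it is not the tree-log code.

```c
#include <assert.h>
#include <stdlib.h>

struct fake_sum { struct fake_sum *next; };

static int live_sums;	/* outstanding allocations, tracked for the demo */

static struct fake_sum *sum_new(void)
{
	struct fake_sum *s = malloc(sizeof(*s));

	if (s)
		live_sums++;
	return s;
}

static void free_sums(struct fake_sum *head)
{
	while (head) {
		struct fake_sum *next = head->next;

		free(head);
		live_sums--;
		head = next;
	}
}

/* Queues one entry per item; fail_at < 0 means no simulated failure. */
static int copy_items_sketch(int nr, int fail_at)
{
	struct fake_sum *list = NULL;
	int i;

	assert(nr >= 0);
	for (i = 0; i < nr; i++) {
		struct fake_sum *s;

		if (i == fail_at) {
			/* Simulated lookup failure: drop entries queued so far. */
			free_sums(list);
			return -1;
		}
		s = sum_new();
		if (!s) {
			free_sums(list);
			return -1;
		}
		s->next = list;
		list = s;
	}
	free_sums(list);	/* a real caller would consume the list first */
	return 0;
}
```

Without the free_sums() call on the error path, a failure at iteration 2 would leave the two entries from iterations 0 and 1 allocated.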
|
|
Chris Murphy reported a problem where rpm ostree will bind mount a bunch
of things for whatever voodoo it's doing. But when it does this
/proc/mounts shows something like
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo/bar 0 0
Despite subvolid=256 being subvol=/foo. This is because we're just
spitting out the dentry of the mount point, which in the case of bind
mounts is the source path for the mountpoint. Instead we should spit
out the path to the actual subvol. Fix this by looking up the name for
the subvolid we have mounted. With this fix the same test looks like
this
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
Reported-by: Chris Murphy <[email protected]>
CC: [email protected] # 4.4+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Reported by Forza on IRC that remounting with compression options does
not reflect the change in level, or at least it does not appear to do so
according to the messages:
mount -o compress=zstd:1 /dev/sda /mnt
mount -o remount,compress=zstd:15 /mnt
does not print the change to the level to syslog:
[ 41.366060] BTRFS info (device vda): use zstd compression, level 1
[ 41.368254] BTRFS info (device vda): disk space caching is enabled
[ 41.390429] BTRFS info (device vda): disk space caching is enabled
What really happens is that the message is lost but the level is actually
changed.
There's another weird output, if compression is reset to 'no':
[ 45.413776] BTRFS info (device vda): use no compression, level 4
To fix that, save the previous compression level and print the message
in that case too, and use a separate message for 'no' compression.
CC: [email protected] # 4.19+
Signed-off-by: David Sterba <[email protected]>
|
|
try_merge_free_space
In try_merge_free_space we attempt to find entries to the left and
right of the entry we are adding to see if they can be merged. We
search for an entry past our current info (saved into right_info), and
then if right_info exists and it has a rb_prev() we save the rb_prev()
into left_info.
However there's a slight problem in the case that we have a right_info,
but no entry previous to that entry. At that point we will search for
an entry just before the info we're attempting to insert. This will
simply find right_info again, and assign it to left_info, making them
both the same pointer.
Now if right_info _can_ be merged with the range we're inserting, we'll
add it to the info and free right_info. However further down we'll
access left_info, which was right_info, and thus get a use-after-free.
Fix this by only searching for the left entry if we don't find a right
entry at all.
The CVE referenced had a specially crafted file system that could
trigger this use-after-free. However with the tree checker improvements
we no longer trigger the conditions for the UAF. But the original
conditions still apply, hence this fix.
Reference: CVE-2019-19448
Fixes: 963030817060 ("Btrfs: use hybrid extents+bitmap rb tree for free space")
CC: [email protected] # 4.4+
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
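The control flow after the fix can be modelled on a sorted array instead of an rb-tree. This is only a sketch under stated assumptions: fake_free, find_exact(), find_covering() and pick_neighbours() are invented here, and the lookups do not reproduce the semantics of the real free-space search helpers. The point is the invariant that left and right never alias.

```c
#include <assert.h>
#include <stddef.h>

struct fake_free { unsigned long offset, bytes; };

/* Entry that starts exactly at `offset`, or NULL. */
static const struct fake_free *find_exact(const struct fake_free *tab,
					  size_t n, unsigned long offset)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].offset == offset)
			return &tab[i];
	return NULL;
}

/* Entry whose range contains `pos`, or NULL. */
static const struct fake_free *find_covering(const struct fake_free *tab,
					     size_t n, unsigned long pos)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].offset <= pos && pos - tab[i].offset < tab[i].bytes)
			return &tab[i];
	return NULL;
}

/* Pick merge candidates for a new range [offset, offset + bytes). */
static void pick_neighbours(const struct fake_free *tab, size_t n,
			    unsigned long offset, unsigned long bytes,
			    const struct fake_free **left,
			    const struct fake_free **right)
{
	*left = NULL;
	*right = find_exact(tab, n, offset + bytes);
	if (*right) {
		/* Predecessor of right, which may legitimately not exist. */
		if (*right != &tab[0])
			*left = *right - 1;
	} else if (offset > 0) {
		/* Only search for a left entry when there is no right entry. */
		*left = find_covering(tab, n, offset - 1);
	}
	assert(*left == NULL || *left != *right);
}
```

A later merge step would still have to verify that the chosen left entry actually ends at `offset`; the sketch stops at candidate selection.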
|
|
[BUG]
There is a bug report of a NULL pointer dereference in
compress_file_range():
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Workqueue: btrfs-delalloc btrfs_delalloc_helper [btrfs]
NIP [c008000006dd4d34] compress_file_range.constprop.41+0x75c/0x8a0 [btrfs]
LR [c008000006dd4d1c] compress_file_range.constprop.41+0x744/0x8a0 [btrfs]
Call Trace:
[c000000c69093b00] [c008000006dd4d1c] compress_file_range.constprop.41+0x744/0x8a0 [btrfs] (unreliable)
[c000000c69093bd0] [c008000006dd4ebc] async_cow_start+0x44/0xa0 [btrfs]
[c000000c69093c10] [c008000006e14824] normal_work_helper+0xdc/0x598 [btrfs]
[c000000c69093c80] [c0000000001608c0] process_one_work+0x2c0/0x5b0
[c000000c69093d10] [c000000000160c38] worker_thread+0x88/0x660
[c000000c69093db0] [c00000000016b55c] kthread+0x1ac/0x1c0
[c000000c69093e20] [c00000000000b660] ret_from_kernel_thread+0x5c/0x7c
---[ end trace f16954aa20d822f6 ]---
[CAUSE]
For the following execution route of compress_file_range(), it's
possible to hit a NULL pointer dereference:
compress_file_range()
|- pages = NULL;
|- start = async_chunk->start = 0;
|- end = async_chunk->end = 4095;
|- nr_pages = 1;
|- inode_need_compress() == false; <<< Possible, see later explanation
|  Now, we have nr_pages = 1, pages = NULL
|- cont:
|- ret = cow_file_range_inline();
|- if (ret <= 0) {
|-     for (i = 0; i < nr_pages; i++) {
|-         WARN_ON(pages[i]->mapping); <<< Crash
To enter the above execution branch, we need the following race:
    Thread 1 (chattr)     |    Thread 2 (writeback)
--------------------------+------------------------------
                          | btrfs_run_delalloc_range
                          | |- inode_need_compress = true
                          | |- cow_file_range_async()
btrfs_ioctl_set_flag()    |
|- binode_flags |=        |
   BTRFS_INODE_NOCOMPRESS |
                          | compress_file_range()
                          | |- inode_need_compress = false
                          | |- nr_pages = 1 while pages = NULL
                          | |  Then hit the crash
[FIX]
This patch fixes it by checking @pages before accessing it.
It is only designed as a hot fix that is easy to backport.
A more elegant fix could make btrfs check inode_need_compress() only once
to avoid such a race, but that would be another story.
Reported-by: Luciano Chavez <[email protected]>
Fixes: 4d3a800ebb12 ("btrfs: merge nr_pages input and output parameter in compress_pages")
CC: [email protected] # 4.14.x: cecc8d9038d16: btrfs: Move free_pages_out label in inline extent handling branch in compress_file_range
CC: [email protected] # 4.14+
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
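The hot fix amounts to guarding the cleanup loop on the array pointer rather than on the count alone. A user-space sketch with hypothetical names (fake_page, release_pages_sketch()), not the btrfs code:

```c
#include <assert.h>
#include <stddef.h>

struct fake_page { void *mapping; };

/* Returns how many page slots were released. */
static unsigned long release_pages_sketch(struct fake_page **pages,
					   unsigned long nr_pages)
{
	unsigned long i, released = 0;

	/* nr_pages can be non-zero while the array was never allocated. */
	if (!pages)
		return 0;
	for (i = 0; i < nr_pages; i++) {
		assert(pages[i]->mapping == NULL);	/* stands in for WARN_ON() */
		pages[i] = NULL;
		released++;
	}
	return released;
}
```

Passing a NULL array with nr_pages = 1 reproduces the racy state from the report and returns without touching memory.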
|
|
VDPA mlx5 accesses config space as native endian - this is
wrong since it's a modern device and actually uses LE.
It only supports modern guests so we could punt and
just force LE, but let's use the full virtio APIs since people
tend to copy/paste code, and this is not data path anyway.
Signed-off-by: Michael S. Tsirkin <[email protected]>
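To make the endianness point concrete, the sketch below decodes and encodes a 16-bit little-endian field byte by byte, so the result is the same on any host. These helpers (le16_decode(), le16_encode()) are illustrative only; they are not the virtio accessors the patch switches to.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 16-bit little-endian field independent of host byte order. */
static uint16_t le16_decode(const uint8_t *p)
{
	assert(p);
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Encode a host value as 16-bit little-endian. */
static void le16_encode(uint8_t *p, uint16_t v)
{
	assert(p);
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}
```

Reading the same two bytes through a native-endian uint16_t load would give a different value on a big-endian host, which is the class of bug the commit addresses.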
|
|
If "offset" is non-zero then we end up copying from beyond the end of
the config because of pointer math. We can fix this by casting the
struct to a u8 pointer.
Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/20200406144552.GF68494@mwanda
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>
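The pointer-math issue is easy to reproduce in user space. In the sketch below, fake_config and get_config_sketch() are hypothetical; the point is that adding an offset to a struct pointer steps in units of the struct size, so the pointer has to be cast to a byte pointer first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout; 12 bytes, no padding. */
struct fake_config {
	uint8_t mac[6];
	uint16_t status;
	uint16_t max_pairs;
	uint16_t mtu;
};

/* Copy `len` bytes starting `offset` bytes into the config. */
static int get_config_sketch(const struct fake_config *cfg,
			     unsigned int offset, void *buf, unsigned int len)
{
	assert(cfg && buf);
	if (offset > sizeof(*cfg) || len > sizeof(*cfg) - offset)
		return -1;
	/*
	 * `cfg + offset` would advance by offset * sizeof(*cfg) bytes.
	 * Casting to a byte pointer makes the offset count bytes.
	 */
	memcpy(buf, (const uint8_t *)cfg + offset, len);
	return 0;
}
```

With offset 4 and len 2 this returns the last two MAC bytes; without the cast the same call would read 48 bytes past the start of the struct.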
|
|
Commit ea0eada45632 leads to the following build failure on powerpc:
HOSTCC scripts/recordmcount
scripts/recordmcount.c: In function 'arm64_is_fake_mcount':
scripts/recordmcount.c:440: error: 'R_AARCH64_CALL26' undeclared (first use in this function)
scripts/recordmcount.c:440: error: (Each undeclared identifier is reported only once
scripts/recordmcount.c:440: error: for each function it appears in.)
make[2]: *** [scripts/recordmcount] Error 1
Make sure R_AARCH64_CALL26 is always defined.
Fixes: ea0eada45632 ("recordmcount: only record relocation of type R_AARCH64_CALL26 on arm64.")
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Acked-by: Gregory Herrero <[email protected]>
Cc: Gregory Herrero <[email protected]>
Link: https://lore.kernel.org/r/5ca1be21fa6ebf73203b45fd9aadd2bafb5e6b15.1597049145.git.christophe.leroy@csgroup.eu
Signed-off-by: Catalin Marinas <[email protected]>
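A fallback definition of this kind is usually a guarded #define. The sketch below shows the pattern in isolation; 283 is the ELF relocation number for R_AARCH64_CALL26, while count_call26() is a made-up helper and not part of recordmcount.

```c
#include <assert.h>
#include <stddef.h>

/* Host ELF headers on non-arm64 or older systems may lack this constant. */
#ifndef R_AARCH64_CALL26
#define R_AARCH64_CALL26 283
#endif

/* Hypothetical helper: count relocations of type R_AARCH64_CALL26. */
static size_t count_call26(const unsigned long *types, size_t n)
{
	size_t hits = 0;

	assert(types != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (types[i] == R_AARCH64_CALL26)
			hits++;
	return hits;
}
```

Because the definition is wrapped in #ifndef, hosts whose headers already provide the constant keep their own definition.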
|
|
The USB device (0x17aa:0x1046) that supports the Lenovo P620 rear panel
line-in claims to support volume control, but it doesn't seem to have an
AMP, so when the line-in volume drops below 80, nothing gets recorded
anymore.
Disable the volume control to work around the issue.
Fixes: f8c11eb7da4a ("ALSA: usb-audio: Add support for Lenovo ThinkStation P620")
Signed-off-by: Kai-Heng Feng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Drivers using legacy power management .suspend()/.resume() callbacks
have to manage PCI states and device's PM states themselves. They also
need to take care of standard configuration registers.
Switch to the generic power management framework using a single
"struct dev_pm_ops" variable to take the unnecessary load off the driver.
This also avoids the need for the driver to directly call most of the PCI
helper functions and device power state control functions, as through
the generic framework PCI Core takes care of the necessary operations,
and drivers are required to do only device-specific jobs.
Signed-off-by: Vaibhav Gupta <[email protected]>
Reviewed-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
The driver calls pci_enable_wake(...., false) in pch_i2c_suspend() as well
as pch_i2c_resume(). Either it should enable-wake the device in .suspend()
or should not invoke pci_enable_wake() at all.
Since this driver evidently doesn't support enable-wake, and the PCI core
already calls pci_enable_wake(pci_dev, PCI_D0, false) during resume, drop
the call from both .suspend() and .resume().
Reported-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Vaibhav Gupta <[email protected]>
Reviewed-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|