Age | Commit message (Collapse) | Author | Files | Lines |
|
Tag-based KASAN reuses a significant part of the generic KASAN code, so
move the common parts to common.c without any functional changes.
Link: http://lkml.kernel.org/r/114064d002356e03bb8cc91f7835e20dc61b51d9.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Ryabinin <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The previous patch updated KASAN hooks signatures and their usage in SLAB
and SLUB code, except for the early_kmem_cache_node_alloc function. This
patch handles that function separately, as it requires to reorder some of
the initialization code to correctly propagate a tagged pointer in case a
tag is assigned by kasan_kmalloc.
Link: http://lkml.kernel.org/r/fc8d0fdcf733a7a52e8d0daaa650f4736a57de8c.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "kasan: add software tag-based mode for arm64", v13.
This patchset adds a new software tag-based mode to KASAN [1]. (Initially
this mode was called KHWASAN, but it got renamed, see the naming rationale
at the end of this section).
The plan is to implement HWASan [2] for the kernel with the incentive,
that it's going to have comparable to KASAN performance, but in the same
time consume much less memory, trading that off for somewhat imprecise bug
detection and being supported only for arm64.
The underlying ideas of the approach used by software tag-based KASAN are:
1. By using the Top Byte Ignore (TBI) arm64 CPU feature, we can store
pointer tags in the top byte of each kernel pointer.
2. Using shadow memory, we can store memory tags for each chunk of kernel
memory.
3. On each memory allocation, we can generate a random tag, embed it into
the returned pointer and set the memory tags that correspond to this
chunk of memory to the same value.
4. By using compiler instrumentation, before each memory access we can add
a check that the pointer tag matches the tag of the memory that is being
accessed.
5. On a tag mismatch we report an error.
With this patchset the existing KASAN mode gets renamed to generic KASAN,
with the word "generic" meaning that the implementation can be supported
by any architecture as it is purely software.
The new mode this patchset adds is called software tag-based KASAN. The
word "tag-based" refers to the fact that this mode uses tags embedded into
the top byte of kernel pointers and the TBI arm64 CPU feature that allows
to dereference such pointers. The word "software" here means that shadow
memory manipulation and tag checking on pointer dereference is done in
software. As it is the only tag-based implementation right now, "software
tag-based" KASAN is sometimes referred to as simply "tag-based" in this
patchset.
A potential expansion of this mode is a hardware tag-based mode, which
would use hardware memory tagging support (announced by Arm [3]) instead
of compiler instrumentation and manual shadow memory manipulation.
Same as generic KASAN, software tag-based KASAN is strictly a debugging
feature.
[1] https://www.kernel.org/doc/html/latest/dev-tools/kasan.html
[2] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[3] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a
====== Rationale
On mobile devices generic KASAN's memory usage is significant problem.
One of the main reasons to have tag-based KASAN is to be able to perform a
similar set of checks as the generic one does, but with lower memory
requirements.
Comment from Vishwath Mohan <[email protected]>:
I don't have data on-hand, but anecdotally both ASAN and KASAN have proven
problematic to enable for environments that don't tolerate the increased
memory pressure well. This includes
(a) Low-memory form factors - Wear, TV, Things, lower-tier phones like Go,
(c) Connected components like Pixel's visual core [1].
These are both places I'd love to have a low(er) memory footprint option at
my disposal.
Comment from Evgenii Stepanov <[email protected]>:
Looking at a live Android device under load, slab (according to
/proc/meminfo) + kernel stack take 8-10% available RAM (~350MB). KASAN's
overhead of 2x - 3x on top of it is not insignificant.
Not having this overhead enables near-production use - ex. running
KASAN/KHWASAN kernel on a personal, daily-use device to catch bugs that do
not reproduce in test configuration. These are the ones that often cost
the most engineering time to track down.
CPU overhead is bad, but generally tolerable. RAM is critical, in our
experience. Once it gets low enough, OOM-killer makes your life
miserable.
[1] https://www.blog.google/products/pixel/pixel-visual-core-image-processing-and-machine-learning-pixel-2/
====== Technical details
Software tag-based KASAN mode is implemented in a very similar way to the
generic one. This patchset essentially does the following:
1. TCR_TBI1 is set to enable Top Byte Ignore.
2. Shadow memory is used (with a different scale, 1:16, so each shadow
byte corresponds to 16 bytes of kernel memory) to store memory tags.
3. All slab objects are aligned to shadow scale, which is 16 bytes.
4. All pointers returned from the slab allocator are tagged with a random
tag and the corresponding shadow memory is poisoned with the same value.
5. Compiler instrumentation is used to insert tag checks. Either by
calling callbacks or by inlining them (CONFIG_KASAN_OUTLINE and
CONFIG_KASAN_INLINE flags are reused).
6. When a tag mismatch is detected in callback instrumentation mode
KASAN simply prints a bug report. In case of inline instrumentation,
clang inserts a brk instruction, and KASAN has it's own brk handler,
which reports the bug.
7. The memory in between slab objects is marked with a reserved tag, and
acts as a redzone.
8. When a slab object is freed it's marked with a reserved tag.
Bug detection is imprecise for two reasons:
1. We won't catch some small out-of-bounds accesses, that fall into the
same shadow cell, as the last byte of a slab object.
2. We only have 1 byte to store tags, which means we have a 1/256
probability of a tag match for an incorrect access (actually even
slightly less due to reserved tag values).
Despite that there's a particular type of bugs that tag-based KASAN can
detect compared to generic KASAN: use-after-free after the object has been
allocated by someone else.
====== Testing
Some kernel developers voiced a concern that changing the top byte of
kernel pointers may lead to subtle bugs that are difficult to discover.
To address this concern deliberate testing has been performed.
It doesn't seem feasible to do some kind of static checking to find
potential issues with pointer tagging, so a dynamic approach was taken.
All pointer comparisons/subtractions have been instrumented in an LLVM
compiler pass and a kernel module that would print a bug report whenever
two pointers with different tags are being compared/subtracted (ignoring
comparisons with NULL pointers and with pointers obtained by casting an
error code to a pointer type) has been used. Then the kernel has been
booted in QEMU and on an Odroid C2 board and syzkaller has been run.
This yielded the following results.
The two places that look interesting are:
is_vmalloc_addr in include/linux/mm.h
is_kernel_rodata in mm/util.c
Here we compare a pointer with some fixed untagged values to make sure
that the pointer lies in a particular part of the kernel address space.
Since tag-based KASAN doesn't add tags to pointers that belong to rodata
or vmalloc regions, this should work as is. To make sure debug checks to
those two functions that check that the result doesn't change whether we
operate on pointers with or without untagging has been added.
A few other cases that don't look that interesting:
Comparing pointers to achieve unique sorting order of pointee objects
(e.g. sorting locks addresses before performing a double lock):
tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
pipe_double_lock in fs/pipe.c
unix_state_double_lock in net/unix/af_unix.c
lock_two_nondirectories in fs/inode.c
mutex_lock_double in kernel/events/core.c
ep_cmp_ffd in fs/eventpoll.c
fsnotify_compare_groups fs/notify/mark.c
Nothing needs to be done here, since the tags embedded into pointers
don't change, so the sorting order would still be unique.
Checks that a pointer belongs to some particular allocation:
is_sibling_entry in lib/radix-tree.c
object_is_on_stack in include/linux/sched/task_stack.h
Nothing needs to be done here either, since two pointers can only belong
to the same allocation if they have the same tag.
Overall, since the kernel boots and works, there are no critical bugs.
As for the rest, the traditional kernel testing way (use until fails) is
the only one that looks feasible.
Another point here is that tag-based KASAN is available under a separate
config option that needs to be deliberately enabled. Even though it might
be used in a "near-production" environment to find bugs that are not found
during fuzzing or running tests, it is still a debug tool.
====== Benchmarks
The following numbers were collected on Odroid C2 board. Both generic and
tag-based KASAN were used in inline instrumentation mode.
Boot time [1]:
* ~1.7 sec for clean kernel
* ~5.0 sec for generic KASAN
* ~5.0 sec for tag-based KASAN
Network performance [2]:
* 8.33 Gbits/sec for clean kernel
* 3.17 Gbits/sec for generic KASAN
* 2.85 Gbits/sec for tag-based KASAN
Slab memory usage after boot [3]:
* ~40 kb for clean kernel
* ~105 kb (~260% overhead) for generic KASAN
* ~47 kb (~20% overhead) for tag-based KASAN
KASAN memory overhead consists of three main parts:
1. Increased slab memory usage due to redzones.
2. Shadow memory (the whole reserved once during boot).
3. Quaratine (grows gradually until some preset limit; the more the limit,
the more the chance to detect a use-after-free).
Comparing tag-based vs generic KASAN for each of these points:
1. 20% vs 260% overhead.
2. 1/16th vs 1/8th of physical memory.
3. Tag-based KASAN doesn't require quarantine.
[1] Time before the ext4 driver is initialized.
[2] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
[3] Measured as `cat /proc/meminfo | grep Slab`.
====== Some notes
A few notes:
1. The patchset can be found here:
https://github.com/xairy/kasan-prototype/tree/khwasan
2. Building requires a recent Clang version (7.0.0 or later).
3. Stack instrumentation is not supported yet and will be added later.
This patch (of 25):
Tag-based KASAN changes the value of the top byte of pointers returned
from the kernel allocation functions (such as kmalloc). This patch
updates KASAN hooks signatures and their usage in SLAB and SLUB code to
reflect that.
Link: http://lkml.kernel.org/r/aec2b5e3973781ff8a6bb6760f8543643202c451.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Ryabinin <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
So that the user can specify outside CFLAGS values.
Signed-off-by: Jiri Olsa <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Cc: Hartmut Knaack <[email protected]> <[email protected]>
Cc: Herton Krzesinski <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Lars-Peter Clausen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
command
So that the user can specify outside CFLAGS/LDFLAGS values.
Signed-off-by: Jiri Olsa <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Cc: Herton Krzesinski <[email protected]>
Cc: Len Brown <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that the user can provide, e.g. distro package alternative values.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Brian Norris <[email protected]>
Cc: Herton Krzesinski <[email protected]>
Cc: Markus Mayer <[email protected]>
Cc: Zhang Rui <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
LDFLAGS to build command
So user could specify outside CFLAGS/LDFLAGS values.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Herton Krzesinski <[email protected]>
Cc: Len Brown <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The cachelines being reported are the ones with percentages all the way
down to 0.05%. That makes for very long output files. Raising that to
0.1%. The user can always specify --show-all if they want all the
cachelines with hits.
Suggested-by: Joe Mario <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Joe suggested to have the coalesce default set just to 'iaddr', because
it's easier to read on the default 'perf c2c report' output.
By removing the "pid" field from the default -c/--coalesce option, the
'perf c2c' report will group all the relevant PIDs under the instruction
address ('iaddr') bucket. User can always run "-c pid,iaddr" for a more
fine grained output on particular PIDs.
Suggested-by: Joe Mario <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For instance, while debugging the 'galileo' python utility to
synchronize fitbit trackers:
# perf trace -e ioctl ./run --force
ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28666420) = 0
ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0
ioctl(2</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0
ioctl(3</home/acme/hg/galileo/run>, TCSETS, 0x7ffe286663f0) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286655a0) = 0
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665470) = 0
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665470) = 0
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286654a0) = 0
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286654a0) = 0
ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665400) = 0
ioctl(1</dev/pts/8>, TIOCSWINSZ, 0x7ffe286654c0) = 0
ioctl(0</dev/pts/8>, TIOCSWINSZ, 0x7ffe28665560) = 0
ioctl(0</dev/pts/8>, TIOCSWINSZ, 0x7ffe28665560) = 0
ioctl(0</dev/pts/8>, TIOCMGET, 0x7ffe28665560) = 0
ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28665530) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GET_CAPABILITIES, 0x561468dad048) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GETDRIVER, 0x7ffe28665500) = -1 ENODATA (No data available)
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GETDRIVER, 0x7ffe28665500) = -1 ENODATA (No data available)
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SETCONFIGURATION, 0x7ffe2866513c) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_CLAIMINTERFACE, 0x7ffe286647bc) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468dace40) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664c10) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664c10) = -1 EAGAIN (Resource temporarily unavailable)
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468dace40) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664dd0) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664dd0) = -1 EAGAIN (Resource temporarily unavailable)
<SNIP>
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468e72ec0) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664cc0) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664cc0) = -1 EAGAIN (Resource temporarily unavailable)
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_RELEASEINTERFACE, 0x7ffe2866463c) = 0
ioctl(10</dev/bus/usb/001/011>, USBDEVFS_RELEASEINTERFACE, 0x7ffe2866463c) = 0
Tracker: 813F4690C3D1: Synchronisation successful
#
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that beautifiers can access things like dev_maj.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
That ends up generating this:
[acme@quaco perf]$ cat /tmp/build/perf/trace/beauty/generated/ioctl/usbdevfs_ioctl_array.c
static const char *usbdevfs_ioctl_cmds[] = {
[0] = "CONTROL",
[10] = "SUBMITURB",
[11] = "DISCARDURB",
[12] = "REAPURB",
[13] = "REAPURBNDELAY",
[14] = "DISCSIGNAL",
[15] = "CLAIMINTERFACE",
[16] = "RELEASEINTERFACE",
[17] = "CONNECTINFO",
[18] = "IOCTL",
[19] = "HUB_PORTINFO",
[2] = "BULK",
[20] = "RESET",
[21] = "CLEAR_HALT",
[22] = "DISCONNECT",
[23] = "CONNECT",
[24] = "CLAIM_PORT",
[25] = "RELEASE_PORT",
[26] = "GET_CAPABILITIES",
[27] = "DISCONNECT_CLAIM",
[28] = "ALLOC_STREAMS",
[29] = "FREE_STREAMS",
[3] = "RESETEP",
[30] = "DROP_PRIVILEGES",
[31] = "GET_SPEED",
[4] = "SETINTERFACE",
[5] = "SETCONFIGURATION",
[8] = "GETDRIVER",
};
#if 0
static const char *usbdevfs_ioctl_32_cmds[] = {
[0] = "CONTROL32",
[10] = "SUBMITURB32",
[12] = "REAPURB32",
[13] = "REAPURBNDELAY32",
[14] = "DISCSIGNAL32",
[18] = "IOCTL32",
[2] = "BULK32",
};
#endif
$
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Will be associated with fds with the right device major.
$ tools/perf/trace/beauty/usbdevfs_ioctl.sh
static const char *usbdevfs_ioctl_cmds[] = {
[0] = "CONTROL",
[10] = "SUBMITURB",
[11] = "DISCARDURB",
[12] = "REAPURB",
[13] = "REAPURBNDELAY",
[14] = "DISCSIGNAL",
[15] = "CLAIMINTERFACE",
[16] = "RELEASEINTERFACE",
[17] = "CONNECTINFO",
[18] = "IOCTL",
[19] = "HUB_PORTINFO",
[20] = "RESET",
[21] = "CLEAR_HALT",
[22] = "DISCONNECT",
[23] = "CONNECT",
[24] = "CLAIM_PORT",
[25] = "RELEASE_PORT",
[26] = "GET_CAPABILITIES",
[27] = "DISCONNECT_CLAIM",
[28] = "ALLOC_STREAMS",
[29] = "FREE_STREAMS",
[2] = "BULK",
[30] = "DROP_PRIVILEGES",
[31] = "GET_SPEED",
[3] = "RESETEP",
[4] = "SETINTERFACE",
[5] = "SETCONFIGURATION",
[8] = "GETDRIVER",
};
#if 0
static const char *usbdevfs_ioctl_32_cmds[] = {
[0] = "CONTROL32",
[10] = "SUBMITURB32",
[12] = "REAPURB32",
[13] = "REAPURBNDELAY32",
[14] = "DISCSIGNAL32",
[18] = "IOCTL32",
[2] = "BULK32",
};
#endif
$
Leaving the '32' variants commented, later we can try to support those
as well, from some other hint (maybe something about the thread issuing
the ioctls) and from the _IOC_SIZE(cmd).
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Will be used to generate the string table for the USBDEVFS_ prefixed
ioctl commands.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We keep a table for the fds to map them back to pathnames when showing
'fd' based APIs such as write(), store as well the major number for the
device the path is in, to use in things like choosing the right ioctl
'cmd' beautifier.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we can have that table expanded when setting other attributes.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we can add more per file attributes besides the pathname, such
as which ioctl beautifier to use, for cases such as the sound and
usbdeffs ioctls, that both use the 'U' command, so we have to
differentiate at the major number for the device file.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is a fix for another instance of the skid problem Milian recently
found [1]
The LBRs don't freeze at the exact same time as the PMI is triggered.
The perf script brstackinsn code that dumps LBR assembler assumes that
the last branch in the LBR leads to the sample point. But with skid
it's possible that the CPU executes one or more branches before the
sample, but which do not appear in the LBR.
What happens then is either that the sample point is before the last LBR
branch. In this case the dumper sees a negative length and ignores it.
Or it the sample point is long after the last branch. Then the dumper
sees a very long block and dumps it upto its block limit (16k bytes),
which is noise in the output.
On typical sample session this can happen regularly.
This patch tries to detect and handle the situation. On the last block
that is dumped by the LBR dumper we always stop on the first branch. If
the block length is negative just scan forward to the first branch.
Otherwise scan until a branch is found.
The PT decoder already has a function that uses the instruction decoder
to detect branches, so we can just reuse it here.
Then when a terminating branch is found print an indication and stop
dumping. This might miss a few instructions, but at least shows no
runaway blocks.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Milian Wolff <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Resolved conflict with dd2e18e9ac20 ("perf tools: Support 'srccode' output") ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
evlist.get_pollfd()
Ondřej reported that when compiled with python3, the python extension
regresses in evlist.get_pollfd function behaviour.
The evlist.get_pollfd function creates file objects from evlist's fds
and returns them in a list. The python3 version also sets them to 'close
the original descriptor' when the object dies (is closed), by passing
True via the 'closefd' arg in the PyFile_FromFd call.
The python's closefd doc says:
If closefd is False, the underlying file descriptor will be kept open
when the file is closed.
That's why the following line in python3 closes all evlist fds:
evlist.get_pollfd()
the returned list is immediately destroyed and that takes down the
original events fds.
Passing closefd as False to PyFile_FromFd to fix this.
Reported-by: Ondřej Lysoněk <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jaroslav Škarvada <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The spelling of the SECCOMP is incorrect, fix these.
Signed-off-by: Colin King <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Fixes: c65c83ffe904 ("perf trace: Allow asking for not suppressing common string prefixes")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
SCU clock can be used in a similar way by IMX8QXP and IMX8QM SoCs.
Let's make the name of clock ID generic to allow other SoCs to reuse
the common part.
This patch only changes the clock id name and file name, so no
functional change.
Cc: Stephen Boyd <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Sascha Hauer <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: Michael Turquette <[email protected]>
Cc: [email protected]
Signed-off-by: Dong Aisheng <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
|
|
It can be useful to inhibit all cgroup1 hierarchies especially during
transition and for debugging. cgroup_no_v1 can block hierarchies with
controllers which leaves out the named hierarchies. Expand it to
cover the named hierarchies so that "cgroup_no_v1=all,named" disables
all cgroup1 hierarchies.
Signed-off-by: Tejun Heo <[email protected]>
Suggested-by: Marcin Pawlowski <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
This fixes the case where all mount options specified are consumed by an
LSM and all that's left is an empty string. In this case cgroupfs should
accept the string and not fail.
How to reproduce (with SELinux enabled):
# umount /sys/fs/cgroup/unified
# mount -o context=system_u:object_r:cgroup_t:s0 -t cgroup2 cgroup2 /sys/fs/cgroup/unified
mount: /sys/fs/cgroup/unified: wrong fs type, bad option, bad superblock on cgroup2, missing codepage or helper program, or other error.
# dmesg | tail -n 1
[ 31.575952] cgroup: cgroup2: unknown option ""
Fixes: 67e9c74b8a87 ("cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type")
[NOTE: should apply on top of commit 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option"), older versions need manual rebase]
Suggested-by: Stephen Smalley <[email protected]>
Signed-off-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
Clarify the use of the CONFIG_DFS_UPCALL for DNS name resolution
when server ip addresses change (e.g. on long running mounts)
Signed-off-by: Steve French <[email protected]>
|
|
In case a hostname resolves to a different IP address (e.g. long
running mounts), make sure to resolve it every time prior to calling
generic_ip_connect() in reconnect.
Suggested-by: Steve French <[email protected]>
Signed-off-by: Paulo Alcantara <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
After a successful failover, the cifs_reconnect_tcon() function will
make sure to reconnect every tcon to new target server.
Same as previous commit but for SMB1 codepath.
Signed-off-by: Paulo Alcantara <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
After a successful failover in cifs_reconnect(), the smb2_reconnect()
function will make sure to reconnect every tcon to new target server.
For SMB2+.
Signed-off-by: Paulo Alcantara <[email protected]>
Signed-off-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Fix potential NULL ptr deref when DFS target list is empty.
Signed-off-by: Paulo Alcantara <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Start the DFS cache refresh worker per volume during cifs mount.
Signed-off-by: Paulo Alcantara <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
A spin lock is held before kstrndup, it may sleep with holding
the spinlock, so we should use GFP_ATOMIC instead.
Fixes: e58c31d5e387 ("cifs: Add support for failover in cifs_reconnect()")
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Paulo Alcantara <[email protected]>
|
|
After failing to reconnect to original target, it will retry any
target available from DFS cache.
Signed-off-by: Paulo Alcantara <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
This patch adds support for failover when failing to connect in
cifs_mount().
Signed-off-by: Paulo Alcantara <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
fs/cifs/cifs_dfs_ref.c: In function 'cifs_dfs_do_automount':
fs/cifs/cifs_dfs_ref.c:309:7: warning:
variable 'sep' set but not used [-Wunused-but-set-variable]
It never used since introdution in commit 0f56b277073c ("cifs: Make use
of DFS cache to get new DFS referrals")
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Paulo Alcantara <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
This patch will make use of DFS cache routines where appropriate and
do not always request a new referral from server.
Signed-off-by: Paulo Alcantara <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Update cifs "TODO" file.
Signed-off-by: Steve French <[email protected]>
|
|
kzalloc can return NULL so an additional check is needed. While there
is a check for ret_buf there is no check for the allocation of
ret_buf->crfid.fid - this check is thus added. Both call-sites
of tconInfoAlloc() check for NULL return of tconInfoAlloc()
so returning NULL on failure of kzalloc() here seems appropriate.
As the kzalloc() is the only thing here that can fail it is
moved to the beginning so as not to initialize other resources
on failure of kzalloc.
Fixes: 3d4ef9a15343 ("smb3: fix redundant opens on root")
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
fs/cifs/smb2pdu.c: In function 'smb311_posix_mkdir':
fs/cifs/smb2pdu.c:2040:26: warning:
variable 'server' set but not used [-Wunused-but-set-variable]
fs/cifs/smb2pdu.c: In function 'build_qfs_info_req':
fs/cifs/smb2pdu.c:4067:26: warning:
variable 'server' set but not used [-Wunused-but-set-variable]
The first 'server' never used since commit bea851b8babe ("smb3: Fix mode on
mkdir on smb311 mounts")
And the second not used since commit 1fc6ad2f10ad ("cifs: remove
header_preamble_size where it is always 0")
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
We should zero out the password before we free it.
Fixes: 3d6cacbb5310 ("cifs: Add DFS cache routines")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Paulo Alcantara <[email protected]>
|
|
memory allocated by kmem_cache_alloc() in alloc_cache_entry()
should be freed using kmem_cache_free(), not kfree().
Fixes: 34a44fb160f9 ("cifs: Add DFS cache routines")
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
|
|
Fixes cifs build failure after merge of the y2038 tree
After merging the y2038 tree, today's linux-next build (x86_64
allmodconfig) failed like this:
fs/cifs/dfs_cache.c: In function 'cache_entry_expired':
fs/cifs/dfs_cache.c:106:7: error: implicit declaration of function 'current_kernel_time64'; did you mean 'core_kernel_text'? [-Werror=implicit-function-declaration]
ts = current_kernel_time64();
^~~~~~~~~~~~~~~~~~~~~
core_kernel_text
fs/cifs/dfs_cache.c:106:5: error: incompatible types when assigning to type 'struct timespec64' from type 'int'
ts = current_kernel_time64();
^
fs/cifs/dfs_cache.c: In function 'get_expire_time':
fs/cifs/dfs_cache.c:342:24: error: incompatible type for argument 1 of 'timespec64_add'
return timespec64_add(current_kernel_time64(), ts);
^~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/restart_block.h:10,
from include/linux/thread_info.h:13,
from arch/x86/include/asm/preempt.h:7,
from include/linux/preempt.h:78,
from include/linux/rcupdate.h:40,
from fs/cifs/dfs_cache.c:8:
include/linux/time64.h:66:66: note: expected 'struct timespec64' but argument is of type 'int'
static inline struct timespec64 timespec64_add(struct timespec64 lhs,
~~~~~~~~~~~~~~~~~~^~~
fs/cifs/dfs_cache.c:343:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
Caused by:
commit ccea641b6742 ("timekeeping: remove obsolete time accessors")
interacting with:
commit 34a44fb160f9 ("cifs: Add DFS cache routines")
from the cifs tree.
Signed-off-by: Stephen Rothwell <[email protected]>
Reviewed-by: Paulo Alcantara <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
* Add new dfs_cache.[ch] files
* Add new /proc/fs/cifs/dfscache file
- dump current cache when read
- clear current cache when writing "0" to it
* Add delayed_work to periodically refresh cache entries
The new interface will be used for caching DFS referrals, as well as
supporting client target failover.
The DFS cache is a hashtable that maps UNC paths to cache entries.
A cache entry contains:
- the UNC path it is mapped on
- how much the the UNC path the entry consumes
- flags
- a Time-To-Live after which the entry expires
- a list of possible targets (linked lists of UNC paths)
- a "hint target" pointing the last known working target or the first
target if none were tried. This hint lets cifs.ko remember and try
working targets first.
* Looking for an entry in the cache is done with dfs_cache_find()
- if no valid entries are found, a DFS query is made, stored in the
cache and returned
- the full target list can be copied and returned to avoid race
conditions and looped on with the help with the
dfs_cache_tgt_iterator
* Updating the target hint to the next target is done with
dfs_cache_update_tgthint()
These functions have a dfs_cache_noreq_XXX() version that doesn't
fetches referrals if no entries are found. These versions don't
require the tcp/ses/tcon/cifs_sb parameters as a result.
Expired entries cannot be used and since they have a pretty short TTL
[1] in order for them to be useful for failover the DFS cache adds a
delayed work called periodically to keep them fresh.
Since we might not have available connections to issue the referral
request when refreshing we need to store volume_info structs with
credentials and other needed info to be able to connect to the right
server.
1: Windows defaults: 5mn for domain-based referrals, 30mn for regular
links
Signed-off-by: Paulo Alcantara <[email protected]>
Signed-off-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
h8300 builds fail with:
In file included from drivers/of/address.c:11:
include/linux/pci.h:1966:20: error: redefinition of 'pcibios_penalize_isa_irq'
This is because CONFIG_PCI is not enabled, and pcibios_penalize_isa_irq()
is now declared as inline static function in generic code if this is the
case. Since h8300 does not support PCI to start with, fix the problem by
removing the architecture specific pci.h.
Fixes: 5d32a66541c46 ("PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set")
Cc: Sinan Kaya <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Yoshinori Sato <[email protected]>
|
|
Fix the following warning:
no previous prototype for ‘dbg_sym_flags’ [-Wmissing-prototypes]
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Currently, images.c is included by qconf.cc and gconf.c.
qconf.cc uses all of xpm_* arrays, but gconf.c only some of them.
Hence, lots of "... defined but not used" warnings are displayed
while compiling gconf.c
Splitting out images.c fixes the warnings.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Add "static" to functions that are locally used in gconf.c
This fixes some "no previous prototype for ..." warnings.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Compile zconf.lex.c independently of the other files.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
I want to compile each C file independently instead of including all
of them from zconf.y.
Split out confdata.c, expr.c, symbol.c, and preprocess.c .
These are low-hanging fruits.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
All files in lxdialog/ are licensed under GPL-2.0+, and the rest are
under GPL-2.0. I added GPL-2.0 tags to test scripts in tests/.
Documentation/process/license-rules.rst does not suggest anything
about the flex/bison files. Because flex does not accept the C++
comment style at the very top of a file, I used the C style for
zconf.l, and so for zconf.y for consistency.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Commit 7a88488bbc23 ("[PATCH] kconfig: use gperf for kconfig keywords")
introduced gperf for the keyword lookup.
Then, commit bb3290d91695 ("Remove gperf usage from toolchain") killed
the gperf use. As a result, the linear keyword search was left behind.
If we do not use gperf, there is no reason to have the separate table
of the keywords. Move all keywords back to the lexer.
I also refactored the lexer to remove the COMMAND and PARAM states.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
GVT fixes for v4.21-rc1
Signed-off-by: Dave Airlie <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|