Age | Commit message (Collapse) | Author | Files | Lines |
|
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Pull networking fixes from David Miller:
"The iwlwifi firmware compat fix is in here as well as some other
stuff:
1) Fix request socket leak introduced by BPF deadlock fix, from Eric
Dumazet.
2) Fix VLAN handling with TXQs in mac80211, from Johannes Berg.
3) Missing __qdisc_drop conversions in prio and qfq schedulers, from
Gao Feng.
4) Use after free in netlink nlk groups handling, from Xin Long.
5) Handle MTU update properly in ipv6 gre tunnels, from Xin Long.
6) Fix leak of ipv6 fib tables on netns teardown, from Sabrina Dubroca
with follow-on fix from Eric Dumazet.
7) Need RCU and preemption disabled during generic XDP data patch,
from John Fastabend"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
bpf: make error reporting in bpf_warn_invalid_xdp_action more clear
Revert "mdio_bus: Remove unneeded gpiod NULL check"
bpf: devmap, use cond_resched instead of cpu_relax
bpf: add support for sockmap detach programs
net: rcu lock and preempt disable missing around generic xdp
bpf: don't select potentially stale ri->map from buggy xdp progs
net: tulip: Constify tulip_tbl
net: ethernet: ti: netcp_core: no need in netif_napi_del
davicom: Display proper debug level up to 6
net: phy: sfp: rename dt properties to match the binding
dt-binding: net: sfp binding documentation
dt-bindings: add SFF vendor prefix
dt-bindings: net: don't confuse with generic PHY property
ip6_tunnel: fix setting hop_limit value for ipv6 tunnel
ip_tunnel: fix setting ttl and tos value in collect_md mode
ipv6: fix typo in fib6_net_exit()
tcp: fix a request socket leak
sctp: fix missing wake ups in some situations
netfilter: xt_hashlimit: fix build error caused by 64bit division
netfilter: xt_hashlimit: alloc hashtable with right size
...
|
|
Merge more updates from Andrew Morton:
- most of the rest of MM
- a small number of misc things
- lib/ updates
- checkpatch
- autofs updates
- ipc/ updates
* emailed patches from Andrew Morton <[email protected]>: (126 commits)
ipc: optimize semget/shmget/msgget for lots of keys
ipc/sem: play nicer with large nsops allocations
ipc/sem: drop sem_checkid helper
ipc: convert kern_ipc_perm.refcount from atomic_t to refcount_t
ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
ipc: convert ipc_namespace.count from atomic_t to refcount_t
kcov: support compat processes
sh: defconfig: cleanup from old Kconfig options
mn10300: defconfig: cleanup from old Kconfig options
m32r: defconfig: cleanup from old Kconfig options
drivers/pps: use surrounding "if PPS" to remove numerous dependency checks
drivers/pps: aesthetic tweaks to PPS-related content
cpumask: make cpumask_next() out-of-line
kmod: move #ifdef CONFIG_MODULES wrapper to Makefile
kmod: split off umh headers into its own file
MAINTAINERS: clarify kmod is just a kernel module loader
kmod: split out umh code into its own file
test_kmod: flip INT checks to be consistent
test_kmod: remove paranoid UINT_MAX check on uint range processing
vfat: deduplicate hex2bin()
...
|
|
I removed all the gperf use, but not the Makefile rules. Sam Ravnborg
says I get bonus points for cleaning this up. I'll hold him to it.
Requested-by: Sam Ravnborg <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
It's pretty much a duplicate of nfs_scan_commit_list() that also
clears the PG_COMMIT_TO_DS flag.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Since the commit list is not ordered, it is possible for nfs_scan_commit_list
to hold a request that nfs_lock_and_join_requests() is waiting for, while
at the same time trying to grab a request that nfs_lock_and_join_requests
already holds.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
This reverts commit 1fccb73011ea8a5fa0c6d357c33fa29c695139ea.
Reported as Bug 196509 - iTCO_wdt regression reboot before timeout expire
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
|
|
The kernel watchdog is a great debugging tool for finding tasks that
consume a disproportionate amount of CPU time in contiguous chunks. One
can imagine building a similar watchdog for arbitrary driver threads
using save_stack_trace_tsk() and print_stack_trace(). However, this is
not viable for dynamically loaded driver modules on ARM platforms
because save_stack_trace_tsk() is not exported for those architectures.
Export save_stack_trace_tsk() for the ARM architecture to align with
x86 and support various debugging use cases such as arbitrary driver
thread watchdog timers.
Signed-off-by: Dustin Brown <[email protected]>
Signed-off-by: Russell King <[email protected]>
|
|
Differ between illegal XDP action code and just driver
unsupported one to provide better feedback when we throw
a one-time warning here. Reason is that with 814abfabef3c
("xdp: add bpf_redirect helper function") not all drivers
support the new XDP return code yet and thus they will
fall into their 'default' case when checking for return
codes after program return, which then triggers a
bpf_warn_invalid_xdp_action() stating that the return
code is illegal, but from XDP perspective it's not.
I decided not to place something like a XDP_ACT_MAX define
into uapi i) given we don't have this either for all other
program types, ii) future action codes could have further
encoding there, which would render such define unsuitable
and we wouldn't be able to rip it out again, and iii) we
rarely add new action codes.
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This reverts commit 95b80bf3db03c2bf572a357cf74b9a6aefef0a4a ("mdio_bus:
Remove unneeded gpiod NULL check"), this commit assumed that GPIOLIB
checks for NULL descriptors, so it's safe to drop them, but it is not
when CONFIG_GPIOLIB is disabled in the kernel. If we do call
gpiod_set_value_cansleep() on a GPIO descriptor we will issue warnings
coming from the inline stubs declared in include/linux/gpio/consumer.h.
Fixes: 95b80bf3db03 ("mdio_bus: Remove unneeded gpiod NULL check")
Reported-by: Woojung Huh <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
John Fastabend says:
====================
net: Fixes for XDP/BPF
The following fixes, UAPI updates, and small improvement,
i. XDP needs to be called inside RCU with preempt disabled.
ii. Not strictly a bug fix but we have an attach command in the
sockmap UAPI already to avoid having a single kernel released with
only the attach and not the detach I'm pushing this into net branch.
Its early in the RC cycle so I think this is OK (not ideal but better
than supporting a UAPI with a missing detach forever).
iii. Final patch replace cpu_relax with cond_resched in devmap.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Be a bit more friendly about waiting for flush bits to complete.
Replace the cpu_relax() with a cond_resched().
Suggested-by: Daniel Borkmann <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The bpf map sockmap supports adding programs via attach commands. This
patch adds the detach command to keep the API symmetric and allow
users to remove previously added programs. Otherwise the user would
have to delete the map and re-add it to get in this state.
This also adds a series of additional tests to capture detach operation
and also attaching/detaching invalid prog types.
API note: socks will run (or not run) programs depending on the state
of the map at the time the sock is added. We do not for example walk
the map and remove programs from previously attached socks.
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
do_xdp_generic must be called inside rcu critical section with preempt
disabled to ensure BPF programs are valid and per-cpu variables used
for redirect operations are consistent. This patch ensures this is true
and fixes the splat below.
The netif_receive_skb_internal() code path is now broken into two rcu
critical sections. I decided it was better to limit the preempt_enable/disable
block to just the xdp static key portion and the fallout is more
rcu_read_lock/unlock calls. Seems like the best option to me.
[ 607.596901] =============================
[ 607.596906] WARNING: suspicious RCU usage
[ 607.596912] 4.13.0-rc4+ #570 Not tainted
[ 607.596917] -----------------------------
[ 607.596923] net/core/dev.c:3948 suspicious rcu_dereference_check() usage!
[ 607.596927]
[ 607.596927] other info that might help us debug this:
[ 607.596927]
[ 607.596933]
[ 607.596933] rcu_scheduler_active = 2, debug_locks = 1
[ 607.596938] 2 locks held by pool/14624:
[ 607.596943] #0: (rcu_read_lock_bh){......}, at: [<ffffffff95445ffd>] ip_finish_output2+0x14d/0x890
[ 607.596973] #1: (rcu_read_lock_bh){......}, at: [<ffffffff953c8e3a>] __dev_queue_xmit+0x14a/0xfd0
[ 607.597000]
[ 607.597000] stack backtrace:
[ 607.597006] CPU: 5 PID: 14624 Comm: pool Not tainted 4.13.0-rc4+ #570
[ 607.597011] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS A17 03/01/2017
[ 607.597016] Call Trace:
[ 607.597027] dump_stack+0x67/0x92
[ 607.597040] lockdep_rcu_suspicious+0xdd/0x110
[ 607.597054] do_xdp_generic+0x313/0xa50
[ 607.597068] ? time_hardirqs_on+0x5b/0x150
[ 607.597076] ? mark_held_locks+0x6b/0xc0
[ 607.597088] ? netdev_pick_tx+0x150/0x150
[ 607.597117] netif_rx_internal+0x205/0x3f0
[ 607.597127] ? do_xdp_generic+0xa50/0xa50
[ 607.597144] ? lock_downgrade+0x2b0/0x2b0
[ 607.597158] ? __lock_is_held+0x93/0x100
[ 607.597187] netif_rx+0x119/0x190
[ 607.597202] loopback_xmit+0xfd/0x1b0
[ 607.597214] dev_hard_start_xmit+0x127/0x4e0
Fixes: d445516966dc ("net: xdp: support xdp generic on virtual devices")
Fixes: b5cdae3291f7 ("net: Generic XDP")
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri->map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:
i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri->map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.
ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.
iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri->map (will be fixed by Jesper), so it's
not being reset. Later loading a e.g. native prog B which would,
say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and uses that causing
use after free on map access.
Fix thus needs to avoid accessing stale ri->map pointers, naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri->map
being handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.
The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to the own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used as per calling convention we scratch
R0 to R5 anyway, so they become inaccessible and program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.
Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.
Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It looks like all users of tulip_tbl are reads, so mark this table
as read-only.
$ git grep tulip_tbl # edited to avoid line-wraps...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs&~RxPollInt, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs | TimerInt,
pnic.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
tulip.h: extern struct tulip_chip_table tulip_tbl[];
tulip_core.c:struct tulip_chip_table tulip_tbl[] = {
tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR5);
tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer,
tulip_core.c:const char *chip_name = tulip_tbl[chip_idx].chip_name;
tulip_core.c:if (pci_resource_len (pdev, 0) < tulip_tbl[chip_idx].io_size)
tulip_core.c:ioaddr = pci_iomap(..., tulip_tbl[chip_idx].io_size);
tulip_core.c:tp->flags = tulip_tbl[chip_idx].flags;
tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer,
tulip_core.c:INIT_WORK(&tp->media_work, tulip_tbl[tp->chip_id].media_task);
Cc: "David S. Miller" <[email protected]>
Cc: Jarod Wilson <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Don't remove rx_napi specifically just before free_netdev(),
it's supposed to be done in it and is confusing w/o tx_napi deletion.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This will make it explicit some messages are of the form:
dm9000_dbg(db, 5, ...
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Make the Rx rate select control gpio property name match the documented
binding. This would make the addition of 'rate-select1-gpios' for SFP+
support more natural.
Also, make the MOD-DEF0 gpio property name match the documentation.
Signed-off-by: Baruch Siach <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add device-tree binding documentation SFP transceivers. Support for SFP
transceivers has been recently introduced (drivers/net/phy/sfp.c).
Signed-off-by: Baruch Siach <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Baruch Siach <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This complements commit 9a94b3a4bd (dt-binding: phy: don't confuse with
Ethernet phy properties).
The generic PHY 'phys' property sometime appears in the same node with
the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
ethernet.txt to reduce confusion.
Signed-off-by: Baruch Siach <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Similar to vxlan/geneve tunnel, if hop_limit is zero, it should fall
back to ip6_dst_hoplimt().
Signed-off-by: Haishuang Yan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
ttl and tos variables are declared and assigned, but are not used in
iptunnel_xmit() function.
Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Cc: Alexei Starovoitov <[email protected]>
Signed-off-by: Haishuang Yan <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add zstd compression and decompression support to SquashFS. zstd is a
great fit for SquashFS because it can compress at ratios approaching xz,
while decompressing twice as fast as zlib. For SquashFS in particular,
it can decompress as fast as lzo and lz4. It also has the flexibility
to turn down the compression ratio for faster compression times.
The compression benchmark is run on the file tree from the SquashFS archive
found in ubuntu-16.10-desktop-amd64.iso [1]. It uses `mksquashfs` with the
default block size (128 KB) and and various compression algorithms/levels.
xz and zstd are also benchmarked with 256 KB blocks. The decompression
benchmark times how long it takes to `tar` the file tree into `/dev/null`.
See the benchmark file in the upstream zstd source repository located under
`contrib/linux-kernel/squashfs-benchmark.sh` [2] for details.
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.
| Method | Ratio | Compression MB/s | Decompression MB/s |
|----------------|-------|------------------|--------------------|
| gzip | 2.92 | 15 | 128 |
| lzo | 2.64 | 9.5 | 217 |
| lz4 | 2.12 | 94 | 218 |
| xz | 3.43 | 5.5 | 35 |
| xz 256 KB | 3.53 | 5.4 | 40 |
| zstd 1 | 2.71 | 96 | 210 |
| zstd 5 | 2.93 | 69 | 198 |
| zstd 10 | 3.01 | 41 | 225 |
| zstd 15 | 3.13 | 11.4 | 224 |
| zstd 16 256 KB | 3.24 | 8.1 | 210 |
This patch was written by Sean Purcell <[email protected]>, but I will be
taking over the submission process.
[1] http://releases.ubuntu.com/16.10/
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh
zstd source repository: https://github.com/facebook/zstd
Signed-off-by: Sean Purcell <[email protected]>
Signed-off-by: Nick Terrell <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Acked-by: Phillip Lougher <[email protected]>
|
|
The writeback code wants to send a commit after processing the pages,
which is why we want to delay releasing the struct path until after
that's done.
Also, the layout code expects that we do not free the inode before
we've put the layout segments in pnfs_writehdr_free() and
pnfs_readhdr_free()
Fixes: 919e3bd9a875 ("NFS: Ensure we commit after writeback is complete")
Fixes: 4714fb51fd03 ("nfs: remove pgio_header refcount, related cleanup")
Cc: [email protected]
Signed-off-by: Trond Myklebust <[email protected]>
|
|
ipc_findkey() used to scan all objects to look for the wanted key. This
is slow when using a high number of keys. This change adds an rhashtable
of kern_ipc_perm objects in ipc_ids, so that one lookup cease to be O(n).
This change gives a 865% improvement of benchmark reaim.jobs_per_min on a
56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory [1]
Other (more micro) benchmark results, by the author: On an i5 laptop, the
following loop executed right after a reboot took, without and with this
change:
for (int i = 0, k=0x424242; i < KEYS; ++i)
semget(k++, 1, IPC_CREAT | 0600);
total total max single max single
KEYS without with call without call with
1 3.5 4.9 µs 3.5 4.9
10 7.6 8.6 µs 3.7 4.7
32 16.2 15.9 µs 4.3 5.3
100 72.9 41.8 µs 3.7 4.7
1000 5,630.0 502.0 µs * *
10000 1,340,000.0 7,240.0 µs * *
31900 17,600,000.0 22,200.0 µs * *
*: unreliable measure: high variance
The duration for a lookup-only usage was obtained by the same loop once
the keys are present:
total total max single max single
KEYS without with call without call with
1 2.1 2.5 µs 2.1 2.5
10 4.5 4.8 µs 2.2 2.3
32 13.0 10.8 µs 2.3 2.8
100 82.9 25.1 µs * 2.3
1000 5,780.0 217.0 µs * *
10000 1,470,000.0 2,520.0 µs * *
31900 17,400,000.0 7,810.0 µs * *
Finally, executing each semget() in a new process gave, when still
summing only the durations of these syscalls:
creation:
total total
KEYS without with
1 3.7 5.0 µs
10 32.9 36.7 µs
32 125.0 109.0 µs
100 523.0 353.0 µs
1000 20,300.0 3,280.0 µs
10000 2,470,000.0 46,700.0 µs
31900 27,800,000.0 219,000.0 µs
lookup-only:
total total
KEYS without with
1 2.5 2.7 µs
10 25.4 24.4 µs
32 106.0 72.6 µs
100 591.0 352.0 µs
1000 22,400.0 2,250.0 µs
10000 2,510,000.0 25,700.0 µs
31900 28,200,000.0 115,000.0 µs
[1] http://lkml.kernel.org/r/20170814060507.GE23258@yexl-desktop
Link: http://lkml.kernel.org/r/20170815194954.ck32ta2z35yuzpwp@debix
Signed-off-by: Guillaume Knispel <[email protected]>
Reviewed-by: Marc Pardo <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: Andrey Vagin <[email protected]>
Cc: Guillaume Knispel <[email protected]>
Cc: Marc Pardo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replacing semop()'s kmalloc for kvmalloc was originally proposed by
Manfred on the premise that it can be called for large (than order-1)
sizes. For example, while Oracle recommends setting SEMOPM to a _minimum_
of 100, some distros[1] encourage the setting to be a factor of the amount
of db tasks (PROCESSES), which can get fishy for large systems (easily
going beyond 1000).
[1] An Example of Semaphore Settings
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Tuning_and_Optimizing_Red_Hat_Enterprise_Linux_for_Oracle_9i_and_10g_Databases/sect-Oracle_9i_and_10g_Tuning_Guide-Setting_Semaphores-An_Example_of_Semaphore_Settings.html
So let's just convert this to kvmalloc, just like the rest of the
allocations we do in ipc. While the fallback vmalloc obviously involves
more overhead, this by far the uncommon path, and it's better for the user
than just erroring out with kmalloc.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
... 'tis not used.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter. This allows to avoid
accidental refcounter overflows that might lead to use-after-free
situations.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter. This allows to avoid
accidental refcounter overflows that might lead to use-after-free
situations.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter. This allows to avoid
accidental refcounter overflows that might lead to use-after-free
situations.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Support compat processes in KCOV by providing compat_ioctl callback.
Compat mode uses the same ioctl callback: we have 2 commands that do not
use the argument and 1 that already checks that the arg does not overflow
INT_MAX. This allows to use KCOV-guided fuzzing in compat processes.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dmitry Vyukov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove old, dead Kconfig options (in order appearing in this commit):
- EXPERIMENTAL is gone since v3.9;
- INET_LRO: commit 7bbf3cae65b6 ("ipv4: Remove inet_lro library");
- MTD_CONCAT: commit f53fdebcc3e1 ("mtd: drop MTD_CONCAT from Kconfig
entirely");
- MTD_PARTITIONS: commit 6a8a98b22b10 ("mtd: kill
CONFIG_MTD_PARTITIONS");
- MTD_CHAR: commit 660685d9d1b4 ("mtd: merge mtdchar module with
mtdcore");
- NETDEV_1000 and NETDEV_10000: commit f860b0522f65 ("drivers/net:
Kconfig and Makefile cleanup"); NET_ETHERNET should be replaced with
just ETHERNET but that is separate change;
- HID_SUPPORT: commit 1f41a6a99476 ("HID: Fix the generic Kconfig
options");
- RCU_CPU_STALL_DETECTOR: commit a00e0d714fbd ("rcu: Remove conditional
compilation for RCU CPU stall warnings");
- SYSCTL_SYSCALL_CHECK: commit 7c60c48f58a7 ("sysctl: Improve the
sysctl sanity checks");
- VIDEO_OUTPUT_CONTROL: commit f167a64e9d67 ("video / output: Drop
display output class support");
- MISC_DEVICES: commit 7c5763b8453a ("drivers: misc: Remove
MISC_DEVICES config option");
- AUTOFS_FS: commit 561c5cf9236a ("staging: Remove autofs3");
- IP_NF_QUEUE: commit 3dd6664fac7e ("netfilter: remove unused "config
IP_NF_QUEUE"");
- USB_DEVICE_CLASS: commit 007bab91324e ("USB: remove
CONFIG_USB_DEVICE_CLASS");
- USB_LIBUSUAL: commit f61870ee6f8c ("usb: remove libusual");
- DISPLAY_SUPPORT: commit 5a6b5e02d673 ("fbdev: remove display
subsystem");
- IP_NF_TARGET_ULOG: commit d4da843e6fad ("netfilter: kill remnants of
ulog targets");
- IP6_NF_QUEUE: commit d16cf20e2f2f ("netfilter: remove ip_queue
support");
- IP6_NF_TARGET_LOG: commit 6939c33a757b ("netfilter: merge ipt_LOG and
ip6_LOG into xt_LOG");
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove old, dead Kconfig options (in order appearing in this commit):
- EXPERIMENTAL is gone since v3.9;
- INET_LRO: commit 7bbf3cae65b6 ("ipv4: Remove inet_lro library");
- MTD_PARTITIONS: commit 6a8a98b22b10 ("mtd: kill
CONFIG_MTD_PARTITIONS");
- MTD_CHAR: commit 660685d9d1b4 ("mtd: merge mtdchar module with
mtdcore");
- NETDEV_1000 and NETDEV_10000: commit f860b0522f65 ("drivers/net:
Kconfig and Makefile cleanup"); NET_ETHERNET should be replaced with
just ETHERNET but that is separate change;
- HID_SUPPORT: commit 1f41a6a99476 ("HID: Fix the generic Kconfig
options");
- RCU_CPU_STALL_DETECTOR: commit a00e0d714fbd ("rcu: Remove conditional
compilation for RCU CPU stall warnings");
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove old, dead Kconfig options (in order appearing in this commit):
- EXPERIMENTAL is gone since v3.9;
- IP_NF_QUEUE: commit 3dd6664fac7e ("netfilter: remove unused "config
IP_NF_QUEUE"");
- IP_NF_TARGET_ULOG: commit d4da843e6fad ("netfilter: kill remnants of
ulog targets");
- VIDEO_OUTPUT_CONTROL: commit f167a64e9d67 ("video / output: Drop
display output class support");
- MTD_PARTITIONS: commit 6a8a98b22b10 ("mtd: kill
CONFIG_MTD_PARTITIONS");
- MTD_CHAR: commit 660685d9d1b4 ("mtd: merge mtdchar module with
mtdcore");
- MTD_CONCAT: commit f53fdebcc3e1 ("mtd: drop MTD_CONCAT from Kconfig
entirely");
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Adding high-level "if PPS" makes lower-level dependency tests superfluous.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Robert P. J. Day <[email protected]>
Acked-by: Rodolfo Giometti <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Collection of aesthetic adjustments to various PPS-related files,
directories and Documentation, some quite minor just for the sake of
consistency, including:
* Updated example of pps device tree node (courtesy Rodolfo G.)
* "PPS-API" -> "PPS API"
* "pps_source_info_s" -> "pps_source_info"
* "ktimer driver" -> "pps-ktimer driver"
* "ppstest /dev/pps0" -> "ppstest /dev/pps1" to match example
* Add missing PPS-related entries to MAINTAINERS file
* Other trivialities
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Robert P. J. Day <[email protected]>
Acked-by: Rodolfo Giometti <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Every for_each_XXX_cpu() invocation calls cpumask_next() which is an
inline function:
static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
{
/* -1 is a legal arg here. */
if (n != -1)
cpumask_check(n);
return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
}
However!
find_next_bit() is regular out-of-line function which means "nr_cpu_ids"
load and increment happen at the caller resulting in a lot of bloat
x86_64 defconfig:
add/remove: 3/0 grow/shrink: 8/373 up/down: 155/-5668 (-5513)
x86_64 allyesconfig-ish:
add/remove: 3/1 grow/shrink: 57/634 up/down: 3515/-28177 (-24662) !!!
Some archs redefine find_next_bit() but it is OK:
m68k inline but SMP is not supported
arm out-of-line
unicore32 out-of-line
Function call will happen anyway, so move load and increment into callee.
Link: http://lkml.kernel.org/r/20170824230010.GA1593@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The entire file is now conditionally compiled only when CONFIG_MODULES is
enabled, and this this is a bool. Just move this conditional to the
Makefile as its easier to read this way.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Matt Redfearn <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Daniel Mentz <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In the future usermode helper users do not need to carry in all the of
kmod headers declarations.
Since kmod.h still includes umh.h this change has no functional changes,
each umh user can be cleaned up separately later and with time.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Matt Redfearn <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Daniel Mentz <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This should make it clearer what the kmod code is now that
the umh code is split out separately.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Matt Redfearn <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Daniel Mentz <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "kmod: few code cleanups to split out umh code"
The usermode helper has a provenance from the old usb code which first
required a usermode helper. Eventually this was shoved into kmod.c and
the kernel's modprobe calls was converted over eventually to share the
same code. Over time the list of usermode helpers in the kernel has grown
-- so kmod is just but one user of the API.
This series is a simple logical cleanup which acknowledges the code
evolution of the usermode helper and shoves the UMH API into its own
dedicated file. This way users of the API can later just include umh.h
instead of kmod.h.
Note despite the diff state the first patch really is just a code shove,
no functional changes are done there. I did use git format-patch -M to
generate the patch, but in the end the split was not enough for git to
consider it a rename hence the large diffstat.
I've put this through 0-day and it gives me their machine compilation
blessings with all tests as OK.
This patch (of 4):
There's a slew of usermode helper users and kmod is just one of them.
Split out the usermode helper code into its own file to keep the logic and
focus split up.
This change provides no functional changes.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Matt Redfearn <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Daniel Mentz <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Most checks will check for min and then max, except the int check. Flip
the checks to be consistent with the other code.
[[email protected]: massaged commit log]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: David Binderman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The UINT_MAX comparison is not needed because "max" is already an unsigned
int, and we expect developer C code max value input to have a sensible 0 -
UINT_MAX range. Note that if it so happens to be UINT_MAX + 1 it would
lead to an issue, but we expect the developer to know this.
[[email protected]: massaged commit log]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: David Binderman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We may use hex2bin() instead of custom approach.
Link: http://lkml.kernel.org/r/87zibktpil.fsf@devron
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: OGAWA Hirofumi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|