blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2018-07-23	devres: Add devm_of_iomap()	Benjamin Herrenschmidt	1	-0/+36
	There are still quite a few cases where a device might want to get to a different node of the device-tree, obtain the resources and map them. We have of_iomap() and of_io_request_and_map() but they both have shortcomings, such as not returning the size of the resource found (which can be useful) and not being "managed". This adds a devm_of_iomap() that provides all of these and should probably replace uses of the above in most drivers. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Reviewed-by: Joel Stanley <[email protected]>
2018-07-21	Merge branch 'core-urgent-for-linus' of ↵	Linus Torvalds	1	-4/+73
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core kernel fixes from Ingo Molnar: "This is mostly the copy_to_user_mcsafe() related fixes from Dan Williams, and an ORC fix for Clang" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/memcpy_mcsafe: Fix copy_to_user_mcsafe() exception handling lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe() lib/iov_iter: Document _copy_to_iter_flushcache() lib/iov_iter: Document _copy_to_iter_mcsafe() objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang
2018-07-20	kobject: kset_create_and_add() - fetch ownership info from parent	Dmitry Torokhov	1	-1/+8
	This change implements get_ownership() for ksets created with kset_create_and_add() call by fetching ownership data from parent kobject. This is done mostly for benefit of "queues" attribute of net devices so that corresponding directory belongs to container's root instead of global root for network devices in a container. Signed-off-by: Dmitry Torokhov <[email protected]> Reviewed-by: Tyler Hicks <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-20	sysfs, kobject: allow creating kobject belonging to arbitrary users	Dmitry Torokhov	1	-0/+19
	Normally kobjects and their sysfs representation belong to global root, however it is not necessarily the case for objects in separate namespaces. For example, objects in separate network namespace logically belong to the container's root and not global root. This change lays groundwork for allowing network namespace objects ownership to be transferred to container's root user by defining get_ownership() callback in ktype structure and using it in sysfs code to retrieve desired uid/gid when creating sysfs objects for given kobject. Co-Developed-by: Tyler Hicks <[email protected]> Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Tyler Hicks <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-20	Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux	David S. Miller	1	-8/+19
	All conflicts were trivial overlapping changes, so reasonably easy to resolve. Signed-off-by: David S. Miller <[email protected]>
2018-07-18	lib/rhashtable: consider param->min_size when setting initial table size	Davidlohr Bueso	1	-6/+11
	rhashtable_init() currently does not take into account the user-passed min_size parameter unless param->nelem_hint is set as well. As such, the default size (number of buckets) will always be HASH_DEFAULT_SIZE even if the smallest allowed size is larger than that. Remediate this by unconditionally calling into rounded_hashtable_size() and handling things accordingly. Signed-off-by: Davidlohr Bueso <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-17	vsprintf: Add command line option debug_boot_weak_hash	Tobin C. Harding	1	-0/+17
	Currently printing [hashed] pointers requires enough entropy to be available. Early in the boot sequence this may not be the case resulting in a dummy string '(____ptrval____)' being printed. This makes debugging the early boot sequence difficult. We can relax the requirement to use cryptographically secure hashing during debugging. This enables debugging while keeping development/production kernel behaviour the same. If new command line option debug_boot_weak_hash is enabled use cryptographically insecure hashing and hash pointer value immediately. Reviewed-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Tobin C. Harding <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2018-07-17	vsprintf: Use hw RNG for ptr_key	Tobin C. Harding	1	-1/+9
	Currently we must wait for enough entropy to become available before hashed pointers can be printed. We can remove this wait by using the hw RNG if available. Use hw RNG to get keying material. Reviewed-by: Steven Rostedt (VMware) <[email protected]> Suggested-by: Kees Cook <[email protected]> Signed-off-by: Tobin C. Harding <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2018-07-17	Merge tag 'v4.18-rc5' into x86/mm, to pick up fixes	Ingo Molnar	1	-0/+20
	Signed-off-by: Ingo Molnar <[email protected]>
2018-07-17	Merge tag 'v4.18-rc5' into locking/core, to pick up fixes	Ingo Molnar	8	-19/+67
	Signed-off-by: Ingo Molnar <[email protected]>
2018-07-16	lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()	Dan Williams	1	-4/+33
	By mistake the ITER_PIPE early-exit / warning from copy_from_iter() was cargo-culted in _copy_to_iter_mcsafe() rather than a machine-check-safe version of copy_to_iter_pipe(). Implement copy_pipe_to_iter_mcsafe() being careful to return the indication of short copies due to a CPU exception. Without this regression-fix all splice reads to dax-mode files fail. Reported-by: Ross Zwisler <[email protected]> Tested-by: Ross Zwisler <[email protected]> Signed-off-by: Dan Williams <[email protected]> Acked-by: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Fixes: 8780356ef630 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()") Link: http://lkml.kernel.org/r/153108277278.37979.3327916996902264102.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2018-07-16	lib/iov_iter: Document _copy_to_iter_flushcache()	Dan Williams	1	-0/+14
	Add some theory of operation documentation to _copy_to_iter_flushcache(). Reported-by: Al Viro <[email protected]> Signed-off-by: Dan Williams <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/153108276767.37979.9462477994086841699.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2018-07-16	lib/iov_iter: Document _copy_to_iter_mcsafe()	Dan Williams	1	-0/+26
	Add some theory of operation documentation to _copy_to_iter_mcsafe(). Reported-by: Al Viro <[email protected]> Signed-off-by: Dan Williams <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/153108276256.37979.1689794213845539316.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2018-07-13	locking/refcount: Always allow checked forms	Mark Rutland	1	-28/+25
	In many cases, it would be useful to be able to use the full sanity-checked refcount helpers regardless of CONFIG_REFCOUNT_FULL, as this would help to avoid duplicate warnings where callers try to sanity-check refcount manipulation. This patch refactors things such that the full refcount helpers were always built, as refcount_${op}_checked(), such that they can be used regardless of CONFIG_REFCOUNT_FULL. This will allow code which always wants a checked refcount to opt-in, avoiding the need to duplicate the logic for warnings. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: David Sterba <[email protected]> Acked-by: Kees Cook <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-07-10	reed_solomon: Fix kernel-doc	Matthew Wilcox	1	-1/+1
	The current doc build warns: ./lib/reed_solomon/reed_solomon.c:287: WARNING: Unknown target name: "gfp". This is because it misinterprets the "GFP_" that is part of the description. Change the description to avoid the problem. Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2018-07-09	rhashtable: add restart routine in rhashtable_free_and_destroy()	Taehee Yoo	1	-1/+7
	rhashtable_free_and_destroy() cancels re-hash deferred work then walks and destroys elements. at this moment, some elements can be still in future_tbl. that elements are not destroyed. test case: nft_rhash_destroy() calls rhashtable_free_and_destroy() to destroy all elements of sets before destroying sets and chains. But rhashtable_free_and_destroy() doesn't destroy elements of future_tbl. so that splat occurred. test script: %cat test.nft table ip aa { map map1 { type ipv4_addr : verdict; elements = { 0 : jump a0, 1 : jump a0, 2 : jump a0, 3 : jump a0, 4 : jump a0, 5 : jump a0, 6 : jump a0, 7 : jump a0, 8 : jump a0, 9 : jump a0, } } chain a0 { } } flush ruleset table ip aa { map map1 { type ipv4_addr : verdict; elements = { 0 : jump a0, 1 : jump a0, 2 : jump a0, 3 : jump a0, 4 : jump a0, 5 : jump a0, 6 : jump a0, 7 : jump a0, 8 : jump a0, 9 : jump a0, } } chain a0 { } } flush ruleset %while :; do nft -f test.nft; done Splat looks like: [ 200.795603] kernel BUG at net/netfilter/nf_tables_api.c:1363! [ 200.806944] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 200.812253] CPU: 1 PID: 1582 Comm: nft Not tainted 4.17.0+ #24 [ 200.820297] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [ 200.830309] RIP: 0010:nf_tables_chain_destroy.isra.34+0x62/0x240 [nf_tables] [ 200.838317] Code: 43 50 85 c0 74 26 48 8b 45 00 48 8b 4d 08 ba 54 05 00 00 48 c7 c6 60 6d 29 c0 48 c7 c7 c0 65 29 c0 4c 8b 40 08 e8 58 e5 fd f8 <0f> 0b 48 89 da 48 b8 00 00 00 00 00 fc ff [ 200.860366] RSP: 0000:ffff880118dbf4d0 EFLAGS: 00010282 [ 200.866354] RAX: 0000000000000061 RBX: ffff88010cdeaf08 RCX: 0000000000000000 [ 200.874355] RDX: 0000000000000061 RSI: 0000000000000008 RDI: ffffed00231b7e90 [ 200.882361] RBP: ffff880118dbf4e8 R08: ffffed002373bcfb R09: ffffed002373bcfa [ 200.890354] R10: 0000000000000000 R11: ffffed002373bcfb R12: dead000000000200 [ 200.898356] R13: dead000000000100 R14: ffffffffbb62af38 R15: dffffc0000000000 [ 200.906354] FS: 00007fefc31fd700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000 [ 200.915533] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 200.922355] CR2: 0000557f1c8e9128 CR3: 0000000106880000 CR4: 00000000001006e0 [ 200.930353] Call Trace: [ 200.932351] ? nf_tables_commit+0x26f6/0x2c60 [nf_tables] [ 200.939525] ? nf_tables_setelem_notify.constprop.49+0x1a0/0x1a0 [nf_tables] [ 200.947525] ? nf_tables_delchain+0x6e0/0x6e0 [nf_tables] [ 200.952383] ? nft_add_set_elem+0x1700/0x1700 [nf_tables] [ 200.959532] ? nla_parse+0xab/0x230 [ 200.963529] ? nfnetlink_rcv_batch+0xd06/0x10d0 [nfnetlink] [ 200.968384] ? nfnetlink_net_init+0x130/0x130 [nfnetlink] [ 200.975525] ? debug_show_all_locks+0x290/0x290 [ 200.980363] ? debug_show_all_locks+0x290/0x290 [ 200.986356] ? sched_clock_cpu+0x132/0x170 [ 200.990352] ? find_held_lock+0x39/0x1b0 [ 200.994355] ? sched_clock_local+0x10d/0x130 [ 200.999531] ? memset+0x1f/0x40 V2: - free all tables requested by Herbert Xu Signed-off-by: Taehee Yoo <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-09	printk/nmi: Prevent deadlock when accessing the main log buffer in NMI	Petr Mladek	1	-3/+0
	The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available") brought back the possible deadlocks in printk() and NMI. The check of logbuf_lock is done only in printk_nmi_enter() to prevent mixed output. But another CPU might take the lock later, enter NMI, and: + Both NMIs might be serialized by yet another lock, for example, the one in nmi_cpu_backtrace(). + The other CPU might get stopped in NMI, see smp_send_stop() in panic(). The only safe solution is to use trylock when storing the message into the main log-buffer. It might cause reordering when some lines go to the main lock buffer directly and others are delayed via the per-CPU buffer. It means that it is not useful in general. This patch replaces the problematic NMI deferred context with NMI direct context. It can be used to mark a code that might produce many messages in NMI and the risk of losing them is more critical than problems with eventual reordering. The context is then used when dumping trace buffers on oops. It was the primary motivation for the original fix. Also the reordering is even smaller issue there because some traces have their own time stamps. Finally, nmi_cpu_backtrace() need not longer be serialized because it will always us the per-CPU buffers again. Fixes: 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available") Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] To: Steven Rostedt <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Petr Mladek <[email protected]>
2018-07-07	kobject: Replace strncpy with memcpy	Guenter Roeck	1	-1/+1
	gcc 8.1.0 complains: lib/kobject.c:128:3: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation] lib/kobject.c: In function 'kobject_get_path': lib/kobject.c:125:13: note: length computed here Using strncpy() is indeed less than perfect since the length of data to be copied has already been determined with strlen(). Replace strncpy() with memcpy() to address the warning and optimize the code a little. Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-07-07	lib: reciprocal_div: implement the improved algorithm on the paper mentioned	Jiong Wang	1	-0/+41
	The new added "reciprocal_value_adv" implements the advanced version of the algorithm described in Figure 4.2 of the paper except when "divisor > (1U << 31)" whose ceil(log2(d)) result will be 32 which then requires u128 divide on host. The exception case could be easily handled before calling "reciprocal_value_adv". The advanced version requires more complex calculation to get the reciprocal multiplier and other control variables, but then could reduce the required emulation operations. It makes no sense to use this advanced version for host divide emulation, those extra complexities for calculating multiplier etc could completely waive our saving on emulation operations. However, it makes sense to use it for JIT divide code generation (for example eBPF JIT backends) for which we are willing to trade performance of JITed code with that of host. As shown by the following pseudo code, the required emulation operations could go down from 6 (the basic version) to 3 or 4. To use the result of "reciprocal_value_adv", suppose we want to calculate n/d, the C-style pseudo code will be the following, it could be easily changed to real code generation for other JIT targets. struct reciprocal_value_adv rvalue; u8 pre_shift, exp; // handle exception case. if (d >= (1U << 31)) { result = n >= d; return; } rvalue = reciprocal_value_adv(d, 32) exp = rvalue.exp; if (rvalue.is_wide_m && !(d & 1)) { // floor(log2(d & (2^32 -d))) pre_shift = fls(d & -d) - 1; rvalue = reciprocal_value_adv(d >> pre_shift, 32 - pre_shift); } else { pre_shift = 0; } // code generation starts. if (imm == 1U << exp) { result = n >> exp; } else if (rvalue.is_wide_m) { // pre_shift must be zero when reached here. t = (n * rvalue.m) >> 32; result = n - t; result >>= 1; result += t; result >>= rvalue.sh - 1; } else { if (pre_shift) result = n >> pre_shift; result = ((u64)result * rvalue.m) >> 32; result >>= rvalue.sh; } Signed-off-by: Jiong Wang <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-07-06	Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux ↵	Dave Airlie	1	-1/+1
	into drm-next A patchset worked out together with Peter Zijlstra. Ingo is OK with taking it through the DRM tree: This is a small fallout from a work to allow batching WW mutex locks and unlocks. Our Wound-Wait mutexes actually don't use the Wound-Wait algorithm but the Wait-Die algorithm. One could perhaps rename those mutexes tree-wide to "Wait-Die mutexes" or "Deadlock Avoidance mutexes". Another approach suggested here is to implement also the "Wound-Wait" algorithm as a per-WW-class choice, as it has advantages in some cases. See for example http://www.mathcs.emory.edu/~cheung/Courses/554/Syllabus/8-recv+serial/deadlock-compare.html Now Wound-Wait is a preemptive algorithm, and the preemption is implemented using a lazy scheme: If a wounded transaction is about to go to sleep on a contended WW mutex, we return -EDEADLK. That is sufficient for deadlock prevention. Since with WW mutexes we also require the aborted transaction to sleep waiting to lock the WW mutex it was aborted on, this choice also provides a suitable WW mutex to sleep on. If we were to return -EDEADLK on the first WW mutex lock after the transaction was wounded whether the WW mutex was contended or not, the transaction might frequently be restarted without a wait, which is far from optimal. Note also that with the lazy preemption scheme, contrary to Wait-Die there will be no rollbacks on lock contention of locks held by a transaction that has completed its locking sequence. The modeset locks are then changed from Wait-Die to Wound-Wait since the typical locking pattern of those locks very well matches the criterion for a substantial reduction in the number of rollbacks. For reservation objects, the benefit is more unclear at this point and they remain using Wait-Die. Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-07-04	ioremap: Update pgtable free interfaces with addr	Chintan Pandya	1	-2/+2
	The following kernel panic was observed on ARM64 platform due to a stale TLB entry. 1. ioremap with 4K size, a valid pte page table is set. 2. iounmap it, its pte entry is set to 0. 3. ioremap the same address with 2M size, update its pmd entry with a new value. 4. CPU may hit an exception because the old pmd entry is still in TLB, which leads to a kernel panic. Commit b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page table") has addressed this panic by falling to pte mappings in the above case on ARM64. To support pmd mappings in all cases, TLB purge needs to be performed in this case on ARM64. Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page() so that TLB purge can be added later in seprate patches. [[email protected]: merge changes, rewrite patch description] Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces") Signed-off-by: Chintan Pandya <[email protected]> Signed-off-by: Toshi Kani <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Will Deacon <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: [email protected] Cc: Andrew Morton <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-07-04	RAID/s390: Remove VLA usage	Kees Cook	1	-16/+18
	In the quest to remove all stack VLA usage from the kernel[1], this moves the "$#" replacement from being an argument to being inside the function, which avoids generating VLAs. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2018-07-03	lib: rhashtable: Correct self-assignment in rhashtable.c	Rishabh Bhatnagar	1	-1/+1
	In file lib/rhashtable.c line 777, skip variable is assigned to itself. The following error was observed: lib/rhashtable.c:777:41: warning: explicitly assigning value of variable of type 'int' to itself [-Wself-assign] error, forbidden warning: rhashtable.c:777 This error was found when compiling with Clang 6.0. Change it to iter->skip. Signed-off-by: Rishabh Bhatnagar <[email protected]> Acked-by: Herbert Xu <[email protected]> Reviewed-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-03	locking: Implement an algorithm choice for Wound-Wait mutexes	Thomas Hellstrom	1	-1/+1
	The current Wound-Wait mutex algorithm is actually not Wound-Wait but Wait-Die. Implement also Wound-Wait as a per-ww-class choice. Wound-Wait is, contrary to Wait-Die a preemptive algorithm and is known to generate fewer backoffs. Testing reveals that this is true if the number of simultaneous contending transactions is small. As the number of simultaneous contending threads increases, Wait-Wound becomes inferior to Wait-Die in terms of elapsed time. Possibly due to the larger number of held locks of sleeping transactions. Update documentation and callers. Timings using git://people.freedesktop.org/~thomash/ww_mutex_test tag patch-18-06-15 Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly chosen out of 100000. Four core Intel x86_64: Algorithm #threads Rollbacks time Wound-Wait 4 ~100 ~17s. Wait-Die 4 ~150000 ~19s. Wound-Wait 16 ~360000 ~109s. Wait-Die 16 ~450000 ~82s. Cc: Ingo Molnar <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Gustavo Padovan <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Sean Paul <[email protected]> Cc: David Airlie <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Kate Stewart <[email protected]> Cc: Philippe Ombredanne <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Co-authored-by: Peter Zijlstra <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Ingo Molnar <[email protected]>
2018-07-03	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller	5	-14/+22
	Simple overlapping changes in stmmac driver. Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution. Signed-off-by: David S. Miller <[email protected]>
2018-07-02	scsi: klist: Make it safe to use klists in atomic context	Bart Van Assche	1	-4/+6
	In the scsi_transport_srp implementation it cannot be avoided to iterate over a klist from atomic context when using the legacy block layer instead of blk-mq. Hence this patch that makes it safe to use klists in atomic context. This patch avoids that lockdep reports the following: WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&k->k_lock)->rlock); local_irq_disable(); lock(&(&q->__queue_lock)->rlock); lock(&(&k->k_lock)->rlock); <Interrupt> lock(&(&q->__queue_lock)->rlock); stack backtrace: Workqueue: kblockd blk_timeout_work Call Trace: dump_stack+0xa4/0xf5 check_usage+0x6e6/0x700 __lock_acquire+0x185d/0x1b50 lock_acquire+0xd2/0x260 _raw_spin_lock+0x32/0x50 klist_next+0x47/0x190 device_for_each_child+0x8e/0x100 srp_timed_out+0xaf/0x1d0 [scsi_transport_srp] scsi_times_out+0xd4/0x410 [scsi_mod] blk_rq_timed_out+0x36/0x70 blk_timeout_work+0x1b5/0x220 process_one_work+0x4fe/0xad0 worker_thread+0x63/0x5a0 kthread+0x1c1/0x1e0 ret_from_fork+0x24/0x30 See also commit c9ddf73476ff ("scsi: scsi_transport_srp: Fix shost to rport translation"). Signed-off-by: Bart Van Assche <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: James Bottomley <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-07-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	1	-0/+20
	Pull networking fixes from David Miller: 1) Verify netlink attributes properly in nf_queue, from Eric Dumazet. 2) Need to bump memory lock rlimit for test_sockmap bpf test, from Yonghong Song. 3) Fix VLAN handling in lan78xx driver, from Dave Stevenson. 4) Fix uninitialized read in nf_log, from Jann Horn. 5) Fix raw command length parsing in mlx5, from Alex Vesker. 6) Cleanup loopback RDS connections upon netns deletion, from Sowmini Varadhan. 7) Fix regressions in FIB rule matching during create, from Jason A. Donenfeld and Roopa Prabhu. 8) Fix mpls ether type detection in nfp, from Pieter Jansen van Vuuren. 9) More bpfilter build fixes/adjustments from Masahiro Yamada. 10) Fix XDP_{TX,REDIRECT} flushing in various drivers, from Jesper Dangaard Brouer. 11) fib_tests.sh file permissions were broken, from Shuah Khan. 12) Make sure BH/preemption is disabled in data path of mac80211, from Denis Kenzior. 13) Don't ignore nla_parse_nested() return values in nl80211, from Johannes berg. 14) Properly account sock objects ot kmemcg, from Shakeel Butt. 15) Adjustments to setting bpf program permissions to read-only, from Daniel Borkmann. 16) TCP Fast Open key endianness was broken, it always took on the host endiannness. Whoops. Explicitly make it little endian. From Yuching Cheng. 17) Fix prefix route setting for link local addresses in ipv6, from David Ahern. 18) Potential Spectre v1 in zatm driver, from Gustavo A. R. Silva. 19) Various bpf sockmap fixes, from John Fastabend. 20) Use after free for GRO with ESP, from Sabrina Dubroca. 21) Passing bogus flags to crypto_alloc_shash() in ipv6 SR code, from Eric Biggers. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) qede: Adverstise software timestamp caps when PHC is not available. qed: Fix use of incorrect size in memcpy call. qed: Fix setting of incorrect eswitch mode. qed: Limit msix vectors in kdump kernel to the minimum required count. ipvlan: call dev_change_flags when ipvlan mode is reset ipv6: sr: fix passing wrong flags to crypto_alloc_shash() net: fix use-after-free in GRO with ESP tcp: prevent bogus FRTO undos with non-SACK flows bpf: sockhash, add release routine bpf: sockhash fix omitted bucket lock in sock_close bpf: sockmap, fix smap_list_map_remove when psock is in many maps bpf: sockmap, fix crash when ipv6 sock is added net: fib_rules: bring back rule_exists to match rule during add hv_netvsc: split sub-channel setup into async and sync net: use dev_change_tx_queue_len() for SIOCSIFTXQLEN atm: zatm: Fix potential Spectre v1 s390/qeth: consistently re-enable device features s390/qeth: don't clobber buffer on async TX completion s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6] s390/qeth: fix race when setting MAC address ...
2018-07-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller	1	-0/+20
	Daniel Borkmann says: ==================== pull-request: bpf 2018-07-01 The following pull-request contains BPF updates for your net tree. The main changes are: 1) A bpf_fib_lookup() helper fix to change the API before freeze to return an encoding of the FIB lookup result and return the nexthop device index in the params struct (instead of device index as return code that we had before), from David. 2) Various BPF JIT fixes to address syzkaller fallout, that is, do not reject progs when set_memory_*() fails since it could still be RO. Also arm32 JIT was not using bpf_jit_binary_lock_ro() API which was an issue, and a memory leak in s390 JIT found during review, from Daniel. 3) Multiple fixes for sockmap/hash to address most of the syzkaller triggered bugs. Usage with IPv6 was crashing, a GPF in bpf_tcp_close(), a missing sock_map_release() routine to hook up to callbacks, and a fix for an omitted bucket lock in sock_close(), from John. 4) Two bpftool fixes to remove duplicated error message on program load, and another one to close the libbpf object after program load. One additional fix for nfp driver's BPF offload to avoid stopping offload completely if replace of program failed, from Jakub. 5) Couple of BPF selftest fixes that bail out in some of the test scripts if the user does not have the right privileges, from Jeffrin. 6) Fixes in test_bpf for s390 when CONFIG_BPF_JIT_ALWAYS_ON is set where we need to set the flag that some of the test cases are expected to fail, from Kleber. 7) Fix to detangle BPF_LIRC_MODE2 dependency from CONFIG_CGROUP_BPF since it has no relation to it and lirc2 users often have configs without cgroups enabled and thus would not be able to use it, from Sean. 8) Fix a selftest failure in sockmap by removing a useless setrlimit() call that would set a too low limit where at the same time we are already including bpf_rlimit.h that does the job, from Yonghong. 9) Fix BPF selftest config with missing missing NET_SCHED, from Anders. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-06-30	Merge tag 'for-linus-20180629' of git://git.kernel.dk/linux-block	Linus Torvalds	1	-6/+0
	Pull block fixes from Jens Axboe: "Small set of fixes for this series. Mostly just minor fixes, the only oddball in here is the sg change. The sg change came out of the stall fix for NVMe, where we added a mempool and limited us to a single page allocation. CONFIG_SG_DEBUG sort-of ruins that, since we'd need to account for that. That's actually a generic problem, since lots of drivers need to allocate SG lists. So this just removes support for CONFIG_SG_DEBUG, which I added back in 2007 and to my knowledge it was never useful. Anyway, outside of that, this pull contains: - clone of request with special payload fix (Bart) - drbd discard handling fix (Bart) - SATA blk-mq stall fix (me) - chunk size fix (Keith) - double free nvme rdma fix (Sagi)" * tag 'for-linus-20180629' of git://git.kernel.dk/linux-block: sg: remove ->sg_magic member drbd: Fix drbd_request_prepare() discard handling blk-mq: don't queue more if we get a busy return block: Fix cloning of requests with a special payload nvme-rdma: fix possible double free of controller async event buffer block: Fix transfer when chunk sectors exceeds max
2018-06-29	sg: remove ->sg_magic member	Jens Axboe	1	-6/+0
	This was introduced more than a decade ago when sg chaining was added, but we never really caught anything with it. The scatterlist entry size can be critical, since drivers allocate it, so remove the magic member. Recently it's been triggering allocation stalls and failures in NVMe. Tested-by: Jordan Glover <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-28	test_bpf: flag tests that cannot be jited on s390	Kleber Sacilotto de Souza	1	-0/+20
	Flag with FLAG_EXPECTED_FAIL the BPF_MAXINSNS tests that cannot be jited on s390 because they exceed BPF_SIZE_MAX and fail when CONFIG_BPF_JIT_ALWAYS_ON is set. Also set .expected_errcode to -ENOTSUPP so the tests pass in that case. Signed-off-by: Kleber Sacilotto de Souza <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-06-28	Merge tag 'printk-for-4.18-rc3' of ↵	Linus Torvalds	1	-7/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk fix from Petr Mladek: "Revert a commit that went in by mistake. I already have a better fix in the queue for 4.19" * tag 'printk-for-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: Revert "lib/test_printf.c: call wait_for_random_bytes() before plain %p tests"
2018-06-28	kasan: depend on CONFIG_SLUB_DEBUG	Jason A. Donenfeld	1	-0/+1
	KASAN depends on having access to some of the accounting that SLUB_DEBUG does; without it, there are immediate crashes [1]. So, the natural thing to do is to make KASAN select SLUB_DEBUG. [1] http://lkml.kernel.org/r/CAHmME9rtoPwxUSnktxzKso14iuVCWT7BE_-_8PAC=pGw1iJnQg@mail.gmail.com Link: http://lkml.kernel.org/r/[email protected] Fixes: f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()") Signed-off-by: Jason A. Donenfeld <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Christoph Lameter <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: David Rientjes <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-28	lib/percpu_ida.c: don't do alloc from per-CPU list if there is none	Sebastian Andrzej Siewior	1	-1/+1
	In commit 804209d8a009 ("lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock") I inlined alloc_local_tag() and mixed up the >= check from percpu_ida_alloc() with the one in alloc_local_tag(). Don't alloc from per-CPU freelist if ->nr_free is zero. Link: http://lkml.kernel.org/r/[email protected] Fixes: 804209d8a009 ("lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Reported-by: David Disseldorp <[email protected]> Tested-by: David Disseldorp <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Nicholas Bellinger <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-28	netlink: Return extack message if attribute validation fails	David Ahern	1	-2/+2
	Have one extack message for parsing and validating. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-27	bitfield: add tests	Johannes Berg	3	-0/+176
	Add tests for the bitfield helpers. The constant ones will all be folded to nothing by the compiler (if everything is correct in the header file), and the variable ones do some tests against open-coding the necessary shifts. A few test cases that should fail/warn compilation are provided under ifdef. Suggested-by: Andy Shevchenko <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-06-27	printk: Make CONSOLE_LOGLEVEL_QUIET configurable	Hans de Goede	1	-0/+11
	The goal of passing the "quiet" option to the kernel is for the kernel to be quiet unless something really is wrong. Sofar passing quiet has been (mostly) equivalent to passing loglevel=4 on the kernel commandline. Which means to show any messages with a level of KERN_ERR or higher severity on the console. In practice this often does not result in a quiet boot though, since there are many false-positive or otherwise harmless error messages printed, defeating the purpose of the quiet option. Esp. the ACPICA code is really bad wrt this, but there are plenty of others too. This commit makes CONSOLE_LOGLEVEL_QUIET configurable. This for example will allow distros which want quiet to really mean quiet to set CONSOLE_LOGLEVEL_QUIET so that only messages with a higher severity then KERN_ERR (CRIT, ALERT, EMERG) get printed, avoiding an endless game of whack-a-mole silencing harmless error messages. Link: http://lkml.kernel.org/r/[email protected] To: Petr Mladek <[email protected]> To: Sergey Senozhatsky <[email protected]> Cc: Hans de Goede <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: [email protected] Signed-off-by: Hans de Goede <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Petr Mladek <[email protected]>
2018-06-26	Merge branch 'linus' into perf/core, to pick up fixes	Ingo Molnar	3	-5/+45
	Signed-off-by: Ingo Molnar <[email protected]>
2018-06-26	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller	3	-5/+45

2018-06-25	Revert "lib/test_printf.c: call wait_for_random_bytes() before plain %p tests"	Petr Mladek	1	-7/+0
	This reverts commit ee410f15b1418f2f4428e79980674c979081bcb7. It might prevent the machine from boot. It would wait for enough randomness at the very beginning of kernel_init(). But there is basically nothing running in parallel that would help to produce any randomness. Reported-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Petr Mladek <[email protected]>
2018-06-24	Merge branch 'locking-urgent-for-linus' of ↵	Linus Torvalds	3	-5/+45
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of fixes and updates for the locking code: - Prevent lockdep from updating irq state within its own code and thereby confusing itself. - Buid fix for older GCCs which mistreat anonymous unions - Add a missing lockdep annotation in down_read_non_onwer() which causes up_read_non_owner() to emit a lockdep splat - Remove the custom alpha dec_and_lock() implementation which is incorrect in terms of ordering and use the generic one. The remaining two commits are not strictly fixes. They provide irqsave variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These are required to merge the relevant updates and cleanups into different maintainer trees for 4.19, so routing them into mainline without actual users is the sanest approach. They should have been in -rc1, but last weekend I took the liberty to just avoid computers in order to regain some mental sanity" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/qspinlock: Fix build for anonymous union in older GCC compilers locking/lockdep: Do not record IRQ state within lockdep code locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS locking/refcounts: Implement refcount_dec_and_lock_irqsave() atomic: Add irqsave variant of atomic_dec_and_lock() alpha: Remove custom dec_and_lock() implementation
2018-06-22	rhashtable: clean up dereference of ->future_tbl.	NeilBrown	1	-5/+4
	Using rht_dereference_bucket() to dereference ->future_tbl looks like a type error, and could be confusing. Using rht_dereference_rcu() to test a pointer for NULL adds an unnecessary barrier - rcu_access_pointer() is preferred for NULL tests when no lock is held. This uses 3 different ways to access ->future_tbl. - if we know the mutex is held, use rht_dereference() - if we don't hold the mutex, and are only testing for NULL, use rcu_access_pointer() - otherwise (using RCU protection for true dereference), use rht_dereference_rcu(). Note that this includes a simplification of the call to rhashtable_last_table() - we don't do an extra dereference before the call any more. Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	rhashtable: use cmpxchg() to protect ->future_tbl.	NeilBrown	1	-11/+4
	Rather than borrowing one of the bucket locks to protect ->future_tbl updates, use cmpxchg(). This gives more freedom to change how bucket locking is implemented. Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	rhashtable: simplify nested_table_alloc() and rht_bucket_nested_insert()	NeilBrown	1	-12/+6
	Now that we don't use the hash value or shift in nested_table_alloc() there is room for simplification. We only need to pass a "is this a leaf" flag to nested_table_alloc(), and don't need to track as much information in rht_bucket_nested_insert(). Note there is another minor cleanup in nested_table_alloc() here. The number of elements in a page of "union nested_tables" is most naturally PAGE_SIZE / sizeof(ntbl[0]) The previous code had PAGE_SIZE / sizeof(ntbl[0].bucket) which happens to be the correct value only because the bucket uses all the space in the union. Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	rhashtable: simplify INIT_RHT_NULLS_HEAD()	NeilBrown	1	-9/+6
	The 'ht' and 'hash' arguments to INIT_RHT_NULLS_HEAD() are no longer used - so drop them. This allows us to also remove the nhash argument from nested_table_alloc(). Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	rhashtable: remove nulls_base and related code.	NeilBrown	2	-12/+1
	This "feature" is unused, undocumented, and untested and so doesn't really belong. A patch is under development to properly implement support for detecting when a search gets diverted down a different chain, which the common purpose of nulls markers. This patch actually fixes a bug too. The table resizing allows a table to grow to 2^31 buckets, but the hash is truncated to 27 bits - any growth beyond 2^27 is wasteful an ineffective. This patch results in NULLS_MARKER(0) being used for all chains, and leaves the use of rht_is_a_null() to test for it. Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	rhashtable: split rhashtable.h	NeilBrown	1	-0/+1
	Due to the use of rhashtables in net namespaces, rhashtable.h is included in lots of the kernel, so a small changes can required a large recompilation. This makes development painful. This patch splits out rhashtable-types.h which just includes the major type declarations, and does not include (non-trivial) inline code. rhashtable.h is no longer included by anything in the include/ directory. Common include files only include rhashtable-types.h so a large recompilation is only triggered when that changes. Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	rhashtable: silence RCU warning in rhashtable_test.	NeilBrown	1	-0/+3
	print_ht in rhashtable_test calls rht_dereference() with neither RCU protection or the mutex. This triggers an RCU warning. So take the mutex to silence the warning. Acked-by: Herbert Xu <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22	lib/bch: Remove VLA usage	Kees Cook	2	-8/+16
	In the quest to remove all stack VLA usage from the kernel[1], this allocates a fixed size stack array to cover the range needed for bch. This was done instead of a preallocation on the SLAB due to performance reasons, shown by Ivan Djelic: little-endian, type sizes: int=4 long=8 longlong=8 cpu: Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz calibration: iter=4.9143µs niter=2034 nsamples=200 m=13 t=4 Buffer allocation \| Encoding throughput (Mbit/s) --------------------------------------------------- on-stack, VLA \| 3988 on-stack, fixed \| 4494 kmalloc \| 1967 So this change actually improves performance too, it seems. The resulting stack allocation can get rather large; without CONFIG_BCH_CONST_PARAMS, it will allocate 4096 bytes, which trips the stack size checking: lib/bch.c: In function ‘encode_bch’: lib/bch.c:261:1: warning: the frame size of 4432 bytes is larger than 2048 bytes [-Wframe-larger-than=] Even the default case for "allmodconfig" (with CONFIG_BCH_CONST_M=14 and CONFIG_BCH_CONST_T=4) would have started throwing a warning: lib/bch.c: In function ‘encode_bch’: lib/bch.c:261:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=] But this is how large it's always been; it was just hidden from the checker because it was a VLA. So the Makefile has been adjusted to silence this warning for anything smaller than 4500 bytes, which should provide room for normal cases, but still low enough to catch any future pathological situations. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Ivan Djelic <[email protected]> Tested-by: Ivan Djelic <[email protected]> Acked-by: Boris Brezillon <[email protected]> Signed-off-by: Boris Brezillon <[email protected]>
2018-06-21	locking/refcounts: Include fewer headers in <linux/refcount.h>	Alexey Dobriyan	1	-0/+2
	Debloat <linux/refcount.h>'s dependencies: - <linux/kernel.h> is not needed, but <linux/compiler.h> is. - <linux/mutex.h> is not needed, only a forward declaration of "struct mutex". - <linux/spinlock.h> is not needed, <linux/spinlock_types.h> is enough. Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/lkml/20180331220036.GA7676@avx2 Signed-off-by: Ingo Molnar <[email protected]>