aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-10-27mm: remove per-zone hashtable of bitlock waitqueuesLinus Torvalds6-182/+21
The per-zone waitqueues exist because of a scalability issue with the page waitqueues on some NUMA machines, but it turns out that they hurt normal loads, and now with the vmalloced stacks they also end up breaking gfs2 that uses a bit_wait on a stack object: wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE) where 'gh' can be a reference to the local variable 'mount_gh' on the stack of fill_super(). The reason the per-zone hash table breaks for this case is that there is no "zone" for virtual allocations, and trying to look up the physical page to get at it will fail (with a BUG_ON()). It turns out that I actually complained to the mm people about the per-zone hash table for another reason just a month ago: the zone lookup also hurts the regular use of "unlock_page()" a lot, because the zone lookup ends up forcing several unnecessary cache misses and generates horrible code. As part of that earlier discussion, we had a much better solution for the NUMA scalability issue - by just making the page lock have a separate contention bit, the waitqueue doesn't even have to be looked at for the normal case. Peter Zijlstra already has a patch for that, but let's see if anybody even notices. In the meantime, let's fix the actual gfs2 breakage by simplifying the bitlock waitqueues and removing the per-zone issue. Reported-by: Andreas Gruenbacher <[email protected]> Tested-by: Bob Peterson <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Steven Whitehouse <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27blk-mq: update hardware and software queues for sleeping allocJens Axboe1-3/+3
If we end up sleeping due to running out of requests, we should update the hardware and software queues in the map ctx structure. Otherwise we could end up having rq->mq_ctx point to the pre-sleep context, and risk corrupting ctx->rq_list since we'll be grabbing the wrong lock when inserting the request. Reported-by: Dave Jones <[email protected]> Reported-by: Chris Mason <[email protected]> Tested-by: Chris Mason <[email protected]> Fixes: 63581af3f31e ("blk-mq: remove non-blocking pass in blk_mq_map_request") Signed-off-by: Jens Axboe <[email protected]>
2016-10-27ALSA: usb-audio: Add quirk for Syntek STK1160Marcel Hasler1-0/+17
The stk1160 chip needs QUIRK_AUDIO_ALIGN_TRANSFER. This patch resolves the issue reported on the mailing list (http://marc.info/?l=linux-sound&m=139223599126215&w=2) and also fixes bug 180071 (https://bugzilla.kernel.org/show_bug.cgi?id=180071). Signed-off-by: Marcel Hasler <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2016-10-27sched/fair: Remove unused but set variable 'rq'Tobias Klauser1-3/+0
Since commit: 8663e24d56dc ("sched/fair: Reorder cgroup creation code") ... the variable 'rq' in alloc_fair_sched_group() is set but no longer used. Remove it to fix the following GCC warning when building with 'W=1': kernel/sched/fair.c:8842:13: warning: variable ‘rq’ set but not used [-Wunused-but-set-variable] Signed-off-by: Tobias Klauser <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-10-27objtool: Fix rare switch jump table pattern detectionJosh Poimboeuf1-1/+1
The following commit: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection") ... improved objtool's ability to detect GCC switch statement jump tables for GCC 6. However the check to allow short jumps with the scanned range of instructions wasn't quite right. The pattern detection should allow jumps to the indirect jump instruction itself. This fixes the following warning: drivers/infiniband/sw/rxe/rxe_comp.o: warning: objtool: rxe_completer()+0x315: sibling call from callable instruction with changed frame pointer Reported-by: Arnd Bergmann <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection") Link: http://lkml.kernel.org/r/20161026153408.2rifnw7bvoc5sex7@treble Signed-off-by: Ingo Molnar <[email protected]>
2016-10-27security/keys: make BIG_KEYS dependent on stdrng.Artem Savkov1-1/+1
Since BIG_KEYS can't be compiled as module it requires one of the "stdrng" providers to be compiled into kernel. Otherwise big_key_crypto_init() fails on crypto_alloc_rng step and next dereference of big_key_skcipher (e.g. in big_key_preparse()) results in a NULL pointer dereference. Fixes: 13100a72f40f5748a04017e0ab3df4cf27c809ef ('Security: Keys: Big keys stored encrypted') Signed-off-by: Artem Savkov <[email protected]> Signed-off-by: David Howells <[email protected]> cc: Stephan Mueller <[email protected]> cc: Kirill Marinushkin <[email protected]> cc: [email protected] Signed-off-by: James Morris <[email protected]>
2016-10-27KEYS: Sort out big_key initialisationDavid Howells1-27/+32
big_key has two separate initialisation functions, one that registers the key type and one that registers the crypto. If the key type fails to register, there's no problem if the crypto registers successfully because there's no way to reach the crypto except through the key type. However, if the key type registers successfully but the crypto does not, big_key_rng and big_key_blkcipher may end up set to NULL - but the code neither checks for this nor unregisters the big key key type. Furthermore, since the key type is registered before the crypto, it is theoretically possible for the kernel to try adding a big_key before the crypto is set up, leading to the same effect. Fix this by merging big_key_crypto_init() and big_key_init() and calling the resulting function late. If they're going to be encrypted, we shouldn't be creating big_keys before we have the facilities to do the encryption available. The key type registration is also moved after the crypto initialisation. The fix also includes message printing on failure. If the big_key type isn't correctly set up, simply doing: dd if=/dev/zero bs=4096 count=1 | keyctl padd big_key a @s ought to cause an oops. Fixes: 13100a72f40f5748a04017e0ab3df4cf27c809ef ('Security: Keys: Big keys stored encrypted') Signed-off-by: David Howells <[email protected]> cc: Peter Hlavaty <[email protected]> cc: Kirill Marinushkin <[email protected]> cc: Artem Savkov <[email protected]> cc: [email protected] Signed-off-by: James Morris <[email protected]>
2016-10-27KEYS: Fix short sprintf buffer in /proc/keys show functionDavid Howells1-1/+1
This fixes CVE-2016-7042. Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector is turned on, this can cause a panic due to stack corruption. The problem is that xbuf[] is not big enough to hold a 64-bit timeout rendered as weeks: (gdb) p 0xffffffffffffffffULL/(60*60*24*7) $2 = 30500568904943 That's 14 chars plus NUL, not 11 chars plus NUL. Expand the buffer to 16 chars. I think the unpatched code apparently works if the stack-protector is not enabled because on a 32-bit machine the buffer won't be overflowed and on a 64-bit machine there's a 64-bit aligned pointer at one side and an int that isn't checked again on the other side. The panic incurred looks something like: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679 Call Trace: [<ffffffff813d941f>] dump_stack+0x63/0x84 [<ffffffff811b2cb6>] panic+0xde/0x22a [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0 [<ffffffff81350410>] ? key_validate+0x50/0x50 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20 [<ffffffff8126b31c>] seq_read+0x2cc/0x390 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70 [<ffffffff81244fc7>] __vfs_read+0x37/0x150 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0 [<ffffffff81246156>] vfs_read+0x96/0x130 [<ffffffff81247635>] SyS_read+0x55/0xc0 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4 Reported-by: Ondrej Kozina <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Ondrej Kozina <[email protected]> cc: [email protected] Signed-off-by: James Morris <[email protected]>
2016-10-26arm64: mm: fix __page_to_voff definitionNeeraj Upadhyay1-1/+1
Fix parameter name for __page_to_voff, to match its definition. At present, we don't see any issue, as page_to_virt's caller declares 'page'. Fixes: 9f2875912dac ("arm64: mm: restrict virt_to_page() to the linear mapping") Acked-by: Mark Rutland <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2016-10-26arm64/numa: fix incorrect log for memory-less nodeHanjun Guo1-2/+5
When booting on NUMA system with memory-less node (no memory dimm on this memory controller), the print for setup_node_data() is incorrect: NUMA: Initmem setup node 2 [mem 0x00000000-0xffffffffffffffff] It can be fixed by printing [mem 0x00000000-0x00000000] when end_pfn is 0, but print <memory-less node> will be more useful. Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.") Signed-off-by: Hanjun Guo <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yisheng Xie <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2016-10-26arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximityYisheng Xie1-1/+1
The pcpu_build_alloc_info() function group CPUs according to their proximity, by call callback function @cpu_distance_fn from different ARCHs. For arm64 the callback of @cpu_distance_fn is pcpu_cpu_distance(from, to) -> node_distance(from, to) The @from and @to for function node_distance() should be nid. However, pcpu_cpu_distance() in arch/arm64/mm/numa.c just past the cpu id for @from and @to, and didn't convert to numa node id. For this incorrect cpu proximity get from ARCH, it may cause each CPU in one group and make group_cnt out of bound: setup_per_cpu_areas() pcpu_embed_first_chunk() pcpu_build_alloc_info() in pcpu_build_alloc_info, since cpu_distance_fn will return REMOTE_DISTANCE if we pass cpu ids (0,1,2...), so cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE will wrongly be ture. This may results in triggering the BUG_ON(unit != nr_units) later: [ 0.000000] kernel BUG at mm/percpu.c:1916! [ 0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00003-g14155ca-dirty #26 [ 0.000000] Hardware name: Hisilicon Hi1616 Evaluation Board (DT) [ 0.000000] task: ffff000008d6e900 task.stack: ffff000008d60000 [ 0.000000] PC is at pcpu_embed_first_chunk+0x420/0x704 [ 0.000000] LR is at pcpu_embed_first_chunk+0x3bc/0x704 [ 0.000000] pc : [<ffff000008c754f4>] lr : [<ffff000008c75490>] pstate: 800000c5 [ 0.000000] sp : ffff000008d63eb0 [ 0.000000] x29: ffff000008d63eb0 [ 0.000000] x28: 0000000000000000 [ 0.000000] x27: 0000000000000040 [ 0.000000] x26: ffff8413fbfcef00 [ 0.000000] x25: 0000000000000042 [ 0.000000] x24: 0000000000000042 [ 0.000000] x23: 0000000000001000 [ 0.000000] x22: 0000000000000046 [ 0.000000] x21: 0000000000000001 [ 0.000000] x20: ffff000008cb3bc8 [ 0.000000] x19: ffff8413fbfcf570 [ 0.000000] x18: 0000000000000000 [ 0.000000] x17: ffff000008e49ae0 [ 0.000000] x16: 0000000000000003 [ 0.000000] x15: 000000000000001e [ 0.000000] x14: 0000000000000004 [ 0.000000] x13: 0000000000000000 [ 0.000000] x12: 000000000000006f [ 0.000000] x11: 00000413fbffff00 [ 0.000000] x10: 0000000000000004 [ 0.000000] x9 : 0000000000000000 [ 0.000000] x8 : 0000000000000001 [ 0.000000] x7 : ffff8413fbfcf63c [ 0.000000] x6 : ffff000008d65d28 [ 0.000000] x5 : ffff000008d65e50 [ 0.000000] x4 : 0000000000000000 [ 0.000000] x3 : ffff000008cb3cc8 [ 0.000000] x2 : 0000000000000040 [ 0.000000] x1 : 0000000000000040 [ 0.000000] x0 : 0000000000000000 [...] [ 0.000000] Call trace: [ 0.000000] Exception stack(0xffff000008d63ce0 to 0xffff000008d63e10) [ 0.000000] 3ce0: ffff8413fbfcf570 0001000000000000 ffff000008d63eb0 ffff000008c754f4 [ 0.000000] 3d00: ffff000008d63d50 ffff0000081af210 00000413fbfff010 0000000000001000 [ 0.000000] 3d20: ffff000008d63d50 ffff0000081af220 00000413fbfff010 0000000000001000 [ 0.000000] 3d40: 00000413fbfcef00 0000000000000004 ffff000008d63db0 ffff0000081af390 [ 0.000000] 3d60: 00000413fbfcef00 0000000000001000 0000000000000000 0000000000001000 [ 0.000000] 3d80: 0000000000000000 0000000000000040 0000000000000040 ffff000008cb3cc8 [ 0.000000] 3da0: 0000000000000000 ffff000008d65e50 ffff000008d65d28 ffff8413fbfcf63c [ 0.000000] 3dc0: 0000000000000001 0000000000000000 0000000000000004 00000413fbffff00 [ 0.000000] 3de0: 000000000000006f 0000000000000000 0000000000000004 000000000000001e [ 0.000000] 3e00: 0000000000000003 ffff000008e49ae0 [ 0.000000] [<ffff000008c754f4>] pcpu_embed_first_chunk+0x420/0x704 [ 0.000000] [<ffff000008c6658c>] setup_per_cpu_areas+0x38/0xc8 [ 0.000000] [<ffff000008c608d8>] start_kernel+0x10c/0x390 [ 0.000000] [<ffff000008c601d8>] __primary_switched+0x5c/0x64 [ 0.000000] Code: b8018660 17ffffd7 6b16037f 54000080 (d4210000) [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! Fix by getting cpu's node id with early_cpu_to_node() then pass it to node_distance() as the original intention. Fixes: 7af3a0a99252 ("arm64/numa: support HAVE_SETUP_PER_CPU_AREA") Signed-off-by: Yisheng Xie <[email protected]> Signed-off-by: Hanjun Guo <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zhen Lei <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2016-10-26block: flush: fix IO hang in case of flood fua reqMing Lei1-0/+28
This patch fixes one issue reported by Kent, which can be triggered in bcachefs over sata disk. Actually it is a generic issue in block flush vs. blk-tag. Cc: Christoph Hellwig <[email protected]> Reported-by: Kent Overstreet <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-10-26x86: Fix export for mcount and __fentry__Steven Rostedt1-1/+2
Commit 784d5699eddc5 ("x86: move exports to actual definitions") removed the EXPORT_SYMBOL(__fentry__) and EXPORT_SYMBOL(mcount) from x8664_ksyms_64.c, and added EXPORT_SYMBOL(function_hook) in mcount_64.S instead. The problem is that function_hook isn't a function at all, but a macro that is defined as either mcount or __fentry__ depending on the support from gcc. Originally, I thought this was a macro issue, like what __stringify() is used for. But the problem is a bit deeper. The Makefile.build has some magic that does post processing of files to create the CRC bindings. It does some searches for EXPORT_SYMBOL() and because it finds a macro name and not the actual functions, this causes function_hook not to be converted into mcount or __fentry__ and they are missed. Instead of adding more magic to Makefile.build, just add EXPORT_SYMBOL() for mcount and __fentry__ where the ifdef is used. Since this is assembly and not C, it doesn't require being set after the function is defined. Signed-off-by: Steven Rostedt <[email protected]> Tested-by: Borislav Petkov <[email protected]> Cc: Gabriel C <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Al Viro <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-10-26doc: Add missing parameter for msi_setupStephen Hemminger1-0/+2
commit 92ca8d20dee2 ("genirq/msi: Switch to new irq spreading") introduced new parameter to msi_init_setup and but did not update docbook comments. Fixes 'make htmldocs' warning. Signed-off-by: Stephen Hemminger <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-10-26drm/drivers: add support for using the arch wc mapping API.Dave Airlie6-0/+38
This fixes a regression in all these drivers since the cache mode tracking was fixed for mixed mappings. It uses the new arch API to add the VRAM range to the PAT mapping tracking tables. Fixes: 87744ab3832 (mm: fix cache mode tracking in vm_insert_mixed()) Reviewed-by: Christian König <[email protected]>. Signed-off-by: Dave Airlie <[email protected]>
2016-10-26x86/io: add interface to reserve io memtype for a resource range. (v1.1)Dave Airlie3-0/+42
A recent change to the mm code in: 87744ab3832b mm: fix cache mode tracking in vm_insert_mixed() started enforcing checking the memory type against the registered list for amixed pfn insertion mappings. It happens that the drm drivers for a number of gpus relied on this being broken. Currently the driver only inserted VRAM mappings into the tracking table when they came from the kernel, and userspace mappings never landed in the table. This led to a regression where all the mapping end up as UC instead of WC now. I've considered a number of solutions but since this needs to be fixed in fixes and not next, and some of the solutions were going to introduce overhead that hadn't been there before I didn't consider them viable at this stage. These mainly concerned hooking into the TTM io reserve APIs, but these API have a bunch of fast paths I didn't want to unwind to add this to. The solution I've decided on is to add a new API like the arch_phys_wc APIs (these would have worked but wc_del didn't take a range), and use them from the drivers to add a WC compatible mapping to the table for all VRAM on those GPUs. This means we can then create userspace mapping that won't get degraded to UC. v1.1: use CONFIG_X86_PAT + add some comments in io.h Cc: Toshi Kani <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Brian Gerst <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Dan Williams <[email protected]> Acked-by: Ingo Molnar <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2016-10-26MAINTAINERS: Begin module maintainer transitionRusty Russell1-0/+1
Being a Linux kernel maintainer has been my proudest professional accomplishment, spanning the last 19 years. But now we have a surfeit of excellent hackers, and I can hand this over without regret. I'll still be around as co-maintainer for another cycle, but Jessica is now the one to convince if you want your patches applied. She rocks, and is far more timely than me too! Signed-off-by: Rusty Russell <[email protected]> Acked-by: Jessica Yu <[email protected]>
2016-10-25ahci: fix the single MSI-X case in ahci_init_oneChristoph Hellwig1-1/+1
We need to make sure hpriv->irq is set properly if we don't use per-port vectors, so switch from blindly assigning pdev->irq to using pci_irq_vector, which handles all interrupt types correctly. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Robert Richter <[email protected]> Tested-by: Robert Richter <[email protected]> Tested-by: David Daney <[email protected]> Fixes: 0b9e2988ab22 ("ahci: use pci_alloc_irq_vectors") Signed-off-by: Tejun Heo <[email protected]>
2016-10-25timers: Prevent base clock corruption when forwardingThomas Gleixner1-13/+10
When a timer is enqueued we try to forward the timer base clock. This mechanism has two issues: 1) Forwarding a remote base unlocked The forwarding function is called from get_target_base() with the current timer base lock held. But if the new target base is a different base than the current base (can happen with NOHZ, sigh!) then the forwarding is done on an unlocked base. This can lead to corruption of base->clk. Solution is simple: Invoke the forwarding after the target base is locked. 2) Possible corruption due to jiffies advancing This is similar to the issue in get_net_timer_interrupt() which was fixed in the previous patch. jiffies can advance between check and assignement and therefore advancing base->clk beyond the next expiry value. So we need to read jiffies into a local variable once and do the checks and assignment with the local copy. Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes <[email protected]> Reported-by: Michael Thayer <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Michal Necasek <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-10-25timers: Prevent base clock rewind when forwarding clockThomas Gleixner1-5/+9
Ashton and Michael reported, that kernel versions 4.8 and later suffer from USB timeouts which are caused by the timer wheel rework. This is caused by a bug in the base clock forwarding mechanism, which leads to timers expiring early. The scenario which leads to this is: run_timers() while (jiffies >= base->clk) { collect_expired_timers(); base->clk++; expire_timers(); } So base->clk = jiffies + 1. Now the cpu goes idle: idle() get_next_timer_interrupt() nextevt = __next_time_interrupt(); if (time_after(nextevt, base->clk)) base->clk = jiffies; jiffies has not advanced since run_timers(), so this assignment effectively decrements base->clk by one. base->clk is the index into the timer wheel arrays. So let's assume the following state after the base->clk increment in run_timers(): jiffies = 0 base->clk = 1 A timer gets enqueued with an expiry delta of 63 ticks (which is the case with the USB timeout and HZ=250) so the resulting bucket index is: base->clk + delta = 1 + 63 = 64 The timer goes into the first wheel level. The array size is 64 so it ends up in bucket 0, which is correct as it takes 63 ticks to advance base->clk to index into bucket 0 again. If the cpu goes idle before jiffies advance, then the bug in the forwarding mechanism sets base->clk back to 0, so the next invocation of run_timers() at the next tick will index into bucket 0 and therefore expire the timer 62 ticks too early. Instead of blindly setting base->clk to jiffies we must make the forwarding conditional on jiffies > base->clk, but we cannot use jiffies for this as we might run into the following issue: if (time_after(jiffies, base->clk) { if (time_after(nextevt, base->clk)) base->clk = jiffies; jiffies can increment between the check and the assigment far enough to advance beyond nextevt. So we need to use a stable value for checking. get_next_timer_interrupt() has the basej argument which is the jiffies value snapshot taken in the calling code. So we can just that. Thanks to Ashton for bisecting and providing trace data! Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes <[email protected]> Reported-by: Michael Thayer <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Michal Necasek <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-10-25timers: Lock base for same bucket optimizationThomas Gleixner1-11/+17
Linus stumbled over the unlocked modification of the timer expiry value in mod_timer() which is an optimization for timers which stay in the same bucket - due to the bucket granularity - despite their expiry time getting updated. The optimization itself still makes sense even if we take the lock, because in case that the bucket stays the same, we avoid the pointless queue/enqueue dance. Make the check and the modification of timer->expires protected by the base lock and shuffle the remaining code around so we can keep the lock held when we actually have to requeue the timer to a different bucket. Fixes: f00c0afdfa62 ("timers: Implement optimization for same expiry time in mod_timer()") Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: [email protected] Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]>
2016-10-25timers: Plug locking race vs. timer migrationThomas Gleixner1-1/+8
Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the timer flags. As a consequence the compiler is allowed to reload the flags between the initial check for TIMER_MIGRATION and the following timer base computation and the spin lock of the base. While this has not been observed (yet), we need to make sure that it never happens. Fixes: 0eeda71bc30d ("timer: Replace timer base by a cpu index") Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: [email protected] Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]>
2016-10-25ALSA: seq: Fix time account regressionTakashi Iwai1-2/+2
The recent rewrite of the sequencer time accounting using timespec64 in the commit [3915bf294652: ALSA: seq_timer: use monotonic times internally] introduced a bad regression. Namely, the time reported back doesn't increase but goes back and forth. The culprit was obvious: the delta is stored to the result (cur_time = delta), instead of adding the delta (cur_time += delta)! Let's fix it. Fixes: 3915bf294652 ('ALSA: seq_timer: use monotonic times internally') Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=177571 Reported-by: Yves Guillemot <[email protected]> Cc: <[email protected]> # v4.8+ Signed-off-by: Takashi Iwai <[email protected]>
2016-10-25i2c: imx: defer probe if bus recovery GPIOs are not readyStefan Agner1-4/+7
Some SoC might load the GPIO driver after the I2C driver and using the I2C bus recovery mechanism via GPIOs. In this case it is crucial to defer probing if the GPIO request functions do so, otherwise the I2C driver gets loaded without recovery mechanisms enabled. Signed-off-by: Stefan Agner <[email protected]> Acked-by: Uwe Kleine-König <[email protected]> Acked-by: Li Yang <[email protected]> Signed-off-by: Wolfram Sang <[email protected]> Cc: [email protected]
2016-10-25i2c: designware: Avoid aborted transfers with fast reacting I2C slavesJarkko Nikula1-3/+14
I2C DesignWare may abort transfer with arbitration lost if I2C slave pulls SDA down quickly after falling edge of SCL. Reason for this is unknown but after trial and error it was found this can be avoided by enabling non-zero SDA RX hold time for the receiver. By the specification SDA RX hold time extends incoming SDA low to high transition by n * ic_clk cycles but only when SCL is high. However it seems to help avoid above faulty arbitration lost error. Bits 23:16 in IC_SDA_HOLD register define the SDA RX hold time for the receiver. Be conservative and enable 1 ic_clk cycle long hold time in case boot firmware hasn't set it up. Reported-by: Jukka Laitinen <[email protected]> Signed-off-by: Jarkko Nikula <[email protected]> Tested-by: Jukka Laitinen <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: i801: Fix I2C Block Read on 8-Series/C220 and laterJean Delvare1-3/+13
Starting with the 8-Series/C220 PCH (Lynx Point), the SMBus controller includes a SPD EEPROM protection mechanism. Once the SPD Write Disable bit is set, only reads are allowed to slave addresses 0x50-0x57. However the legacy implementation of I2C Block Read since the ICH5 looks like a write, and is therefore blocked by the SPD protection mechanism. This causes the eeprom and at24 drivers to fail. So assume that I2C Block Read is implemented as an actual read on these chipsets. I tested it on my Q87 chipset and it seems to work just fine. Signed-off-by: Jean Delvare <[email protected]> Tested-by: Jarkko Nikula <[email protected]> [wsa: rebased to v4.9-rc2] Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: xgene: Avoid dma_buffer overrunHoan Tran1-1/+1
SMBus block command uses the first byte of buffer for the data length. The dma_buffer should be increased by 1 to avoid the overrun issue. Reported-by: Phil Endecott <[email protected]> Signed-off-by: Hoan Tran <[email protected]> Signed-off-by: Wolfram Sang <[email protected]> Cc: [email protected]
2016-10-25i2c: digicolor: Fix module autoloadJavier Martinez Canillas1-0/+1
If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas <[email protected]> Acked-by: Baruch Siach <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: xlr: Fix module autoload for OF registrationJavier Martinez Canillas1-0/+1
If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: xlp9xx: Fix module autoloadJavier Martinez Canillas1-0/+1
If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: jz4780: Fix module autoloadJavier Martinez Canillas1-0/+1
If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: allow configuration of imx driver for ColdFire architectureGreg Ungerer1-2/+2
The i2c controller used by Freescales iMX processors is the same hardware module used on Freescales ColdFire family of processors. We can use the existing i2c-imx driver on ColdFire family members. Modify the configuration to allow it to be selected when compiling for ColdFire targets. Signed-off-by: Greg Ungerer <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25i2c: mark device nodes only in case of successful instantiationRalf Ramsauer1-1/+10
Instantiated I2C device nodes are marked with OF_POPULATE. This was introduced in 4f001fd30145a6. On unloading, loaded device nodes will of course be unmarked. The problem are nodes that fail during initialisation: If a node fails, it won't be unloaded and hence not be unmarked. If a I2C driver module is unloaded and reloaded, it will skip nodes that failed before. Skip device nodes that are already populated and mark them only in case of success. Fixes: 4f001fd30145a6 ("i2c: Mark instantiated device nodes with OF_POPULATE") Signed-off-by: Ralf Ramsauer <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Acked-by: Pantelis Antoniou <[email protected]> [wsa: use 14-digit commit sha] Signed-off-by: Wolfram Sang <[email protected]> Cc: [email protected]
2016-10-25x86/quirks: Hide maybe-uninitialized warningArnd Bergmann1-2/+1
gcc -Wmaybe-uninitialized detects that quirk_intel_brickland_xeon_ras_cap uses uninitialized data when CONFIG_PCI is not set: arch/x86/kernel/quirks.c: In function ‘quirk_intel_brickland_xeon_ras_cap’: arch/x86/kernel/quirks.c:641:13: error: ‘capid0’ is used uninitialized in this function [-Werror=uninitialized] However, the function is also not called in this configuration, so we can avoid the warning by moving the existing #ifdef to cover it as well. Signed-off-by: Arnd Bergmann <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-10-25x86/build: Fix build with older GCC versionsJan Beulich1-2/+2
Older GCC (observed with 4.1.x) doesn't support -Wno-override-init and also doesn't ignore unknown -Wno-* options. Signed-off-by: Jan Beulich <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Valdis Kletnieks <[email protected]> Cc: [email protected] Fixes: 5e44258d16 "x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables" Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-10-25x86/unwind: Fix empty stack dereference in guess unwinderJosh Poimboeuf1-1/+8
Vince Waver reported the following bug: WARNING: CPU: 0 PID: 21338 at arch/x86/mm/fault.c:435 vmalloc_fault+0x58/0x1f0 CPU: 0 PID: 21338 Comm: perf_fuzzer Not tainted 4.8.0+ #37 Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013 Call Trace: <NMI> ? dump_stack+0x46/0x59 ? __warn+0xd5/0xee ? vmalloc_fault+0x58/0x1f0 ? __do_page_fault+0x6d/0x48e ? perf_log_throttle+0xa4/0xf4 ? trace_page_fault+0x22/0x30 ? __unwind_start+0x28/0x42 ? perf_callchain_kernel+0x75/0xac ? get_perf_callchain+0x13a/0x1f0 ? perf_callchain+0x6a/0x6c ? perf_prepare_sample+0x71/0x2eb ? perf_event_output_forward+0x1a/0x54 ? __default_send_IPI_shortcut+0x10/0x2d ? __perf_event_overflow+0xfb/0x167 ? x86_pmu_handle_irq+0x113/0x150 ? native_read_msr+0x6/0x34 ? perf_event_nmi_handler+0x22/0x39 ? perf_ibs_nmi_handler+0x4a/0x51 ? perf_event_nmi_handler+0x22/0x39 ? nmi_handle+0x4d/0xf0 ? perf_ibs_handle_irq+0x3d1/0x3d1 ? default_do_nmi+0x3c/0xd5 ? do_nmi+0x92/0x102 ? end_repeat_nmi+0x1a/0x1e ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a <EOE> ^A4---[ end trace 632723104d47d31a ]--- BUG: stack guard page was hit at ffffc90008500000 (stack is ffffc900084fc000..ffffc900084fffff) kernel stack overflow (page fault): 0000 [#1] SMP ... The NMI hit in the entry code right after setting up the stack pointer from 'cpu_current_top_of_stack', so the kernel stack was empty. The 'guess' version of __unwind_start() attempted to dereference the "top of stack" pointer, which is not actually *on* the stack. Add a check in the guess unwinder to deal with an empty stack. (The frame pointer unwinder already has such a check.) Reported-by: Vince Weaver <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 7c7900f89770 ("x86/unwind: Add new unwind interface and implementations") Link: http://lkml.kernel.org/r/20161024133127.e5evgeebdbohnmpb@treble Signed-off-by: Ingo Molnar <[email protected]>
2016-10-25i2c: rk3x: Give the tuning value 0 during rk3x_i2c_v0_calc_timingsDavid Wu1-0/+2
We found a bug that i2c transfer sometimes failed on 3066a board with stabel-4.8, the con register would be updated by uninitialized tuning value, it made the i2c transfer failed. So give the tuning value to be zero during rk3x_i2c_v0_calc_timings. Signed-off-by: David Wu <[email protected]> Tested-by: Andy Yan <[email protected]> Reviewed-by: Douglas Anderson <[email protected]> Signed-off-by: Wolfram Sang <[email protected]> Cc: [email protected]
2016-10-25i2c: hix5hd2: allow build with ARCH_HISIRuqiang Ju1-4/+4
This driver should be buildable with ARCH_HISI, because some of other HiSilicon SoCs also use it. Signed-off-by: Ruqiang Ju <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2016-10-25ALSA: hda - Fix surround output pins for ASRock B150M moboTakashi Iwai1-0/+12
ASRock B150M Pro4/D3 mobo with ALC892 codec doesn't seem to provide proper pins for the surround outputs, hence we need to specify the pincfgs manually with a couple of other corrections. Reported-and-tested-by: Benjamin Valentin <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2016-10-24Merge branch 'linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a regression caused by the stack vmalloc change" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: core - Don't use a stack buffer in add_early_randomness()
2016-10-24Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds13-35/+41
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "This is the first batch of clk driver fixes for this release. We have a handful of fixes for the uniphier clk driver that was introduced recently, as well as Kconfig option hiding, module autoloading markings, and a few fixes for clk_hw based registration patches that went in this merge window" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: at91: Fix a return value in case of error clk: uniphier: rename MIO clock to SD clock for Pro5, PXs2, LD20 SoCs clk: uniphier: fix memory overrun bug clk: hi6220: use CLK_OF_DECLARE_DRIVER for sysctrl and mediactrl clock init clk: mvebu: armada-37xx-periph: Fix the clock gate flag clk: bcm2835: Clamp the PLL's requested rate to the hardware limits. clk: max77686: fix number of clocks setup for clk_hw based registration clk: mvebu: armada-37xx-periph: Fix the clock provider registration clk: core: add __init decoration for CLK_OF_DECLARE_DRIVER function clk: mediatek: Add hardware dependency clk: samsung: clk-exynos-audss: Fix module autoload clk: uniphier: fix type of variable passed to regmap_read() clk: uniphier: add system clock support for sLD3 SoC
2016-10-24Merge tag 'gpio-v4.9-2' of ↵Linus Torvalds9-12/+64
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Here is a set of GPIO fixes for the v4.9 kernel series: - Fix up off-by one and line offset validation, info leak to userspace, and reject invalid flags. Those are especially valuable hardening patches from Lars-Peter Clausen, all tagged for stable. - Fix module autoload for TS4800 and ATH79. - Correct the IRQ handler for MPC8xxx to use handle_level_irq() as it (a) reacts to edges not levels and (b) even implements .irq_ack(). We were missing IRQs here. - Fix the error path for acpi_dev_gpio_irq_get() - Fix a memory leak in the MXS driver. - Fix an annoying typo in the STMPE driver. - Put a dependency on sysfs to the mockup driver" * tag 'gpio-v4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: mpc8xxx: Correct irq handler function gpio: ath79: Fix module autoload gpio: ts4800: Fix module autoload gpio: GPIO_GET_LINEEVENT_IOCTL: Reject invalid line and event flags gpio: GPIO_GET_LINEHANDLE_IOCTL: Reject invalid line flags gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak gpio: GPIO_GET_LINEEVENT_IOCTL: Validate line offset gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak gpio: GPIO_GET_LINEHANDLE_IOCTL: Validate line offset gpio: GPIO_GET_CHIPINFO_IOCTL: Fix information leak gpio: GPIO_GET_CHIPINFO_IOCTL: Fix line offset validation gpio / ACPI: fix returned error from acpi_dev_gpio_irq_get() gpio: mockup: add sysfs dependency gpio: stmpe: || vs && typo gpio: mxs: Unmap region obtained by of_iomap gpio/board.txt: point to gpiod_set_value
2016-10-24Merge tag 'for-linus-4.9-rc2-tag' of ↵Linus Torvalds4-19/+36
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from David Vrabel: - advertise control feature flags in xenstore - fix x86 build when XEN_PVHVM is disabled * tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: check return value of xenbus_scanf() xenbus: prefer list_for_each() x86: xen: move cpu_up functions out of ifdef xenbus: advertise control feature flags
2016-10-24mm: unexport __get_user_pages()Lorenzo Stoakes4-13/+6
This patch unexports the low-level __get_user_pages() function. Recent refactoring of the get_user_pages* functions allow flags to be passed through get_user_pages() which eliminates the need for access to this function from its one user, kvm. We can see that the two calls to get_user_pages() which replace __get_user_pages() in kvm_main.c are equivalent by examining their call stacks: get_user_page_nowait(): get_user_pages(start, 1, flags, page, NULL) __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL, false, flags | FOLL_TOUCH) __get_user_pages(current, current->mm, start, 1, flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL) check_user_page_hwpoison(): get_user_pages(addr, 1, flags, NULL, NULL) __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL, false, flags | FOLL_TOUCH) __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL, NULL, NULL) Signed-off-by: Lorenzo Stoakes <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-24proc: don't use FOLL_FORCE for reading cmdline and environmentLinus Torvalds1-10/+8
Now that Lorenzo cleaned things up and made the FOLL_FORCE users explicit, it becomes obvious how some of them don't really need FOLL_FORCE at all. So remove FOLL_FORCE from the proc code that reads the command line and arguments from user space. The mem_rw() function actually does want FOLL_FORCE, because gdd (and possibly many other debuggers) use it as a much more convenient version of PTRACE_PEEKDATA, but we should consider making the FOLL_FORCE part conditional on actually being a ptracer. This does not actually do that, just moves adds a comment to that effect and moves the gup_flags settings next to each other. Signed-off-by: Linus Torvalds <[email protected]>
2016-10-24nbd: fix incorrect unlock of nbd->sock_lock in sock_shutdownJohn W. Linville1-1/+1
Commit 0eadf37afc250 ("nbd: allow block mq to deal with timeouts") changed normal usage of nbd->sock_lock to use spin_lock/spin_unlock rather than the *_irq variants, but it missed this unlock in an error path. Found by Coverity, CID 1373871. Signed-off-by: John W. Linville <[email protected]> Cc: Josef Bacik <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Markus Pargmann <[email protected]> Fixes: 0eadf37afc250 ("nbd: allow block mq to deal with timeouts") Signed-off-by: Jens Axboe <[email protected]>
2016-10-24orangefs: don't use d_timeMiklos Szeredi3-6/+14
Instead use d_fsdata which is the same size. Hoping to get rid of d_time, which is used by very few filesystems by this time. Signed-off-by: Miklos Szeredi <[email protected]> Reviewed-by: Martin Brandenburg <[email protected]> Signed-off-by: Mike Marshall <[email protected]>
2016-10-24orangefs: user file_inode() where it is dueAmir Goldstein1-7/+7
Replace wrong use of file->f_path.dentry->d_inode with file_inode(file). In case orangefs ever finds itself as an overelayfs layer, it would want to get its own inode and not overlayfs's inode. DISCLAIMER: I did not test this patch because I do not know how to setup an orangefs mount Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Mike Marshall <[email protected]>
2016-10-24dm table: fix missing dm_put_target_type() in dm_table_add_target()tang.junhui1-15/+9
dm_get_target_type() was previously called so any error returned from dm_table_add_target() must first call dm_put_target_type(). Otherwise the DM target module's reference count will leak and the associated kernel module will be unable to be removed. Also, leverage the fact that r is already -EINVAL and remove an extra newline. Fixes: 36a0456 ("dm table: add immutable feature") Fixes: cc6cbe1 ("dm table: add always writeable feature") Fixes: 3791e2f ("dm table: add singleton feature") Cc: [email protected] # 3.2+ Signed-off-by: tang.junhui <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-10-24xenbus: check return value of xenbus_scanf()Jan Beulich1-1/+3
Don't ignore errors here: Set backend state to unknown when unsuccessful. Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: David Vrabel <[email protected]>