aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-06-30Merge tag 'iwlwifi-next-for-kalle-2017-06-30' of ↵Kalle Valo65-571/+1299
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next Final set of patches for 4.13 * Some important fixes for 9000 HW; * FW API changes for the upcoming -30 ucode release; * A few new PCI IDs for 9000 series; * Reorganization of common files; * Some more fixes and improvements here and there * Initialization and other important fixes for 9000 series; * Support for version 30 of the FW API for 8000 and 9000 series;
2017-06-30objtool: Implement stack validation 2.0Josh Poimboeuf11-320/+1130
This is a major rewrite of objtool. Instead of only tracking frame pointer changes, it now tracks all stack-related operations, including all register saves/restores. In addition to making stack validation more robust, this also paves the way for undwarf generation. Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/678bd94c0566c6129bcc376cddb259c4c5633004.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30objtool, x86: Add several functions and files to the objtool whitelistJosh Poimboeuf15-6/+39
In preparation for an objtool rewrite which will have broader checks, whitelist functions and files which cause problems because they do unusual things with the stack. These whitelists serve as a TODO list for which functions and files don't yet have undwarf unwinder coverage. Eventually most of the whitelists can be removed in favor of manual CFI hint annotations or objtool improvements. Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30objtool: Move checking code to check.cJosh Poimboeuf4-1268/+1328
In preparation for the new 'objtool undwarf generate' command, which will rely on 'objtool check', move the checking code from builtin-check.c to check.c where it can be used by other commands. Signed-off-by: Josh Poimboeuf <[email protected]> Reviewed-by: Jiri Slaby <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/294c5c695fd73c1a5000bbe5960a7c9bec4ee6b4.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30posix_clocks: Use get_itimerspec64() and put_itimerspec64()Deepa Dinamani1-28/+16
Usage of these apis and their compat versions makes the syscalls: timer_settime and timer_gettime and their compat implementations simpler. This patch also serves as a preparatory patch for changing syscalls to use new time_t data types to support the y2038 effort by isolating the processing of user pointers through these apis. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-30timerfd: Use get_itimerspec64() and put_itimerspec64()Deepa Dinamani1-22/+21
Usage of these apis and their compat versions makes the syscalls: timerfd_settime and timerfd_gettime and their compat implementations simpler. This patch also serves as a preparatory patch for changing syscalls to use new time_t data types to support the y2038 effort by isolating the processing of user pointers through these apis. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-30nanosleep: Use get_timespec64() and put_timespec64()Deepa Dinamani5-38/+26
Usage of these apis and their compat versions makes the syscalls: clock_nanosleep and nanosleep and their compat implementations simpler. This is a preparatory patch to isolate data conversions to struct timespec64 at userspace boundaries. This helps contain the changes needed to transition to new y2038 safe types. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-30posix-timers: Use get_timespec64() and put_timespec64()Deepa Dinamani2-76/+70
Usage of these apis and their compat versions makes the syscalls: clock_gettime, clock_settime, clock_getres and their compat implementations simpler. This is a preparatory patch to isolate data conversions to struct timespec64 at userspace boundaries. This helps contain the changes needed to transition to new y2038 safe types. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-30x86/mm: Delete a big outdated comment about TLB flushingAndy Lutomirski1-36/+0
The comment describes the old explicit IPI-based flush logic, which is long gone. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/55e44997e56086528140c5180f8337dc53fb7ffc.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30x86/mm: Don't reenter flush_tlb_func_common()Andy Lutomirski1-2/+15
It was historically possible to have two concurrent TLB flushes targetting the same CPU: one initiated locally and one initiated remotely. This can now cause an OOPS in leave_mm() at arch/x86/mm/tlb.c:47: if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK) BUG(); with this call trace: flush_tlb_func_local arch/x86/mm/tlb.c:239 [inline] flush_tlb_mm_range+0x26d/0x370 arch/x86/mm/tlb.c:317 Without reentrancy, this OOPS is impossible: leave_mm() is only called if we're not in TLBSTATE_OK, but then we're unexpectedly in TLBSTATE_OK in leave_mm(). This can be caused by flush_tlb_func_remote() happening between the two checks and calling leave_mm(), resulting in two consecutive leave_mm() calls on the same CPU with no intervening switch_mm() calls. We never saw this OOPS before because the old leave_mm() implementation didn't put us back in TLBSTATE_OK, so the assertion didn't fire. Nadav noticed the reentrancy issue in a different context, but neither of us realized that it caused a problem yet. Reported-by: Levin, Alexander (Sasha Levin) <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Nadav Amit <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: [email protected] Fixes: 3d28ebceaffa ("x86/mm: Rework lazy TLB to track the actual loaded mm") Link: http://lkml.kernel.org/r/855acf733268d521c9f2e191faee2dcc23a29729.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30x86/uaccess: Optimize copy_user_enhanced_fast_string() for short stringsPaolo Abeni1-2/+5
According to the Intel datasheet, the REP MOVSB instruction exposes a pretty heavy setup cost (50 ticks), which hurts short string copy operations. This change tries to avoid this cost by calling the explicit loop available in the unrolled code for strings shorter than 64 bytes. The 64 bytes cutoff value is arbitrary from the code logic point of view - it has been selected based on measurements, as the largest value that still ensures a measurable gain. Micro benchmarks of the __copy_from_user() function with lengths in the [0-63] range show this performance gain (shorter the string, larger the gain): - in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4 - in the [72%-9%] range on Intel Core i7-4810MQ Other tested CPUs - namely Intel Atom S1260 and AMD Opteron 8216 - show no difference, because they do not expose the ERMS feature bit. Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: Alan Cox <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Hannes Frederic Sowa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kees Cook <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com [ Clarified the changelog. ] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30sched/cputime: Refactor the cputime_adjust() codeGustavo A. R. Silva1-11/+5
Address a Coverity false positive, which is caused by overly convoluted code: Value assigned to variable 'utime' at line 619:utime = rtime; is overwritten at line 642:utime = rtime - stime; before it can be used. This makes such variable assignment useless. Remove this variable assignment and refactor the code related. Addresses-Coverity-ID: 1371643 Signed-off-by: Gustavo A. R. Silva <[email protected]> Cc: Frans Klaver <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Stanislaw Gruszka <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Link: http://lkml.kernel.org/r/20170629184128.GA5271@embeddedgus Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30cpu/hotplug: Constify attribute_group structuresArvind Yadav1-2/+2
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const: File size before: text data bss dec hex filename 12582 15361 20 27963 6d3b kernel/cpu.o File size After adding 'const': text data bss dec hex filename 12710 15265 20 27995 6d5b kernel/cpu.o Signed-off-by: Arvind Yadav <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/f9079e94e12b36d245e7adbf67d312bc5d0250c6.1498737970.git.arvind.yadav.cs@gmail.com Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30sched/debug: Expose the number of RT/DL tasks that can migrateDaniel Bristot de Oliveira1-2/+15
Add the value of the rt_rq.rt_nr_migratory and dl_rq.dl_nr_migratory to the sched_debug output, for instance: rt_rq[0]: .rt_nr_running : 2 .rt_nr_migratory : 1 <--- Like this .rt_throttled : 0 .rt_time : 828.645877 .rt_runtime : 1000.000000 This is useful to debug problems related to the RT/DL schedulers. This also fixes the format of some variables, that were unsigned, rather than signed. Signed-off-by: Daniel Bristot de Oliveira <[email protected]> Cc: Clark Williams <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis Claudio R. Goncalves <[email protected]> Cc: Luiz Capitulino <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: linux-rt-users <[email protected]> Link: http://lkml.kernel.org/r/7896f71cada54ee7dd8507bb666063a2e051c3d4.1498482127.git.bristot@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30perf/x86/intel: Constify the 'lbr_desc[]' array and make a function staticColin Ian King1-2/+2
A few minor clean-ups: constify the lbr_desc[] array and make local function lbr_from_signext_quirk_rd() static to fix a sparse warning: "symbol 'lbr_from_signext_quirk_rd' was not declared. Should it be static?" Signed-off-by: Colin Ian King <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30x86/KASLR: Fix detection 32/64 bit bootloaders for 5-level pagingKirill A. Shutemov1-6/+12
KASLR uses hack to detect whether we booted via startup_32() or startup_64(): it checks what is loaded into cr3 and compares it to _pgtables. _pgtables is the array of page tables where early code allocates page table from. KASLR expects cr3 to point to _pgtables if we booted via startup_32(), but that's not true if we booted with 5-level paging enabled. In this case top level page table is allocated separately and only the first p4d page table is allocated from the array. Let's modify the check to cover both 4- and 5-level paging cases. The patch also renames 'level4p' to 'top_level_pgt' as it now can hold page table for 4th or 5th level, depending on configuration. Signed-off-by: Kirill A. Shutemov <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30mwifiex: do not update MCS set from hostapdGanapathi Bhat2-27/+0
We should not copy the MCS set from hostapd RX-STBC. We have to just use the MCS set supported by the hardware. This fixes an issue, where mwifiex is advertising wrong MCS sets in beacons. Fixes: 474a41e94dfc ("mwifiex: update MCS set as per RX-STBC bit from hostapd") Signed-off-by: Ganapathi Bhat <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-06-30x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bugBaoquan He3-7/+2
Kernel text KASLR is separated into physical address and virtual address randomization. And for virtual address randomization, we only randomiza to get an offset between 16M and KERNEL_IMAGE_SIZE. So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR, but not the original kernel loading address 'output'. The bug will cause kernel boot failure if kernel is loaded at a different position than the address, 16M, which is decided at compiled time. Kexec/kdump is such practical case. To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial value. Tested-by: Dave Young <[email protected]> Signed-off-by: Baoquan He <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30x86/boot/KASLR: Add checking for the offset of kernel virtual address ↵Baoquan He1-0/+2
randomization For kernel text KASLR, the virtual address is confined to area of 1G, [0xffffffff80000000, 0xffffffffc0000000). For the implemenataion of virtual address randomization, we only randomize to get an offset between 16M and 1G, then add this offset to the starting address, 0xffffffff80000000. Here 16M is the offset which is decided at linking stage. So the amount of the local variable 'virt_addr' which respresents the offset plus the kernel output size can not exceed KERNEL_IMAGE_SIZE. Add a debug check for the offset. If out of bounds, print error message and hang there. Suggested-by: Ingo Molnar <[email protected]> Signed-off-by: Baoquan He <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-30ieee80211: update public action codesPeter Oh1-1/+34
Update Public Action field values as updated in IEEE Std 802.11-2016, so that modules/drivers can refer it. Signed-off-by: Peter Oh <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-06-30nl80211: Don't verify owner_nlportid on NAN commandsAndrei Otcheretianski2-13/+4
If NAN interface is created with NL80211_ATTR_SOCKET_OWNER, the socket that is used to create the interface is used for all NAN operations and reporting NAN events. However, it turns out that sending commands and receiving events on the same socket is not possible in a completely race-free way: If the socket buffer is overflowed by the events, the command response will not be sent. In that case the caller will block forever on recv. Using non-blocking socket for commands is more complicated and still the command response or ack may not be received. So, keep unicasting NAN events to the interface creator, but allow using a different socket for commands. Signed-off-by: Andrei Otcheretianski <[email protected]> Signed-off-by: Luca Coelho <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-06-30brcmfmac: switch to using cfg80211_connect_done()Arend van Spriel1-9/+11
The driver used cfg80211_connect_result() which is basically a wrapper around cfg80211_connect_done() passing a subset of the information that can be passed. For upcoming functionality this is not sufficient so switching to use cfg80211_connect_done(). Reviewed-by: Hante Meuleman <[email protected]> Reviewed-by: Pieter-Paul Giesberts <[email protected]> Reviewed-by: Franky Lin <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-06-30brcmfmac: support 4-way handshake offloading for 802.1XArend van Spriel2-4/+60
Adding callbacks for PMK provisioning. If firmware supports offloading it is indicated to user-space that 802.1X offload is supported. Signed-off-by: Arend van Spriel <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-06-30brcmfmac: support 4-way handshake offloading for WPA/WPA2-PSKArend van Spriel7-7/+134
The firmware may have supplicant code built-in. This is detected by the driver and indicated in the wiphy features flags. User-space can use this flag to determine whether or not to provide the pre-shared key material in the nl80211 CONNECT command. Reviewed-by: Gautam (Gautam Kumar) Shukla <[email protected]> Reviewed-by: Hante Meuleman <[email protected]> Reviewed-by: Pieter-Paul Giesberts <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-06-30bpf: don't open-code memdup_user()Al Viro1-29/+16
Signed-off-by: Al Viro <[email protected]>
2017-06-30kimage_file_prepare_segments(): don't open-code memdup_user()Al Viro1-10/+4
Signed-off-by: Al Viro <[email protected]>
2017-06-30ethtool: don't open-code memdup_user()Al Viro1-14/+6
Signed-off-by: Al Viro <[email protected]>
2017-06-30do_ip_setsockopt(): don't open-code memdup_user()Al Viro1-14/+6
Signed-off-by: Al Viro <[email protected]>
2017-06-30do_ipv6_setsockopt(): don't open-code memdup_user()Al Viro1-8/+3
Signed-off-by: Al Viro <[email protected]>
2017-06-30irda: don't open-code memdup_user()Al Viro1-36/+12
and no, GFP_ATOMIC does not make any sense there... Signed-off-by: Al Viro <[email protected]>
2017-06-30xfrm_user_policy(): don't open-code memdup_user()Al Viro1-8/+3
Signed-off-by: Al Viro <[email protected]>
2017-06-30ima_write_policy(): don't open-code memdup_user_nul()Al Viro1-9/+4
Signed-off-by: Al Viro <[email protected]>
2017-06-29tracing/kprobes: Allow to create probe with a module name starting with a digitSabrina Dubroca1-9/+5
Always try to parse an address, since kstrtoul() will safely fail when given a symbol as input. If that fails (which will be the case for a symbol), try to parse a symbol instead. This allows creating a probe such as: p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0 Which is necessary for this command to work: perf probe -m 8021q -a vlan_gro_receive Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net Cc: [email protected] Fixes: 413d37d1e ("tracing: Add kprobe-based event tracer") Acked-by: Masami Hiramatsu <[email protected]> Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-06-30MIPS: Avoid accidental raw backtraceJames Hogan1-0/+2
Since commit 81a76d7119f6 ("MIPS: Avoid using unwind_stack() with usermode") show_backtrace() invokes the raw backtracer when cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels where user and kernel address spaces overlap. However this is used by show_stack() which creates its own pt_regs on the stack and leaves cp0_status uninitialised in most of the code paths. This results in the non deterministic use of the raw back tracer depending on the previous stack content. show_stack() deals exclusively with kernel mode stacks anyway, so explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure we get a useful backtrace. Fixes: 81a76d7119f6 ("MIPS: Avoid using unwind_stack() with usermode") Signed-off-by: James Hogan <[email protected]> Cc: [email protected] Cc: <[email protected]> # 3.15+ Patchwork: https://patchwork.linux-mips.org/patch/16656/ Signed-off-by: Ralf Baechle <[email protected]>
2017-06-30MIPS: Perform post-DMA cache flushes on systems with MAARsPaul Burton1-5/+18
Recent CPUs from Imagination Technologies such as the I6400 or P6600 are able to speculatively fetch data from memory into caches. This means that if used in a system with non-coherent DMA they require that caches be invalidated after a device performs DMA, and before the CPU reads the DMA'd data, in order to ensure that stale values weren't speculatively prefetched. Such CPUs also introduced Memory Accessibility Attribute Registers (MAARs) in order to control the regions in which they are allowed to speculate. Thus we can use the presence of MAARs as a good indication that the CPU requires the above cache maintenance. Use the presence of MAARs to determine the result of cpu_needs_post_dma_flush() in the default case, in order to handle these recent CPUs correctly. Note that the return type of cpu_needs_post_dma_flush() is changed to bool, such that it's clearer what's happening when cpu_has_maar is cast to bool for the return value. If this patch were backported to a pre-v4.7 kernel then MIPS_CPU_MAAR was 1ull<<34, so when cast to an int we would incorrectly return 0. It so happens that MIPS_CPU_MAAR is currently 1ull<<30, so when truncated to an int gives a non-zero value anyway, but even so the implicit conversion from long long int to bool makes it clearer to understand what will happen than the implicit conversion from long long int to int would. The bool return type also fits this usage better semantically, so seems like an all-round win. Thanks to Ed for spotting the issue for pre-v4.7 kernels & suggesting the return type change. Signed-off-by: Paul Burton <[email protected]> Reviewed-by: Bryan O'Donoghue <[email protected]> Tested-by: Bryan O'Donoghue <[email protected]> Cc: Ed Blake <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/16363/ Signed-off-by: Ralf Baechle <[email protected]>
2017-06-30MIPS: Fix IRQ tracing & lockdep when reschedulingPaul Burton1-0/+3
When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler from arch/mips/kernel/entry.S we disable interrupts. This is true regardless of whether we reach work_resched from syscall_exit_work, resume_userspace or by looping after calling schedule(). Although we disable interrupts in these paths we don't call trace_hardirqs_off() before calling into C code which may acquire locks, and we therefore leave lockdep with an inconsistent view of whether interrupts are disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are both enabled. Without tracing this interrupt state lockdep will print warnings such as the following once a task returns from a syscall via syscall_exit_partial with TIF_NEED_RESCHED set: [ 49.927678] ------------[ cut here ]------------ [ 49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8 [ 49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) [ 49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197 [ 49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4 [ 49.974431] 0000000000000000 0000000000000000 0000000000000000 000000000000004a [ 49.985300] ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8 [ 49.996194] 0000000000000001 0000000000000000 0000000000000000 0000000077c8030c [ 50.007063] 000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88 [ 50.017945] 0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498 [ 50.028827] 0000000000000000 0000000000000001 0000000000000000 0000000000000000 [ 50.039688] 0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc [ 50.050575] 00000000140084e0 0000000000000000 0000000000000000 0000000000040a00 [ 50.061448] 0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc [ 50.072327] ... [ 50.076087] Call Trace: [ 50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8 [ 50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190 [ 50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108 [ 50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48 [ 50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8 [ 50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0 [ 50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8 [ 50.129221] [<ffffffff80946a60>] schedule+0x30/0x98 [ 50.135659] [<ffffffff80106278>] work_resched+0x8/0x34 [ 50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]--- [ 50.148405] possible reason: unannotated irqs-off. [ 50.154600] irq event stamp: 400463 [ 50.159566] hardirqs last enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8 [ 50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0 [ 50.183897] softirqs last enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8 [ 50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128 Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off() when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking schedule() following the work_resched label because: 1) Interrupts are disabled regardless of the path we take to reach work_resched() & schedule(). 2) Performing the tracing here avoids the need to do it in paths which disable interrupts but don't call out to C code before hitting a path which uses the RESTORE_SOME macro that will call trace_hardirqs_on() or trace_hardirqs_off() as appropriate. We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling syscall_trace_leave() for similar reasons, ensuring that lockdep has a consistent view of state after we re-enable interrupts. Signed-off-by: Paul Burton <[email protected]> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: [email protected] Cc: stable <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/15385/ Signed-off-by: Ralf Baechle <[email protected]>
2017-06-30MIPS: pm-cps: Drop manual cache-line alignment of ready_countPaul Burton1-8/+1
We allocate memory for a ready_count variable per-CPU, which is accessed via a cached non-coherent TLB mapping to perform synchronisation between threads within the core using LL/SC instructions. In order to ensure that the variable is contained within its own data cache line we allocate 2 lines worth of memory & align the resulting pointer to a line boundary. This is however unnecessary, since kmalloc is guaranteed to return memory which is at least cache-line aligned (see ARCH_DMA_MINALIGN). Stop the redundant manual alignment. Besides cleaning up the code & avoiding needless work, this has the side effect of avoiding an arithmetic error found by Bryan on 64 bit systems due to the 32 bit size of the former dlinesz. This led the ready_count variable to have its upper 32b cleared erroneously for MIPS64 kernels, causing problems when ready_count was later used on MIPS64 via cpuidle. Signed-off-by: Paul Burton <[email protected]> Fixes: 3179d37ee1ed ("MIPS: pm-cps: add PM state entry code for CPS systems") Reported-by: Bryan O'Donoghue <[email protected]> Reviewed-by: Bryan O'Donoghue <[email protected]> Tested-by: Bryan O'Donoghue <[email protected]> Cc: [email protected] Cc: stable <[email protected]> # v3.16+ Patchwork: https://patchwork.linux-mips.org/patch/15383/ Signed-off-by: Ralf Baechle <[email protected]>
2017-06-29ARM: 8685/1: ensure memblock-limit is pmd-alignedDoug Berger1-4/+4
The pmd containing memblock_limit is cleared by prepare_page_table() which creates the opportunity for early_alloc() to allocate unmapped memory if memblock_limit is not pmd aligned causing a boot-time hang. Commit 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM") attempted to resolve this problem, but there is a path through the adjust_lowmem_bounds() routine where if all memory regions start and end on pmd-aligned addresses the memblock_limit will be set to arm_lowmem_limit. Since arm_lowmem_limit can be affected by the vmalloc early parameter, the value of arm_lowmem_limit may not be pmd-aligned. This commit corrects this oversight such that memblock_limit is always rounded down to pmd-alignment. Fixes: 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM") Signed-off-by: Doug Berger <[email protected]> Suggested-by: Mark Rutland <[email protected]> Signed-off-by: Russell King <[email protected]>
2017-06-29nfsd: remove nfsd_vfs_readChristoph Hellwig1-11/+6
Simpler done in the only caller. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29nfsd: use vfs_iter_read/writeChristoph Hellwig1-10/+7
Instead of messing with the address limit to use vfs_read/vfs_writev. Note that this requires that exported file implement ->read_iter and ->write_iter. All currently exportable file systems do this. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29fs: implement vfs_iter_write using do_iter_writeChristoph Hellwig6-26/+16
De-dupliate some code and allow for passing the flags argument to vfs_iter_write. Additionally it now properly updates timestamps. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29fs: implement vfs_iter_read using do_iter_readChristoph Hellwig5-25/+15
De-dupliate some code and allow for passing the flags argument to vfs_iter_read. Additional it properly updates atime now. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29fs: move more code into do_iter_read/do_iter_writeChristoph Hellwig1-45/+28
The checks for the permissions and can read / write flags are common for the callers. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29fs: remove __do_readv_writevChristoph Hellwig1-24/+36
Split it into one helper each for reads vs writes. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29fs: remove do_compat_readv_writevChristoph Hellwig1-26/+16
opencode it in both callers to simplify the call stack a bit. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29fs: remove do_readv_writevChristoph Hellwig1-22/+21
opencode it in both callers to simplify the call stack a bit. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds42-84/+350
Pull networking fixes from David Miller: 1) Need to access netdev->num_rx_queues behind an accessor in netvsc driver otherwise the build breaks with some configs, from Arnd Bergmann. 2) Add dummy xfrm_dev_event() so that build doesn't fail when CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu. 3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan Carpenter. 4) Fix MCDI command size for filter operations in sfc driver, from Martin Habets. 5) Fix UFO segmenting so that we don't calculate incorrect checksums, from Michal Kubecek. 6) When ipv6 datagram connects fail, reset destination address and port. From Wei Wang. 7) TCP disconnect must reset the cached receive DST, from WANG Cong. 8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric Dumazet. 9) fman driver has to depend on HAS_DMA, from Madalin Bucur. 10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann. 11) Fix negative page counts with GFO, from Michal Kubecek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) sfc: fix attempt to translate invalid filter ID net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() bpf: prevent leaking pointer via xadd on unpriviledged arcnet: com20020-pci: add missing pdev setup in netdev structure arcnet: com20020-pci: fix dev_id calculation arcnet: com20020: remove needless base_addr assignment Trivial fix to spelling mistake in arc_printk message arcnet: change irq handler to lock irqsave rocker: move dereference before free mlxsw: spectrum_router: Fix NULL pointer dereference net: sched: Fix one possible panic when no destroy callback virtio-net: serialize tx routine during reset net: usb: asix88179_178a: Add support for the Belkin B2B128 fsl/fman: add dependency on HAS_DMA net: prevent sign extension in dev_get_stats() tcp: reset sk_rx_dst in tcp_disconnect() net: ipv6: reset daddr and dport in sk if connect() fails bnx2x: Don't log mc removal needlessly bnxt_en: Fix netpoll handling. bnxt_en: Add missing logic to handle TPA end error conditions. ...
2017-06-29cpufreq: Update scaling_cur_freq documentationRafael J. Wysocki1-6/+6
Commit f8475cef9008 "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" modified the way the scaling_cur_freq cpufreq policy attribute in sysfs is handled on contemporary Intel-based x86 systems, so update the documentation to reflect that change. Signed-off-by: Rafael J. Wysocki <[email protected]>
2017-06-29cpufreq: intel_pstate: Clean up after performance governor changesRafael J. Wysocki2-10/+2
After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in "performance" mode) get_target_pstate_use_performance() and get_target_pstate_use_cpu_load() are never called if scaling_governor is "performance", so drop the CPUFREQ_POLICY_PERFORMANCE checks from them as they will never trigger anyway. Moreover, the documentation needs to be updated to reflect the change made by the above commit, so do that too. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Srinivas Pandruvada <[email protected]>
2017-06-29Merge tag 'for-4.12/dm-fixes-5' of ↵Linus Torvalds2-16/+27
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - dm thinp fix for crash that will occur when metadata device failure races with discard passdown to the underlying data device. - dm raid fix to not access the superblock's >= 1.9.0 'sectors' member unconditionally. * tag 'for-4.12/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: do not queue freed thin mapping for next stage processing dm raid: fix oops on upgrading to extended superblock format