Age | Commit message | Author | Files | Lines
2018-08-02 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds | 19 | -30/+184
Pull networking fixes from David Miller: "Fixes keep trickling in: 1) Various IP fragmentation memory limit hardening changes from Eric Dumazet. 2) Revert ipv6 metrics leak change, it causes more problems than it fixes for now. 3) Fix WoL regression in stmmac driver, from Jose Abreu. 4) Netlink socket spectre v1 gadget fix, from Jeremy Cline" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: Revert "net/ipv6: fix metrics leak" rxrpc: Fix user call ID check in rxrpc_service_prealloc_one net: dsa: Do not suspend/resume closed slave_dev netlink: Fix spectre v1 gadget in netlink_create() Documentation: dpaa2: Use correct heading adornment net: stmmac: Fix WoL for PCI-based setups bonding: avoid lockdep confusion in bond_get_stats() enic: do not call enic_change_mtu in enic_probe ipv4: frags: handle possible skb truesize change inet: frag: enforce memory limits earlier net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow net/mlx5e: Fix null pointer access when setting MTU of vport representor net/mlx5e: Set port trust mode to PCP as default net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager net: dsa: mv88e6xxx: Fix SERDES support on 88E6141/6341 brcmfmac: fix regression in parsing NVRAM for multiple devices iwlwifi: add more card IDs for 9000 series
2018-08-02 | Squashfs: Compute expected length from inode size rather than block length | Phillip Lougher | 4 | -23/+24
Previously in squashfs_readpage() when copying data into the page cache, it used the length of the datablock read from the filesystem (after decompression). However, if the filesystem has been corrupted this data block may be short, which will leave pages unfilled. The fix for this is to compute the expected number of bytes to copy from the inode size, and use this to detect if the block is short. Signed-off-by: Phillip Lougher <[email protected]> Tested-by: Willy Tarreau <[email protected]> Cc: Анатолий Тросиненко <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
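The idea of deriving the expected byte count from the inode size can be sketched in user-space C. This is only an illustration under stated assumptions: `expected_block_bytes()` and `block_is_short()` are hypothetical names, not functions from the squashfs sources, and the sketch ignores fragments and sparse blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's code): how many bytes of file data
 * the data block at `index` should contain, judged from the file size alone.
 */
static size_t expected_block_bytes(uint64_t file_size, uint32_t block_size,
                                   uint64_t index)
{
    assert(block_size != 0);

    uint64_t last = file_size / block_size;  /* index of the tail block */

    if (index < last)
        return block_size;                        /* a full block */
    if (index == last)
        return (size_t)(file_size % block_size);  /* partial tail, may be 0 */
    return 0;                                     /* beyond end of file */
}

/* A decompressed block shorter than expected points at a corrupted image. */
static int block_is_short(size_t decompressed_len, size_t expected)
{
    return decompressed_len < expected;
}
```

A caller would compare the length returned by the decompressor against `expected_block_bytes()` and treat a short block as an error rather than leaving pages partly filled.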
2018-08-02 | squashfs: more metadata hardening | Linus Torvalds | 3 | -6/+13
The squashfs fragment reading code doesn't actually verify that the fragment is inside the fragment table. The end result _is_ verified to be inside the image when actually reading the fragment data, but before that is done, we may end up taking a page fault because the fragment table itself might not even exist. Another report from Anatoly and his endless squashfs image fuzzing. Reported-by: Анатолий Тросиненко <[email protected]> Acked-by: Phillip Lougher <[email protected]> Cc: Willy Tarreau <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
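The missing check described above amounts to validating the fragment number against the table size before using it to index anything. A minimal user-space sketch, assuming a toy layout of 16-byte entries packed into 8192-byte metadata blocks (both constants and the name `fragment_slot()` are assumptions made for illustration, not taken from the patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy layout constants, assumed for this sketch only. */
#define TOY_ENTRY_BYTES        16u
#define TOY_METADATA_BYTES     8192u
#define TOY_ENTRIES_PER_BLOCK  (TOY_METADATA_BYTES / TOY_ENTRY_BYTES)

/*
 * Hypothetical helper: locate the table slot for `fragment`.
 * Returns 0 on success, -1 if the fragment number lies outside the table.
 */
static int fragment_slot(uint32_t fragments, uint32_t fragment,
                         uint32_t *index_block, uint32_t *offset)
{
    assert(index_block != NULL && offset != NULL);

    /* Reject out-of-table fragments before touching any table memory. */
    if (fragment >= fragments)
        return -1;

    *index_block = fragment / TOY_ENTRIES_PER_BLOCK;
    *offset = (fragment % TOY_ENTRIES_PER_BLOCK) * TOY_ENTRY_BYTES;
    return 0;
}
```

The point of the ordering is that an image with zero fragments (and therefore no table at all) fails the first comparison and never reaches the indexing step.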
2018-08-02 | perf trace: Associate vfs_getname()'ed pathname with fd returned from 'openat' | Arnaldo Carvalho de Melo | 1 | -5/+8
When the vfs_getname() wannabe tracepoint is in place:
  # perf probe -l
    probe:vfs_getname (on getname_flags:73@acme/git/linux/fs/namei.c with pathname)
  #
'perf trace' will use it to get the pathname when it is copied from userspace to the kernel, right after syscalls:sys_enter_open, copied in the 'probe:vfs_getname', stash it somewhere and then, at syscalls:sys_exit_open time, if the 'open' return is not -1, i.e. a successful open syscall, associate that pathname with this return, i.e. the fd. We were not doing this for the 'openat' syscall, which would cause 'perf trace' to fall back to using /proc to get the fd. Change it so that we use what we got from probe:vfs_getname, reducing the 'openat' beautification process cost, ditching the syscalls performed to read procfs state and avoiding some possible races in the process. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-08-02 | stop_machine: Reflow cpu_stop_queue_two_works() | Peter Zijlstra | 1 | -18/+23
The code flow in cpu_stop_queue_two_works() is a little arcane; fix this by lifting the preempt_disable() to the top to create more natural nesting wrt the spinlocks and make the wake_up_q() and preempt_enable() unconditional at the end. Furthermore, enable preemption in the -EDEADLK case, such that we spin-wait with preemption enabled. Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | clockevents: Warn if cpu_all_mask is used as cpumask | Sudeep Holla | 1 | -0/+6
Using cpu_all_mask in clockevents cpumask may result in issues while comparing multiple clockevent devices to choose the preferred one. On one of the platforms with 2 system (i.e. non per-CPU) timers with different ratings, having cpu_all_mask for one of the devices resulted in a boot hang due to an endless loop in clockevents_notify_released() as both clocksources were selected as preferred. In order to prevent such issues in the future, warn if any clockevent driver sets cpu_all_mask as its cpumask and just override it to use cpu_possible_mask. All the existing occurrences of cpu_all_mask are already replaced with cpu_possible_mask. Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | tick/broadcast-hrtimer: Use cpu_possible_mask for ce_broadcast_hrtimer | Sudeep Holla | 1 | -1/+1
This is the last instance of cpu_all_mask usage in the core framework. Replace it with cpu_possible_mask like all other instances in the clockevent drivers. This makes it possible to add a warning in the core clockevents_register_device on usage of cpu_all_mask from any clockevent drivers in the future. Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | clocksource/drivers/arm_arch_timer: Fix bogus cpu_all_mask usage | Thomas Gleixner | 1 | -1/+1
Using cpu_all_mask as target mask for clockevents is wrong, as clockevents can never actually target CPUs that are not possible. Use cpu_possible_mask instead. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: Daniel Lezcano <[email protected]>
2018-08-02 | x86/iommu: Use NULL instead of 0 | Zhong Jiang | 1 | -1/+1
Fixes the following sparse warning: arch/x86/kernel/pci-iommu_table.c:63:37: warning: Using plain integer as NULL pointer Signed-off-by: zhong jiang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | x86/boot: Use CC_SET()/CC_OUT() instead of open coding it | Uros Bizjak | 2 | -3/+5
Replace open-coded uses of set instructions with CC_SET()/CC_OUT(). Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | x86/mm: Remove redundant check for kmem_cache_create() | Chengguang Xu | 1 | -3/+0
The flag 'SLAB_PANIC' implies panic on failure, so there is no need to check the returned pointer for NULL. Signed-off-by: Chengguang Xu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | x86/platform/UV: Remove redundant check of p == q | Colin Ian King | 1 | -2/+0
The check for p == q is dead code because the preceding switch statements jump to the end of the outer for-loop with continue statements. Remove the dead code. Detected by CoverityScan, CID#145071 ("Structurally dead code") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | x86/platform/olpc: Use PTR_ERR_OR_ZERO() | zhong jiang | 1 | -3/+1
Replace the open coded equivalent with PTR_ERR_OR_ZERO(). Signed-off-by: zhong jiang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
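For readers unfamiliar with the idiom, the equivalence can be shown with a user-space imitation of the kernel's error-pointer convention. The lowercase helpers below are stand-ins written for this sketch and are not the kernel's own macros; the sketch assumes that pointers and `long` have the same width.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static void *err_ptr(long error)
{
    assert(error < 0 && error >= -TOY_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* The open-coded pattern that the commit replaces. */
static int open_coded(const void *ptr)
{
    if (is_err(ptr))
        return (int)ptr_err(ptr);
    return 0;
}

/* The same logic folded into one helper. */
static int ptr_err_or_zero(const void *ptr)
{
    return is_err(ptr) ? (int)ptr_err(ptr) : 0;
}
```

Both functions return the encoded errno for an error pointer and 0 for a valid pointer, which is why the three-line form can be collapsed into a single return statement.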
2018-08-02 | x86/boot/compressed/64: Validate trampoline placement against E820 | Kirill A. Shutemov | 1 | -18/+55
There were two reports of boot failure caused by the trampoline being placed into a reserved memory region. It can happen on machines that don't report EBDA correctly. Fix the problem by re-validating the found address against the E820 table. If the address is in a reserved area, find the next usable region below the initial address. Fixes: 3548e131ec6a ("x86/boot/compressed/64: Find a place for 32-bit trampoline") Reported-by: Dmitry Malkin <[email protected]> Reported-by: youling 257 <[email protected]> Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
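The re-validation step can be pictured as a search over a memory map for the highest aligned slot that lies below the initial address and sits entirely inside usable memory. The sketch below is a simplified user-space model: `struct toy_region`, `place_trampoline()` and the region types are hypothetical, and the actual patch may walk the table differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_type { TOY_USABLE = 1, TOY_RESERVED = 2 };

/* One entry of a toy memory map, loosely modelled on an E820 entry. */
struct toy_region {
    uint64_t start;
    uint64_t size;
    enum toy_type type;
};

/*
 * Find the highest `align`-aligned address so that [addr, addr + tramp_size)
 * lies inside a usable region and ends at or below `top`.
 * Returns 1 and stores the address in *out, or returns 0 if nothing fits.
 */
static int place_trampoline(const struct toy_region *map, size_t n,
                            uint64_t top, uint64_t tramp_size,
                            uint64_t align, uint64_t *out)
{
    int found = 0;
    uint64_t best = 0;

    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

    for (size_t i = 0; i < n; i++) {
        if (map[i].type != TOY_USABLE)
            continue;                       /* never place in reserved memory */

        uint64_t start = map[i].start;
        uint64_t end = start + map[i].size;

        if (end > top)
            end = top;                      /* stay below the initial address */
        if (end <= start || end - start < tramp_size)
            continue;                       /* region too small or too high */

        uint64_t cand = (end - tramp_size) & ~(align - 1);
        if (cand < start)
            continue;                       /* alignment pushed it out */

        if (!found || cand > best) {
            best = cand;
            found = 1;
        }
    }

    if (found)
        *out = best;
    return found;
}
```

With a map whose first region is usable up to 0x9d000 and whose next region is reserved, an initial address of 0xa0000 yields a placement just below the reserved area instead of inside it.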
2018-08-02 | debugobjects: Remove redundant NULL pointer check | Zhong Jiang | 1 | -2/+1
kmem_cache_destroy() has a built in NULL pointer check, so the one at the call can be removed. Signed-off-by: Zhong Jiang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | clocksource: ti-32k: Remove CLOCK_SOURCE_SUSPEND_NONSTOP flag | Keerthy | 1 | -2/+1
Since commit 39232ed5a179 ("time: Introduce one suspend clocksource to compensate the suspend time") suspend/resume fails on AM437x platforms as the clocksource actually stops in suspend. Hence remove the CLOCK_SOURCE_SUSPEND_NONSTOP flag. Suggested-by: Grygorii Strashko <[email protected]> Signed-off-by: Keerthy <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | timers: Clear timer_base::must_forward_clk with timer_base::lock held | Gaurav Kohli | 1 | -13/+16
timer_base::must_forward_clk indicates that the base clock might be stale due to a long idle sleep. The forwarding of the base clock takes place in the timer softirq or when a timer is enqueued to a base which is idle. If the enqueue of a timer to an idle base happens from a remote CPU, then the following race can happen:
  CPU0                                  CPU1
  run_timer_softirq                     mod_timer
                                        base = lock_timer_base(timer);
  base->must_forward_clk = false
                                        if (base->must_forward_clk)
                                            forward(base); -> skipped
                                        enqueue_timer(base, timer, idx);
                                        -> idx is calculated high due to stale base
                                        unlock_timer_base(timer);
  base = lock_timer_base(timer);
  forward(base);
The root cause is that timer_base::must_forward_clk is cleared outside the timer_base::lock held region, so the remote queuing CPU observes it as cleared, but the base clock is still stale. This can cause large granularity values for timers, i.e. the accuracy of the expiry time suffers. Prevent this by clearing the flag with timer_base::lock held, so that the forwarding takes place before the cleared flag is observable by a remote CPU. Signed-off-by: Gaurav Kohli <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-02 | Merge tag 'perf-core-for-mingo-4.19-20180801' of ↵ | Ingo Molnar | 29 | -49/+556
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf trace: (Arnaldo Carvalho de Melo)
- Do not require --no-syscalls to suppress strace like output, i.e.
    # perf trace -e sched:*switch
  will show just sched:sched_switch events, not strace-like formatted syscall events, use --syscalls to get the previous behaviour. If instead:
    # perf trace
  is used, i.e. no events specified, then --syscalls is implied and system wide strace like formatting will be applied to all syscalls. The behaviour when just a syscall subset is used with '-e' is unchanged:
    # perf trace -e *sleep,sched:*switch
  will work as before: just the 'nanosleep' syscall will be strace-like formatted plus the sched:sched_switch tracepoint event, system wide.
- Allow string table generators to use a default header dir, allowing use of them without parameters to see the table it generates on stdout, e.g.:
    $ tools/perf/trace/beauty/kvm_ioctl.sh
    static const char *kvm_ioctl_cmds[] = {
        [0x00] = "GET_API_VERSION",
        [0x01] = "CREATE_VM",
        [0x02] = "GET_MSR_INDEX_LIST",
        [0x03] = "CHECK_EXTENSION",
        <BIG SNIP>
        [0xe0] = "CREATE_DEVICE",
        [0xe1] = "SET_DEVICE_ATTR",
        [0xe2] = "GET_DEVICE_ATTR",
        [0xe3] = "HAS_DEVICE_ATTR",
    };
    $
  See 'ls tools/perf/trace/beauty/*.sh' to see the available string table generators.
- Add a generator for IPPROTO_ socket's protocol constants.
perf record: (Kan Liang)
- Fix error out while applying initial delay and using LBR, due to the use of a PERF_TYPE_SOFTWARE/PERF_COUNT_SW_DUMMY event to track PERF_RECORD_MMAP events while waiting for the initial delay. Such events fail when configured asking PERF_SAMPLE_BRANCH_STACK in perf_event_attr.sample_type.
perf c2c: (Jiri Olsa)
- Fix report crash for empty browser, when processing a perf.data file without events of interest, either because not asked for in 'perf record' or because the workload didn't trigger such events.
perf list: (Michael Petlan)
- Align metric group description format with PMU event description.
perf tests: (Sandipan Das)
- Fix indexing when invoking subtests, which caused BPF tests to get results for the next test in the list, with the last one reporting a failure.
eBPF:
- Fix installation directory for header files included from eBPF proggies, avoiding clashing with relative paths used to build other software projects such as glibc. (Thomas Richter)
- Show better message when failing to load an object. (Arnaldo Carvalho de Melo)
General: (Christophe Leroy)
- Allow overriding MAX_NR_CPUS at compile time, to make the tooling usable in systems with less memory. In time, this has to be changed to properly allocate based on _NPROCESSORS_ONLN.
Architecture specific:
- Update arm64's ThunderX2 implementation defined pmu core events (Ganapatrao Kulkarni)
- Fix complex event name parsing in 'perf test' for PowerPC, where the 'umask' event modifier isn't present. (Sandipan Das)
CoreSight ARM hardware tracing: (Leo Yan)
- Fix start tracing packet handling.
- Support dummy address value for CS_ETM_TRACE_ON packet.
- Generate branch sample when receiving a CS_ETM_TRACE_ON packet.
- Generate branch sample for CS_ETM_TRACE_ON packet.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2018-08-02 | Merge branch 'perf/urgent' into perf/core, to pick up fixes | Ingo Molnar | 231 | -1094/+2290
Signed-off-by: Ingo Molnar <[email protected]>
2018-08-01 | Revert "net/ipv6: fix metrics leak" | David S. Miller | 1 | -14/+4
This reverts commit df18b50448fab1dff093731dfd0e25e77e1afcd1. This change causes other problems and use-after-free situations as found by syzbot. Signed-off-by: David S. Miller <[email protected]>
2018-08-01 | NFSv4: Fix _nfs4_do_setlk() | Trond Myklebust | 1 | -13/+13
The patch to fix the case where a lock request was interrupted ended up changing default handling of errors such as NFS4ERR_DENIED and caused the client to immediately resend the lock request. Let's do a partial revert of that request so that the default is now to exit, but change the way we handle resends to take into account the fact that the user may have interrupted the request. Reported-by: Kenneth Johansson <[email protected]> Fixes: a3cf9bca2ace ("NFSv4: Don't add a new lock on an interrupted wait..") Cc: Benjamin Coddington <[email protected]> Cc: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Reviewed-by: Jeff Layton <[email protected]>
2018-08-01 | Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm | Linus Torvalds | 1 | -1/+3
Pull ARM fix from Russell King: "Just a single fix this time around for recent binutils causing build problems when generating Thumb-2 code" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+
2018-08-01 | mm: do not initialize TLB stack vma's with vma_init() | Linus Torvalds | 5 | -17/+12
Commit 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments") tried to initialize various left-over ad-hoc vma's "properly", but actually made things worse for the temporary vma's used for TLB flushing. vma_init() doesn't actually initialize all of the vma, just a few fields, so doing something like - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; + struct vm_area_struct vma; + + vma_init(&vma, tlb->mm); was actually very bad: instead of having a nicely initialized vma with every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma with only a couple of fields initialized. And they weren't even fields that the code in question mostly cared about. The flush_tlb_range() function takes a "struct vma" rather than a "struct mm_struct", because a few architectures actually care about what kind of range it is - being able to only do an ITLB flush if it's a range that doesn't have data accesses enabled, for example. And all the normal users already have the vma for doing the range invalidation. But a few people want to call flush_tlb_range() with a range they just made up, so they also end up using a made-up vma. x86 just has a special "flush_tlb_mm_range()" function for this, but other architectures (arm and ia64) do the "use fake vma" thing instead, and thus got caught up in the vma_init() changes. At the same time, the TLB flushing code really doesn't care about most other fields in the vma, so vma_init() is just unnecessary and pointless. This fixes things by having an explicit "this is just an initializer for the TLB flush" initializer macro, which is used by the arm/arm64/ia64 people who mis-use this interface with just a dummy vma. 
Fixes: 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments") Cc: Dmitry Vyukov <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Kirill Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: John Stultz <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
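The difference the message describes comes from a C language rule: a designated initializer zeroes every member it does not name, whereas an init function that assigns a few fields of an otherwise uninitialized stack object leaves the rest indeterminate. A small user-space sketch of the "explicit initializer macro" approach, using made-up names (`struct toy_vma`, `TOY_FLUSH_VMA`) rather than the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

struct toy_mm { int id; };

/* A cut-down stand-in for a vma; field names are for illustration only. */
struct toy_vma {
    struct toy_mm *vm_mm;
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_flags;
    const void *vm_ops;
};

/* Hypothetical initializer macro: names two members, the rest become zero. */
#define TOY_FLUSH_VMA(mm, flags) { .vm_mm = (mm), .vm_flags = (flags) }

/*
 * By contrast, a function of this shape sets only what it assigns. Called on
 * an uninitialized stack object, it leaves every other member indeterminate.
 */
static void toy_vma_init(struct toy_vma *vma, struct toy_mm *mm)
{
    vma->vm_mm = mm;
    vma->vm_ops = NULL;
}

/* Build a dummy vma for a flush using the initializer macro. */
static struct toy_vma make_flush_vma(struct toy_mm *mm, unsigned long flags)
{
    assert(mm != NULL);

    struct toy_vma vma = TOY_FLUSH_VMA(mm, flags);
    return vma;
}
```

Code that receives the result of `make_flush_vma()` can rely on `vm_start`, `vm_end` and `vm_ops` being zero, which is the property the vma_init() conversion had lost for the temporary TLB-flush vmas.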
2018-08-01 | mm: delete historical BUG from zap_pmd_range() | Hugh Dickins | 1 | -4/+2
Delete the old VM_BUG_ON_VMA() from zap_pmd_range(), which asserted that mmap_sem must be held when splitting an "anonymous" vma there. Whether that's still strictly true nowadays is not entirely clear, but the danger of sometimes crashing on the BUG is now fairly clear. Even with the new stricter rules for anonymous vma marking, the condition it checks for can possibly trigger. Commit 44960f2a7b63 ("staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem pages") is good, and originally I thought it was safe from that VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE) insists on VM_SHARED. But after I read John's earlier mail, drawing attention to the vfs_fallocate() in there: I may be wrong, and I don't know if Android has THP in the config anyway, but it looks to me like an unmap_mapping_range() from ashmem's vfs_fallocate() could hit precisely the VM_BUG_ON_VMA(), once it's vma_is_anonymous(). Signed-off-by: Hugh Dickins <[email protected]> Cc: John Stultz <[email protected]> Cc: Kirill Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-08-01 | perf trace: Do not require --no-syscalls to suppress strace like output | Arnaldo Carvalho de Melo | 1 | -8/+3
So far the --syscalls option was the default, requiring explicit --no-syscalls when wanting to process just some other event, invert that and assume it only when no other event was specified, allowing its explicit enablement when wanting to see all syscalls together with some other event: E.g: The existing default is maintained for a single workload: # perf trace sleep 1 <SNIP> 0.264 ( 0.003 ms): sleep/12762 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7f62cbf04000 0.271 ( 0.001 ms): sleep/12762 close(fd: 3) = 0 0.295 (1000.130 ms): sleep/12762 nanosleep(rqtp: 0x7ffd15194fd0) = 0 1000.469 ( 0.006 ms): sleep/12762 close(fd: 1) = 0 1000.480 ( 0.004 ms): sleep/12762 close(fd: 2) = 0 1000.502 ( ): sleep/12762 exit_group() # For a pid: # pidof ssh 7826 3961 3226 2628 2493 # perf trace -p 3961 ? ( ): ... [continued]: select()) = 1 0.023 ( 0.005 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce870 ) = 0 0.036 ( 0.009 ms): read(fd: 5</dev/pts/7>, buf: 0x7ffcc8fca7b0, count: 16384 ) = 3 0.060 ( 0.004 ms): getpid( ) = 3961 (ssh) 0.079 ( 0.004 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce8e0 ) = 0 0.088 ( 0.003 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce7c0 ) = 0 <SNIP> For system wide, threads, cgroups, user, etc when no event is specified, the existing behaviour is maintained, i.e. --syscalls is selected. 
When some event is specified, then --no-syscalls doesn't need to be specified: # perf trace -e tcp:tcp_probe ssh localhost 0.000 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=53 snd_nxt=0xb67ce8f7 snd_una=0xb67ce8f7 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43690 0.010 tcp:tcp_probe:src=[::1]:39074 dest=[::1]:22 mark=0 length=32 snd_nxt=0xa8f9ef38 snd_una=0xa8f9ef23 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43690 srtt=31 rcv_wnd=43776 4.525 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=1240 snd_nxt=0xb67ce90c snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43776 7.242 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=80 snd_nxt=0xb67ced44 snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=174720 The authenticity of host 'localhost (::1)' can't be established. ECDSA key fingerprint is SHA256:TKZS58923458203490asekfjaklskljmkjfgPMBfHzY. ECDSA key fingerprint is MD5:d8:29:54:40:71:fa:b8:44:89:52:64:8a:35:42:d0:e8. Are you sure you want to continue connecting (yes/no)? ^C # To get the previous behaviour just use --syscalls and get all syscalls formatted strace like + the specified extra events: # trace -e sched:*switch --syscalls sleep 1 <SNIP> 0.160 ( 0.003 ms): sleep/12877 mprotect(start: 0x7fdfe2361000, len: 4096, prot: READ) = 0 0.164 ( 0.009 ms): sleep/12877 munmap(addr: 0x7fdfe2345000, len: 113155) = 0 0.211 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce68e000 0.212 ( 0.002 ms): sleep/12877 brk(brk: 0x55d3ce6af000) = 0x55d3ce6af000 0.215 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce6af000 0.219 ( 0.004 ms): sleep/12877 open(filename: 0xe1f07c00, flags: CLOEXEC) = 3 0.225 ( 0.001 ms): sleep/12877 fstat(fd: 3, statbuf: 0x7fdfe2138aa0) = 0 0.227 ( 0.003 ms): sleep/12877 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7fdfdb1b8000 0.234 ( 0.001 ms): sleep/12877 close(fd: 3) = 0 0.257 ( ): sleep/12877 nanosleep(rqtp: 0x7fffb36b6020) ... 
0.260 ( ): sched:sched_switch:prev_comm=sleep prev_pid=12877 prev_prio=120 prev_state=D ==> next_comm=swapper/3 next_pid=0 next_prio=120
0.257 (1000.134 ms): sleep/12877 ... [continued]: nanosleep()) = 0
1000.428 ( 0.006 ms): sleep/12877 close(fd: 1) = 0
1000.440 ( 0.004 ms): sleep/12877 close(fd: 2) = 0
1000.461 ( ): sleep/12877 exit_group()
#
When specifying just some syscalls, the behaviour doesn't change, i.e.:
# trace -e nanosleep -e sched:*switch sleep 1
0.000 ( ): sleep/14974 nanosleep(rqtp: 0x7ffc344ba9c0 ) ...
0.007 ( ): sched:sched_switch:prev_comm=sleep prev_pid=14974 prev_prio=120 prev_state=D ==> next_comm=swapper/2 next_pid=0 next_prio=120
0.000 (1000.139 ms): sleep/14974 ... [continued]: nanosleep()) = 0
#
Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-08-01 | rxrpc: Fix user call ID check in rxrpc_service_prealloc_one | YueHaibing | 1 | -2/+2
The loop only checks that the user call ID isn't already in use, so it should compare user_call_ID with xcall->user_call_ID, which is the current node's user_call_ID. Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Suggested-by: David Howells <[email protected]> Signed-off-by: YueHaibing <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
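The class of bug is easy to see in a plain binary search tree: while descending, the key being inserted has to be compared with the key of the node currently visited, not with a key belonging to the object being inserted. The following sketch uses an ordinary unbalanced tree and hypothetical names (`struct toy_call`, `insert_unique()`); it is not the rxrpc code, which uses the kernel's rbtree.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a call object indexed by its user-supplied ID. */
struct toy_call {
    unsigned long user_call_id;
    struct toy_call *left;
    struct toy_call *right;
};

/*
 * Insert `call` unless its ID is already present.
 * Returns 0 on success, -1 if the ID is in use.
 */
static int insert_unique(struct toy_call **root, struct toy_call *call)
{
    struct toy_call **pp = root;

    assert(root != NULL && call != NULL);
    call->left = call->right = NULL;

    while (*pp) {
        struct toy_call *cur = *pp;   /* the node we are standing on */

        /* Compare against cur's ID; comparing against call's own ID
         * would always report "equal" or never descend correctly. */
        if (call->user_call_id < cur->user_call_id)
            pp = &cur->left;
        else if (call->user_call_id > cur->user_call_id)
            pp = &cur->right;
        else
            return -1;                /* ID already in use */
    }

    *pp = call;
    return 0;
}
```

Inserting IDs 50, 30 and 70 succeeds, while a second object carrying ID 30 is rejected because the walk reaches the existing node with that ID.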
2018-08-01 | Merge tag 'mmc-v4.18-rc5' of ↵ | Linus Torvalds | 1 | -1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "MMC host: mxcmmc: Fix build error for powerpc" * tag 'mmc-v4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: mxcmmc: Fix missing parentheses and brace
2018-08-01 | Merge tag 'pm-urgent-4.18' of ↵ | Linus Torvalds | 3 | -67/+74
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the scope of a recent intel_pstate driver optimization used incorrectly on some systems due to processor identification ambiguity and fix a few issues in the turbostat utility, including three recent regressions. Specifics: - Use ACPI FADT preferred PM Profile to distinguish Skylake desktop processors from some server ones with the same model number in order to limit the scope of the recent IO-wait boost optimization to servers, as intended (Srinivas Pandruvada). - Fix several issues in the turbostat utility: * Fix the -S option on 1-CPU systems (Len Brown). * Fix computations using incorrect processor core counts (Artem Bityutskiy). * Fix the x2apic debug message (Len Brown). * Fix logical node enumeration to allow for non-sequential physical nodes (Prarit Bhargava). * Fix reported family on modern AMD processors (Calvin Walton). * Clarify the RAPL column information in the man page (Len Brown)" * tag 'pm-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms tools/power turbostat: version 18.07.27 tools/power turbostat: Read extended processor family from CPUID tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes tools/power turbostat: fix x2apic debug message output file tools/power turbostat: fix bogus summary values tools/power turbostat: fix -S on UP systems tools/power turbostat: Update turbostat(8) RAPL throttling column description
2018-08-01 | squashfs metadata 2: electric boogaloo | Linus Torvalds | 3 | -14/+20
Anatoly continues to find issues with fuzzed squashfs images. This time, corrupt, missing, or undersized data for the page filling wasn't checked for, because the squashfs_{copy,read}_cache() functions did the squashfs_copy_data() call without checking the resulting data size. Which could result in the page cache pages being incompletely filled in, and no error indication to the user space reading garbage data. So make a helper function for the "fill in pages" case, because the exact same incomplete sequence existed in two places. [ I should have made a squashfs branch for these things, but I didn't intend to start doing them in the first place. My historical connection through cramfs is why I got into looking at these issues at all, and every time I (continue to) think it's a one-off. Because _this_ time is always the last time. Right? - Linus ] Reported-by: Anatoly Trosinenko <[email protected]> Tested-by: Willy Tarreau <[email protected]> Cc: Al Viro <[email protected]> Cc: Phillip Lougher <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-08-01 | staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem pages | John Stultz | 1 | -0/+2
Amit Pundir and Youling in parallel reported crashes with recent mainline kernels running Android:
F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
F DEBUG : Build fingerprint: 'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key
F DEBUG : Revision: '0'
F DEBUG : ABI: 'arm'
F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<<
F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
... <snip> ...
F DEBUG : backtrace:
F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48)
F DEBUG : #01 pc 0010c513 /system/lib/libart.so (create_mspace_with_base+82)
F DEBUG : #02 pc 0015c601 /system/lib/libart.so (art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int, unsigned int)+40)
F DEBUG : #03 pc 0015c3ed /system/lib/libart.so (art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int, unsigned int, unsigned int, unsigned int, bool)+36)
...
This was bisected back to commit bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives"). create_mspace_with_base() in the trace above utilizes ashmem, and with ashmem, for shared mappings we use shmem_zero_setup(), which sets the vma->vm_ops to &shmem_vm_ops. But for private ashmem mappings nothing sets the vma->vm_ops. Looking at the problematic patch, it seems to add a requirement that one call vma_set_anonymous() on a vma, otherwise the dummy_vm_ops will be used. Using the dummy_vm_ops seems to trigger SIGBUS when traversing unmapped pages. Thus, this patch adds a call to vma_set_anonymous() for ashmem private mappings and seems to avoid the reported problem.
Fixes: bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") Cc: Kirill Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Colin Cross <[email protected]> Cc: Matthew Wilcox <[email protected]> Reported-by: Amit Pundir <[email protected]> Reported-by: Youling 257 <[email protected]> Signed-off-by: John Stultz <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-08-01 | ia64: mark special ia64 memory areas anonymous | Linus Torvalds | 1 | -0/+2
Commit bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") made newly allocated vma's have a dummy vm_ops field so that they wouldn't be mistaken for anonymous mappings, and if you wanted an anonymous vma you had to explicitly say so by calling "vma_set_anonymous()" on it. However, it missed the two special vmas that ia64 processes have: the register backing store and the NaT page. So they wouldn't actually act like anonymous ranges, and page faults on them caused a SIGBUS rather than the creation of a new anon page in them. That obviously will make any ia64 binary very unhappy indeed, and the boot fails early. Fixes: bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") Reported-by: Tony Luck <[email protected]> Cc: Kirill Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: John Stultz <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-08-01 | net: dsa: Do not suspend/resume closed slave_dev | Florian Fainelli | 1 | -0/+6
If a DSA slave network device was previously disabled, there is no need to suspend or resume it. Fixes: 2446254915a7 ("net: dsa: allow switch drivers to implement suspend/resume hooks") Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-01netlink: Fix spectre v1 gadget in netlink_create()Jeremy Cline1-0/+2
'protocol' is a user-controlled value, so sanitize it after the bounds check to avoid using it for speculative out-of-bounds access to arrays indexed by it. This addresses the following accesses detected with the help of smatch: * net/netlink/af_netlink.c:654 __netlink_create() warn: potential spectre issue 'nlk_cb_mutex_keys' [w] * net/netlink/af_netlink.c:654 __netlink_create() warn: potential spectre issue 'nlk_cb_mutex_key_strings' [w] * net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre issue 'nl_table' [w] (local cap) Cc: Josh Poimboeuf <[email protected]> Signed-off-by: Jeremy Cline <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
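The "sanitize after the bounds check" idea can be sketched in userspace C. In the kernel this is normally done with array_index_nospec() from <linux/nospec.h>; the helper below is only an imitation of that masking technique, and every demo_* name and the table size are hypothetical. The mask computation assumes size <= LONG_MAX and an arithmetic right shift of negative values, as GCC and Clang provide.

```c
#include <limits.h>

#define DEMO_MAX_LINKS 32	/* hypothetical table size for this sketch */

static int demo_table[DEMO_MAX_LINKS];

/*
 * Returns all ones when index < size and zero otherwise, without a
 * conditional branch. Assumes size <= LONG_MAX and an arithmetic right
 * shift of negative values (true for GCC and Clang).
 */
unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
	long v = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

/* Returns the table entry for protocol, or -1 if protocol is out of range. */
int demo_lookup(int protocol)
{
	unsigned long idx;

	if (protocol < 0 || protocol >= DEMO_MAX_LINKS)
		return -1;

	/*
	 * Clamp after the bounds check: if the branch above is
	 * mispredicted, the masked index is 0 instead of an
	 * attacker-chosen out-of-bounds value.
	 */
	idx = (unsigned long)protocol;
	idx &= demo_index_mask(idx, DEMO_MAX_LINKS);
	return demo_table[idx];
}
```

The bounds check still decides the architectural result; the mask only limits what a speculatively executed load can reach.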
2018-08-01Documentation: dpaa2: Use correct heading adornmentIoana Ciornei1-0/+1
Add overline heading adornment to document title in order to comply with kernel doc requirements. Fixes: 60b9131 staging: fsl-mc: Convert documentation to rst format Signed-off-by: Ioana Ciornei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-01net: stmmac: Fix WoL for PCI-based setupsJose Abreu1-2/+38
WoL won't work in PCI-based setups because we are not saving the PCI EP state before entering suspend state and not allowing D3 wake. Fix this by using a wrapper around stmmac_{suspend/resume} which correctly sets the PCI EP state. Signed-off-by: Jose Abreu <[email protected]> Cc: David S. Miller <[email protected]> Cc: Joao Pinto <[email protected]> Cc: Giuseppe Cavallaro <[email protected]> Cc: Alexandre Torgue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-01bonding: avoid lockdep confusion in bond_get_stats()Eric Dumazet1-2/+12
syzbot found that the following sequence produces a LOCKDEP splat [1]

ip link add bond10 type bond
ip link add bond11 type bond
ip link set bond11 master bond10

To fix this, we can use the already provided nest_level.

This patch also provides correct nesting for dev->addr_list_lock

[1]
WARNING: possible recursive locking detected
4.18.0-rc6+ #167 Not tainted
--------------------------------------------
syz-executor751/4439 is trying to acquire lock:
(____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
(____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426

but task is already holding lock:
(____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
(____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&bond->stats_lock)->rlock);
  lock(&(&bond->stats_lock)->rlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by syz-executor751/4439:
 #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77
 #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
 #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426
 #2: (____ptrval____) (rcu_read_lock){....}, at: bond_get_stats+0x0/0x560 include/linux/compiler.h:215

stack backtrace:
CPU: 0 PID: 4439 Comm: syz-executor751 Not tainted 4.18.0-rc6+ #167
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
 check_deadlock kernel/locking/lockdep.c:1809 [inline]
 validate_chain kernel/locking/lockdep.c:2405 [inline]
 __lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435
 lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
 spin_lock include/linux/spinlock.h:310 [inline]
 bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426
 dev_get_stats+0x10f/0x470 net/core/dev.c:8316
 bond_get_stats+0x232/0x560 drivers/net/bonding/bond_main.c:3432
 dev_get_stats+0x10f/0x470 net/core/dev.c:8316
 rtnl_fill_stats+0x4d/0xac0 net/core/rtnetlink.c:1169
 rtnl_fill_ifinfo+0x1aa6/0x3fb0 net/core/rtnetlink.c:1611
 rtmsg_ifinfo_build_skb+0xc8/0x190 net/core/rtnetlink.c:3268
 rtmsg_ifinfo_event.part.30+0x45/0xe0 net/core/rtnetlink.c:3300
 rtmsg_ifinfo_event net/core/rtnetlink.c:3297 [inline]
 rtnetlink_event+0x144/0x170 net/core/rtnetlink.c:4716
 notifier_call_chain+0x180/0x390 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735
 call_netdevice_notifiers net/core/dev.c:1753 [inline]
 netdev_features_change net/core/dev.c:1321 [inline]
 netdev_change_features+0xb3/0x110 net/core/dev.c:7759
 bond_compute_features.isra.47+0x585/0xa50 drivers/net/bonding/bond_main.c:1120
 bond_enslave+0x1b25/0x5da0 drivers/net/bonding/bond_main.c:1755
 bond_do_ioctl+0x7cb/0xae0 drivers/net/bonding/bond_main.c:3528
 dev_ifsioc+0x43c/0xb30 net/core/dev_ioctl.c:327
 dev_ioctl+0x1b5/0xcc0 net/core/dev_ioctl.c:493
 sock_do_ioctl+0x1d3/0x3e0 net/socket.c:992
 sock_ioctl+0x30d/0x680 net/socket.c:1093
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:500 [inline]
 do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:684
 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
 __do_sys_ioctl fs/ioctl.c:708 [inline]
 __se_sys_ioctl fs/ioctl.c:706 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440859
Code: e8 2c af 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc51a92878 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440859
RDX: 0000000020000040 RSI: 0000000000008990 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000022d5880 R11: 0000000000000213 R12: 0000000000007390
R13: 0000000000401db0 R14: 0000000000000000 R15: 0000000000000000

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Veaceslav Falico <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2018-08-01perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.hArnaldo Carvalho de Melo1-0/+3
The next example scripts need the definition for the BPF functions, i.e. things like BPF_FUNC_probe_read, and in time will require lots of other definitions found in uapi/linux/bpf.h, so include it from the bpf.h file included from the eBPF scripts built with clang via '-e bpf_script.c' like in this example:

  $ tail -8 tools/perf/examples/bpf/5sec.c
  #include <bpf.h>

  int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec)
  {
  	return sec == 5;
  }

  license(GPL);
  $

That 'bpf.h' include in the 5sec.c eBPF example will come from a set of header files crafted for building eBPF objects, that in an end-user system will come from:

  /usr/lib/perf/include/bpf/bpf.h

And will include <uapi/linux/bpf.h> either from the place where the kernel was built, or from a kernel-devel rpm package like:

  -working-directory /lib/modules/4.17.9-100.fc27.x86_64/build

That is set up by tools/perf/util/llvm-utils.c, and can be overridden by setting the 'kbuild-dir' variable in the "llvm" ~/.perfconfig file, like:

  # cat ~/.perfconfig
  [llvm]
  kbuild-dir = /home/foo/git/build/linux

This usually doesn't need any change, just documenting here my findings while working with this code.

In the future we may want to instead just use what is in /usr/include/linux/bpf.h, that comes from the UAPI provided from the kernel sources. For now, to avoid getting the kernel's non-UAPI "linux/bpf.h" file, that will cause clang to fail and is not what we want anyway (no BPF function definitions, etc), do it explicitly by asking for "uapi/linux/bpf.h".

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-08-01perf tools: Allow overriding MAX_NR_CPUS at compile timeChristophe Leroy1-0/+2
After a kernel update, the perf tool doesn't run anymore on my 32MB RAM powerpc board, but still runs on a 128MB RAM board:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that the .bss section has a huge size of 24Mbytes:

   27 .bss          016baca8  101cebb8  101cebb8  001cd988  2**3

With especially the following objects having quite a big size:

  10205f80 l     O .bss   00140000 runtime_cycles_stats
  10345f80 l     O .bss   00140000 runtime_stalled_cycles_front_stats
  10485f80 l     O .bss   00140000 runtime_stalled_cycles_back_stats
  105c5f80 l     O .bss   00140000 runtime_branches_stats
  10705f80 l     O .bss   00140000 runtime_cacherefs_stats
  10845f80 l     O .bss   00140000 runtime_l1_dcache_stats
  10985f80 l     O .bss   00140000 runtime_l1_icache_stats
  10ac5f80 l     O .bss   00140000 runtime_ll_cache_stats
  10c05f80 l     O .bss   00140000 runtime_itlb_cache_stats
  10d45f80 l     O .bss   00140000 runtime_dtlb_cache_stats
  10e85f80 l     O .bss   00140000 runtime_cycles_in_tx_stats
  10fc5f80 l     O .bss   00140000 runtime_transaction_stats
  11105f80 l     O .bss   00140000 runtime_elision_stats
  11245f80 l     O .bss   00140000 runtime_topdown_total_slots
  11385f80 l     O .bss   00140000 runtime_topdown_slots_retired
  114c5f80 l     O .bss   00140000 runtime_topdown_slots_issued
  11605f80 l     O .bss   00140000 runtime_topdown_fetch_bubbles
  11745f80 l     O .bss   00140000 runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus to 1024"), because many tables are sized with MAX_NR_CPUS.

This patch gives the opportunity to redefine MAX_NR_CPUS via

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
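A compile-time override of this kind is usually an #ifndef guard around the default, so that a -D flag on the command line takes precedence. The sketch below shows that pattern and why a statically sized table scales with the constant. The record layout and context count (struct demo_stats, DEMO_NUM_CTX) are assumptions chosen for illustration, not taken from the patch.

```c
#include <stddef.h>

/* Default kept unless the build passes -DMAX_NR_CPUS=<n>. */
#ifndef MAX_NR_CPUS
#define MAX_NR_CPUS 1024
#endif

/* Hypothetical per-CPU statistics record (40 bytes on common ABIs). */
struct demo_stats {
	double n, mean, m2;
	unsigned long long max, min;
};

#define DEMO_NUM_CTX 32	/* hypothetical number of contexts per CPU */

/* Lives in .bss, so its size is fixed at build time by MAX_NR_CPUS. */
static struct demo_stats demo_runtime_stats[DEMO_NUM_CTX][MAX_NR_CPUS];

/* Bytes of .bss consumed by one such table. */
size_t demo_table_bytes(void)
{
	return sizeof(demo_runtime_stats);
}
```

Under these assumptions the default build needs 40 * 32 * 1024 = 1310720 bytes (0x140000) per table, while a build with -DMAX_NR_CPUS=1 needs 1280 bytes.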
2018-08-01powerpc/64s/radix: Fix missing global invalidations when removing coproFrederic Barrat1-12/+21
With the optimizations for TLB invalidation from commit 0cef77c7798a ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask"), the scope of a TLBI (global vs. local) can now be influenced by the value of the 'copros' counter of the memory context. When calling mm_context_remove_copro(), the 'copros' counter is decremented first before flushing. It may have the unintended side effect of sending local TLBIs when we explicitly need global invalidations in this case. Thus breaking any nMMU user in a bad and unpredictable way. Fix it by flushing first, before updating the 'copros' counter, so that invalidations will be global. Fixes: 0cef77c7798a ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask") Signed-off-by: Frederic Barrat <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Tested-by: Vaibhav Jain <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
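The ordering problem can be modelled with a toy in which the flush scope depends on a counter. All names below are hypothetical, and the real code uses atomics and hardware TLBI instructions that this sketch leaves out.

```c
#include <assert.h>

/* Toy memory context: the flush scope depends on the copro count. */
struct demo_mm {
	int copros;		/* attached coprocessor users */
	int global_flushes;	/* flushes that reached all agents */
	int local_flushes;	/* flushes limited to the local CPU */
};

/* Picks the flush scope the way the commit describes: global if copros > 0. */
void demo_flush(struct demo_mm *mm)
{
	if (mm->copros > 0)
		mm->global_flushes++;
	else
		mm->local_flushes++;
}

/* Old order: the counter drops first, so the flush may be local only. */
void demo_remove_copro_buggy(struct demo_mm *mm)
{
	assert(mm->copros > 0);
	mm->copros--;
	demo_flush(mm);
}

/* Fixed order: flush while the counter still forces a global scope. */
void demo_remove_copro_fixed(struct demo_mm *mm)
{
	assert(mm->copros > 0);
	demo_flush(mm);
	mm->copros--;
}
```

With a single coprocessor attached, the old order issues a local flush and the fixed order issues a global one.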
2018-08-01gpiolib-acpi: make sure we trigger edge events at least once on bootBenjamin Tissoires1-1/+55
On some systems using edge triggered ACPI Event Interrupts, the initial state at boot is not set up by the firmware, instead relying on the edge irq event handler running at least once to set up the initial state.

Two known examples of this are:

1) The Surface 3 has its _LID state controlled by an ACPI operation region triggered by a GPIO event:

 OperationRegion (GPOR, GeneralPurposeIo, Zero, One)
 Field (GPOR, ByteAcc, NoLock, Preserve)
 {
     Connection (
         GpioIo (Shared, PullNone, 0x0000, 0x0000, IoRestrictionNone,
             "\\_SB.GPO0", 0x00, ResourceConsumer, ,
             )
             {   // Pin list
                 0x004C
             }
     ),
     HELD,   1
 }

 Method (_E4C, 0, Serialized)  // _Exx: Edge-Triggered GPE
 {
     If ((HELD == One))
     {
         ^^LID.LIDB = One
     }
     Else
     {
         ^^LID.LIDB = Zero
         Notify (LID, 0x80) // Status Change
     }

     Notify (^^PCI0.SPI1.NTRG, One) // Device Check
 }

Currently, the state of LIDB is wrong until the user actually closes or opens the cover. We need to trigger the GPIO event once to update the internal ACPI state.

Coincidentally, this also enables the Surface 3 integrated HID sensor hub which also requires an ACPI gpio operation region to start initialization.

2) Various Bay Trail based tablets come with an external USB mux and TI T1210B USB phy to enable USB gadget mode. The mux is controlled by a GPIO which is controlled by an edge triggered ACPI Event Interrupt which monitors the micro-USB ID pin.

When the tablet is connected to a PC (or no cable is plugged in), the ID pin is high and the tablet should be in gadget mode. But the GPIO controlling the mux is initialized by the firmware so that the USB data lines are muxed to the host controller.

This means that if the user wants to use gadget mode, the user needs to first plug in a host-cable to force the ID pin low and then unplug it and connect the tablet to a PC, to get the ACPI event handler to run and switch the mux to device mode.

This commit fixes both by running the event-handler once on boot.

Note that the event-handler is run from a late_initcall because the handler AML code may rely on OperationRegions registered by other builtin drivers. This avoids errors like these:

[ 0.133026] ACPI Error: No handler for Region [XSCG] ((____ptrval____)) [GenericSerialBus] (20180531/evregion-132)
[ 0.133036] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20180531/exfldio-265)
[ 0.133046] ACPI Error: Method parse/execution failed \_SB.GPO2._E12, AE_NOT_EXIST (20180531/psparse-516)

Signed-off-by: Benjamin Tissoires <[email protected]>
[hdegoede: Document BYT USB mux reliance on initial trigger]
[hdegoede: Run event handler from a late_initcall, rather than immediately]
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Mika Westerberg <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
2018-08-01Merge branch 'pm-tools'Rafael J. Wysocki2-65/+59
Merge turbostat utility fixes for final 4.18:

 - Fix the -S option on 1-CPU systems.
 - Fix computations using incorrect processor core counts.
 - Fix the x2apic debug message.
 - Fix logical node enumeration to allow for non-sequential physical nodes.
 - Fix reported family on modern AMD processors.
 - Clarify the RAPL column information in the man page.

* pm-tools:
  tools/power turbostat: version 18.07.27
  tools/power turbostat: Read extended processor family from CPUID
  tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes
  tools/power turbostat: fix x2apic debug message output file
  tools/power turbostat: fix bogus summary values
  tools/power turbostat: fix -S on UP systems
  tools/power turbostat: Update turbostat(8) RAPL throttling column description
2018-08-01Merge tag 'drm-misc-fixes-2018-07-27' of ↵Dave Airlie4-4/+21
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes pull request for v4.18-rc7:
- Small fixes to drm_atomic_helper_async_check(). (bbrezillon)
- Fix error handling in drm_legacy_addctx(). (Nicholas)
- Handle register reset on hotplug in adv7511. (seanpaul)

Signed-off-by: Dave Airlie <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-07-31enic: do not call enic_change_mtu in enic_probeGovindarajulu Varadarajan1-1/+1
In commit ab123fe071c9 ("enic: handle mtu change for vf properly") ASSERT_RTNL() is added to _enic_change_mtu() to prevent it from being called without rtnl held. enic_probe() calls enic_change_mtu() without rtnl held. At this point netdev is not registered yet. Remove call to enic_change_mtu and assign the mtu to netdev->mtu. Fixes: ab123fe071c9 ("enic: handle mtu change for vf properly") Signed-off-by: Govindarajulu Varadarajan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-31ipv4: frags: handle possible skb truesize changeEric Dumazet1-0/+5
ip_frag_queue() might call pskb_pull() on one skb that is already in the fragment queue. We need to take care of possible truesize change, or we might have an imbalance of the netns frags memory usage. IPv6 is immune to this bug, because RFC5722, Section 4, amended by Errata ID 3089 states : When reassembling an IPv6 datagram, if one or more its constituent fragments is determined to be an overlapping fragment, the entire datagram (and any constituent fragments) MUST be silently discarded. Fixes: 158f323b9868 ("net: adjust skb->truesize in pskb_expand_head()") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
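One way to keep the accounting balanced is to measure truesize before and after the operation that may change it, then apply the difference to the memory counter. The following is a minimal sketch of that delta technique with hypothetical demo_* types; demo_pull() merely stands in for an operation such as pskb_pull() that can reallocate a buffer.

```c
/* Toy buffer: truesize is the memory charged for it. */
struct demo_skb {
	unsigned int truesize;
};

/* Toy per-namespace fragment memory counter. */
struct demo_frag_mem {
	long used;
};

/* Stand-in for an operation that may grow the buffer by `growth` bytes. */
void demo_pull(struct demo_skb *skb, unsigned int growth)
{
	skb->truesize += growth;
}

/* Trim a queued buffer and charge any truesize change to the counter. */
void demo_trim_queued(struct demo_frag_mem *mem, struct demo_skb *skb,
		      unsigned int growth)
{
	long delta = -(long)skb->truesize;	/* remember the old charge */

	demo_pull(skb, growth);
	delta += (long)skb->truesize;		/* new charge minus old charge */

	if (delta)
		mem->used += delta;
}
```

Without the delta step the counter would still hold the old truesize, and the later uncharge of the larger value would leave the counter too low.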
2018-07-31inet: frag: enforce memory limits earlierEric Dumazet1-3/+3
We currently check current frags memory usage only when a new frag queue is created. This allows attackers to first consume the memory budget (default : 4 MB) creating thousands of frag queues, then sending tiny skbs to exceed high_thresh limit by 2 to 3 orders of magnitude. Note that before commit 648700f76b03 ("inet: frags: use rhashtables for reassembly units"), work queue could be starved under DoS, getting no cpu cycles. After commit 648700f76b03, only the per frag queue timer can eventually remove an incomplete frag queue and its skbs. Fixes: b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Jann Horn <[email protected]> Cc: Florian Westphal <[email protected]> Cc: Peter Oskolkov <[email protected]> Cc: Paolo Abeni <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
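The difference between checking the budget only at queue creation and checking it for every incoming fragment can be sketched as two admission functions. The structure and function names are hypothetical, and the exact placement of the check in the kernel is not shown in this log.

```c
/* Toy per-namespace fragment state. */
struct demo_netns_frags {
	long mem;		/* bytes currently charged */
	long high_thresh;	/* budget; 0 disables reassembly */
};

/*
 * Old behaviour: the budget is only consulted when a new queue has to
 * be created, so fragments for existing queues are always admitted.
 */
int demo_accept_old(const struct demo_netns_frags *nf, int queue_exists)
{
	if (queue_exists)
		return 1;
	return nf->high_thresh && nf->mem <= nf->high_thresh;
}

/*
 * New behaviour: the budget is consulted for every fragment, whether
 * or not its queue already exists.
 */
int demo_accept_new(const struct demo_netns_frags *nf, int queue_exists)
{
	(void)queue_exists;
	return nf->high_thresh && nf->mem <= nf->high_thresh;
}
```

Once usage is over the threshold, the old check keeps admitting tiny fragments for queues that already exist, which is the growth path the commit message describes.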
2018-07-31Merge tag 'mlx5-fixes-2018-07-31' of ↵David S. Miller4-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-07-31

The following series includes four mlx5 fixes.

Please pull and let me know if there's any problem.

For -stable v4.14
net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager

For -stable v4.16
net/mlx5e: Set port trust mode to PCP as default

For -stable v4.17
net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow
====================

Signed-off-by: David S. Miller <[email protected]>
2018-07-31Merge tag 'audit-pr-20180731' of ↵Linus Torvalds1-4/+9
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "A single small audit fix to guard against memory allocation failures when logging information about a kernel module load. It's small, easy to understand, and self-contained; while nothing is zero risk, this should be pretty low" * tag 'audit-pr-20180731' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: fix potential null dereference 'context->module.name'
2018-07-31nohz: Fix local_timer_softirq_pending()Anna-Maria Gleixner1-1/+1
local_timer_softirq_pending() checks whether the timer softirq is pending with: local_softirq_pending() & TIMER_SOFTIRQ. This is wrong because TIMER_SOFTIRQ is the softirq number and not a bitmask. So the test checks for the wrong bit. Use BIT(TIMER_SOFTIRQ) instead. Fixes: 5d62c183f9e9 ("nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()") Signed-off-by: Anna-Maria Gleixner <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Reviewed-by: Daniel Bristot de Oliveira <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
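The number-versus-bitmask confusion is easy to reproduce in isolation. In the sketch below the DEMO_* names are hypothetical; the values 0 and 1 are an assumption that mirrors the usual ordering of the kernel's softirq enum, where the timer softirq is number 1.

```c
/* Converts a bit number into a single-bit mask. */
#define DEMO_BIT(nr) (1UL << (nr))

/* Softirq numbers (not masks); values assumed for this sketch. */
enum {
	DEMO_HI_SOFTIRQ = 0,
	DEMO_TIMER_SOFTIRQ = 1,
};

/* Buggy test: masks with the number 1, which is bit 0 (HI), not bit 1. */
int demo_timer_pending_buggy(unsigned long pending)
{
	return (pending & DEMO_TIMER_SOFTIRQ) != 0;
}

/* Fixed test: masks with bit number DEMO_TIMER_SOFTIRQ. */
int demo_timer_pending_fixed(unsigned long pending)
{
	return (pending & DEMO_BIT(DEMO_TIMER_SOFTIRQ)) != 0;
}
```

With only the timer bit set, the buggy test reports nothing pending; with only the HI bit set, it reports a pending timer softirq that does not exist.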
2018-07-31net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flowFeras Daoud1-0/+4
After introduction of the cited commit, mlx5e_build_nic_params receives the netdevice mtu in order to set the sw_mtu of mlx5e_params. For enhanced IPoIB, the netdevice mtu is not set in this stage, therefore, the initial sw_mtu equals zero. As a result, the hw_mtu of the receive queue will be calculated incorrectly causing traffic issues. To fix this issue, query for port mtu before building the nic params. Fixes: 472a1e44b349 ("net/mlx5e: Save MTU in channels params") Signed-off-by: Feras Daoud <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-07-31net/mlx5e: Fix null pointer access when setting MTU of vport representorAdi Nissim1-1/+2
MTU helper function is used by both conventional mlx5e instances (PF/VF) and the eswitch representors. The representor shouldn't change the nic vport context MTU, the VF is responsible for that. Therefore set_mtu_cb has a null value when changing the representor MTU. Fixes: 250a42b6a764 ("net/mlx5e: Support configurable MTU for vport representors") Signed-off-by: Adi Nissim <[email protected]> Reviewed-by: Yevgeny Kliteynik <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
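A shared helper that takes an optional callback has to tolerate a null pointer from callers that have nothing to program. The sketch below illustrates that guard with hypothetical names and a simplified signature; it is not the driver's actual helper.

```c
#include <stddef.h>

/* Hypothetical callback that programs the hardware MTU. */
typedef void (*demo_set_mtu_cb)(int *hw_mtu, int new_mtu);

/* Example callback for callers that own the hardware context. */
void demo_program_hw_mtu(int *hw_mtu, int new_mtu)
{
	*hw_mtu = new_mtu;
}

/*
 * Shared helper: always updates the software MTU, and only touches the
 * hardware MTU when the caller supplied a callback. Callers that must
 * not change the hardware context (the representor case in the commit
 * message) pass NULL.
 */
int demo_change_mtu(int *sw_mtu, int *hw_mtu, int new_mtu, demo_set_mtu_cb cb)
{
	*sw_mtu = new_mtu;
	if (cb)
		cb(hw_mtu, new_mtu);
	return 0;
}
```

Calling demo_change_mtu() with a NULL callback updates only the software value; without the `if (cb)` test the same call would dereference a null function pointer.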