Age  Commit message  Author  Files  Lines
2018-06-15  cifs: remove smb2_send_recv()  (Ronnie Sahlberg)  3 files, -42/+116
Now that we have the plumbing to pass requests without an rfc1002 header all the way down to the point where we write to the socket, we no longer need the smb2_send_recv() function. Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]>
2018-06-15  cifs: push rfc1002 generation down the stack  (Ronnie Sahlberg)  6 files, -135/+99
Move the generation of the 4 byte length field down the stack and generate it immediately before we start writing the data to the socket. Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Aurelien Aptel <[email protected]> Signed-off-by: Steve French <[email protected]>
2018-06-15  smb3: increase initial number of credits requested to allow write  (Steve French)  1 file, -2/+3
Compared to other clients the Linux smb3 client ramps up credits very slowly, taking more than 128 operations before a maximum size write could be sent (since the number of credits requested is only 2 per small operation, causing the credit limit to grow very slowly). This lack of credits initially would impact large i/o performance, when large i/o is tried early before enough credits are built up. Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2018-06-15  cifs: minor documentation updates  (Steve French)  3 files, -11/+16
Various minor cifs/smb3 documentation updates Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2018-06-15  cifs: add lease tracking to the cached root fid  (Ronnie Sahlberg)  6 files, -20/+58
Use a read lease for the cached root fid so that we can detect when the content of the directory changes (via a break) at which time we close the handle. On next access to the root the handle will be reopened and cached again. Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]>
2018-06-15  smb3: note that smb3.11 posix extensions mount option is experimental  (Steve French)  1 file, -1/+4
Signed-off-by: Steve French <[email protected]>
2018-06-15  afs: Show all of a server's addresses in /proc/fs/afs/servers  (David Howells)  1 file, -2/+8
Show all of a server's addresses in /proc/fs/afs/servers, placing the second and subsequent addresses on padded lines of their own. The current address is marked with a star. Signed-off-by: David Howells <[email protected]>
2018-06-15  afs: Handle CONFIG_PROC_FS=n  (David Howells)  2 files, -2/+10
The AFS filesystem depends at the moment on /proc for configuration and also presents information that way - however, this causes a compilation failure if procfs is disabled. Fix it so that the procfs bits aren't compiled in if procfs is disabled. This means that you can't configure the AFS filesystem directly, but it is still usable provided that an up-to-date keyutils is installed to look up cells by SRV or AFSDB DNS records. Reported-by: Al Viro <[email protected]> Signed-off-by: David Howells <[email protected]>
2018-06-15  proc: Make inline name size calculation automatic  (David Howells)  4 files, -12/+16
Make calculation of the size of the inline name in struct proc_dir_entry automatic, rather than having to manually encode the numbers and failing to allow for lockdep. Require a minimum inline name size of 33+1 to allow for names that look like two hex numbers with a dash between. Reported-by: Al Viro <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-06-15  orangefs: simplify compat ioctl handling  (Al Viro)  1 file, -42/+12
no need to mess with copy_in_user(), etc... Signed-off-by: Al Viro <[email protected]>
2018-06-15  signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()  (Al Viro)  1 file, -25/+25
Signed-off-by: Al Viro <[email protected]>
2018-06-14  hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload  (Haiyang Zhang)  1 file, -13/+13
These fields in struct ndis_ipsecv2_offload and struct ndis_rsc_offload are one byte according to the specs. This patch defines them with the right size. These structs are not in use right now, but will be used soon. Signed-off-by: Haiyang Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  rds: avoid unnecessary cong_update in loop transport  (Santosh Shilimkar)  3 files, -0/+11
For the loop transport, which is a self loopback, remote port congestion updates aren't relevant. In fact, the xmit path already ignores them; the receive path needs to do the same. Reported-by: [email protected] Reviewed-by: Sowmini Varadhan <[email protected]> Signed-off-by: Santosh Shilimkar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-15  Merge branch 'drm-next-4.18' of git://people.freedesktop.org/~agd5f/linux into drm-next  (Dave Airlie)  43 files, -255/+372
Fixes for 4.18. Highlights:
- Fixes for gfxoff on Raven
- Remove an ATPX quirk now that the root cause is fixed
- Runtime PM fixes
- Vega20 register header update
- Wattman fixes
- Misc bug fixes
Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-06-14  Merge branch 'l2tp-fixes'  (David S. Miller)  1 file, -0/+26
Guillaume Nault says:
====================
l2tp: pppol2tp_connect() fixes

This series fixes a few remaining issues with pppol2tp_connect(). It doesn't try to prevent invalid configurations that have no effect on the kernel's reliability; that would be work for a future patch set. Patch 2 is the most important, as it avoids an invalid pointer dereference crashing the kernel. It depends on patch 1 for correctly identifying L2TP session types. Patches 3 and 4 avoid creating stale tunnels and sessions.
====================
Signed-off-by: David S. Miller <[email protected]>
2018-06-14  l2tp: clean up stale tunnel or session in pppol2tp_connect's error path  (Guillaume Nault)  1 file, -0/+10
pppol2tp_connect() may create a tunnel or a session. Remove them in case of error. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  l2tp: prevent pppol2tp_connect() from creating kernel sockets  (Guillaume Nault)  1 file, -0/+9
If 'fd' is negative, l2tp_tunnel_create() creates a tunnel socket using the configuration passed in 'tcfg'. Currently, pppol2tp_connect() sets the relevant fields to zero, tricking l2tp_tunnel_create() into setting up an unusable kernel socket. We can't set 'tcfg' with the required fields because there's no way to get them from the current connect() parameters. So let's restrict kernel sockets creation to the netlink API, which is the original use case. Fixes: 789a4a2c61d8 ("l2tp: Add support for static unmanaged L2TPv3 tunnels") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  l2tp: only accept PPP sessions in pppol2tp_connect()  (Guillaume Nault)  1 file, -0/+6
l2tp_session_priv() returns a struct pppol2tp_session pointer only for PPPoL2TP sessions. In particular, if the session is an L2TP_PWTYPE_ETH pseudo-wire, l2tp_session_priv() returns a pointer to an l2tp_eth_sess structure, which is much smaller than struct pppol2tp_session. This leads to invalid memory dereference when trying to lock ps->sk_lock. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  l2tp: fix pseudo-wire type for sessions created by pppol2tp_connect()  (Guillaume Nault)  1 file, -0/+1
Define cfg.pw_type so that the new session is created with its .pwtype field properly set (L2TP_PWTYPE_PPP). Not setting the pseudo-wire type had several annoying effects: * Invalid value returned in the L2TP_ATTR_PW_TYPE attribute when dumping sessions with the netlink API. * Impossibility to delete the session using the netlink API (because l2tp_nl_cmd_session_delete() gets the deletion callback function from an array indexed by the session's pseudo-wire type). Also, there are several cases where we should check a session's pseudo-wire type. For example, pppol2tp_connect() should refuse to connect a session that is not PPPoL2TP, but that requires the session's .pwtype field to be properly set. Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  eventpoll: switch to ->poll_mask  (Ben Noordhuis)  1 file, -5/+10
Signed-off-by: Ben Noordhuis <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-06-14  aio: only return events requested in poll_mask() for IOCB_CMD_POLL  (Christoph Hellwig)  1 file, -2/+2
The ->poll_mask() operation has a mask of events that the caller is interested in, but not all implementations might take it into account. Mask the return value to only the requested events, similar to what the poll and epoll code does. Reported-by: Avi Kivity <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
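A tiny runnable userspace illustration of the masking idea described above (the helper name clamp_to_requested() is invented for this example; the kernel change itself simply applies the same "& events" mask to whatever ->poll_mask() reports):

    #include <poll.h>
    #include <stdio.h>

    /* Hypothetical helper: however many events the backend reports,
     * only hand back the ones the caller actually asked about. */
    static short clamp_to_requested(short reported, short requested)
    {
            return reported & requested;
    }

    int main(void)
    {
            short reported  = POLLIN | POLLOUT; /* backend: readable and writable */
            short requested = POLLIN;           /* caller only asked about reads  */

            /* prints 0x1 (POLLIN) rather than 0x5 (POLLIN|POLLOUT) */
            printf("returned events: %#x\n",
                   (unsigned)clamp_to_requested(reported, requested));
            return 0;
    }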
2018-06-14  Merge branch 'emaclite-fixes'  (David S. Miller)  1 file, -8/+4
Radhey Shyam Pandey says:
====================
emaclite bug fixes and code cleanup

This patch series fixes bugs in the emaclite remove and mdio_setup routines and does some minor code cleanup.
====================
Signed-off-by: David S. Miller <[email protected]>
2018-06-14  net: emaclite: Remove xemaclite_mdio_setup return check  (Radhey Shyam Pandey)  1 file, -3/+1
Errors are already reported in xemaclite_mdio_setup, so avoid reporting them again. Signed-off-by: Radhey Shyam Pandey <[email protected]> Signed-off-by: Michal Simek <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  net: emaclite: Remove unused 'has_mdio' flag.  (Radhey Shyam Pandey)  1 file, -2/+0
Remove unused 'has_mdio' flag. Signed-off-by: Radhey Shyam Pandey <[email protected]> Signed-off-by: Michal Simek <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  net: emaclite: Fix MDIO bus unregister bug  (Radhey Shyam Pandey)  1 file, -1/+1
Since the 'has_mdio' flag is not used, the sequence insmod -> rmmod -> insmod leads to a failure, as the MDIO bus is never unregistered in .remove(). Fix it by checking the MII bus pointer instead. Signed-off-by: Radhey Shyam Pandey <[email protected]> Signed-off-by: Michal Simek <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  net: emaclite: Fix position of lp->mii_bus assignment  (Radhey Shyam Pandey)  1 file, -2/+2
To ensure MDIO bus is not double freed in remove() path assign lp->mii_bus after MDIO bus registration. Signed-off-by: Radhey Shyam Pandey <[email protected]> Signed-off-by: Michal Simek <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-14  eventfd: only return events requested in poll_mask()  (Avi Kivity)  1 file, -2/+2
The ->poll_mask() operation has a mask of events that the caller is interested in, but we're returning all events regardless. Change to return only the events the caller is interested in. This fixes aio IO_CMD_POLL returning immediately when called with POLLIN on an eventfd, since an eventfd is almost always ready for a write. Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-06-14  aio: mark __aio_sigset::sigmask const  (Avi Kivity)  1 file, -1/+1
io_pgetevents() will not change the signal mask. Mark it const to make it clear and to reduce the need for casts in user code. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-06-14  tcp: verify the checksum of the first data segment in a new connection  (Frank van der Linden)  2 files, -0/+8
commit 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") introduced an optimization for the handling of child sockets created for a new TCP connection. But this optimization passes any data associated with the last ACK of the connection handshake up the stack without verifying its checksum, because it calls tcp_child_process(), which in turn calls tcp_rcv_state_process() directly. These lower-level processing functions do not do any checksum verification. Insert a tcp_checksum_complete call in the TCP_NEW_SYN_RECEIVE path to fix this. Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Frank van der Linden <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Tested-by: Balbir Singh <[email protected]> Reviewed-by: Balbir Singh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
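A hedged sketch of the kind of check described above, in the style of the TCP_NEW_SYN_RECV handling in tcp_v4_rcv(); this is illustrative, not the exact upstream diff:

    /* Sketch only: verify the checksum of any data riding on the final
     * ACK before handing the skb to tcp_child_process(), which does no
     * checksum verification of its own. */
    if (sk->sk_state == TCP_NEW_SYN_RECV) {
            struct request_sock *req = inet_reqsk(sk);

            if (tcp_checksum_complete(skb)) {
                    reqsk_put(req);
                    goto csum_error;    /* count it and drop the packet */
            }
            /* ... existing TCP_NEW_SYN_RECV handling continues ... */
    }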
2018-06-14  net: qcom/emac: Add missing of_node_put()  (YueHaibing)  1 file, -0/+1
Add missing of_node_put() call for device node returned by of_parse_phandle(). Signed-off-by: YueHaibing <[email protected]> Acked-by: Timur Tabi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
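The general pattern being fixed, as a kernel-style sketch (the "phy-handle" property name is only an example, not necessarily what the emac driver uses): of_parse_phandle() returns a node with its reference count raised, so the caller must drop that reference.

    struct device_node *np;

    np = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0);
    if (np) {
            /* ... use the node ... */
            of_node_put(np);    /* balance the reference taken by of_parse_phandle() */
    }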
2018-06-15  Merge branch 'akpm' (patches from Andrew)  (Linus Torvalds)  56 files, -212/+397
Merge more updates from Andrew Morton:
- MM remainders
- various misc things
- kcov updates

* emailed patches from Andrew Morton <[email protected]>: (27 commits)
  lib/test_printf.c: call wait_for_random_bytes() before plain %p tests
  hexagon: drop the unused variable zero_page_mask
  hexagon: fix printk format warning in setup.c
  mm: fix oom_kill event handling
  treewide: use PHYS_ADDR_MAX to avoid type casting ULLONG_MAX
  mm: use octal not symbolic permissions
  ipc: use new return type vm_fault_t
  sysvipc/sem: mitigate semnum index against spectre v1
  fault-injection: reorder config entries
  arm: port KCOV to arm
  sched/core / kcov: avoid kcov_area during task switch
  kcov: prefault the kcov_area
  kcov: ensure irq code sees a valid area
  kernel/relay.c: change return type to vm_fault_t
  exofs: avoid VLA in structures
  coredump: fix spam with zero VMA process
  fat: use fat_fs_error() instead of BUG_ON() in __fat_get_block()
  proc: skip branch in /proc/*/* lookup
  mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns
  mm/memblock: add missing include <linux/bootmem.h>
  ...
2018-06-15  lib/test_printf.c: call wait_for_random_bytes() before plain %p tests  (Thierry Escande)  1 file, -0/+7
If the test_printf module is loaded before the crng is initialized, the plain 'p' tests will fail because the printed address will not be hashed and the buffer will contain '(ptrval)' instead. This patch adds a call to wait_for_random_bytes() before plain 'p' tests to make sure the crng is initialized. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thierry Escande <[email protected]> Acked-by: Tobin C. Harding <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: David Miller <[email protected]> Cc: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
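A minimal sketch of the idea in module-init style (the function name plain_pointer_selftest_init is hypothetical, not the actual test_printf code): block until the crng is ready before exercising the hashed %p format.

    #include <linux/init.h>
    #include <linux/random.h>

    static int __init plain_pointer_selftest_init(void)
    {
            int err;

            /* Sleep until the crng is initialized so %p prints a hashed
             * address instead of "(ptrval)". */
            err = wait_for_random_bytes();
            if (err)
                    return err;

            /* ... run the plain %p test cases here ... */
            return 0;
    }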
2018-06-15  hexagon: drop the unused variable zero_page_mask  (Anshuman Khandual)  2 files, -4/+0
The Hexagon arch does not appear to subscribe to the __HAVE_COLOR_ZERO_PAGE framework, so the zero_page_mask variable is not needed. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Anshuman Khandual <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  hexagon: fix printk format warning in setup.c  (Randy Dunlap)  1 file, -1/+1
Fix printk format warning in hexagon/kernel/setup.c:

  ../arch/hexagon/kernel/setup.c: In function 'setup_arch':
  ../arch/hexagon/kernel/setup.c:69:2: warning: format '%x' expects argument of type 'unsigned int', but argument 2 has type 'long unsigned int' [-Wformat]

where:

  extern unsigned long __phys_offset;
  #define PHYS_OFFSET __phys_offset

Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  mm: fix oom_kill event handling  (Roman Gushchin)  3 files, -7/+27
Commit e27be240df53 ("mm: memcg: make sure memory.events is uptodate when waking pollers") converted most of memcg event counters to per-memcg atomics, which made them less confusing for a user. The "oom_kill" counter remained untouched, so now it behaves differently than other counters (including "oom"). This adds nothing but confusion. Let's fix this by adding the MEMCG_OOM_KILL event, and follow the MEMCG_OOM approach. This also removes a hack from count_memcg_event_mm(), introduced earlier specially for the OOM_KILL counter. [[email protected]: fix for droppage of memcg-replace-mm-owner-with-mm-memcg.patch] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Acked-by: Konstantin Khlebnikov <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  treewide: use PHYS_ADDR_MAX to avoid type casting ULLONG_MAX  (Stefan Agner)  9 files, -13/+13
With PHYS_ADDR_MAX there is now a type safe variant for all bits set. Make use of it. Patch created using a semantic patch as follows:

  // <smpl>
  @@
  typedef phys_addr_t;
  @@
  -(phys_addr_t)ULLONG_MAX
  +PHYS_ADDR_MAX
  // </smpl>

Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stefan Agner <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Catalin Marinas <[email protected]> [arm64] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  mm: use octal not symbolic permissions  (Joe Perches)  15 files, -66/+63
mm/*.c files use symbolic and octal styles for permissions. Using octal and not symbolic permissions is preferred by many as more readable. https://lkml.org/lkml/2016/8/2/1945 Prefer the direct use of octal for permissions. Done using

  $ scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace mm/*.c

and some typing.

Before:
  $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
  44
After:
  $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
  86

Miscellanea: o Whitespace neatening around these conversions. Link: http://lkml.kernel.org/r/2e032ef111eebcd4c5952bae86763b541d373469.1522102887.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Acked-by: David Rientjes <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
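A quick runnable check of the equivalence that makes the octal form readable. This uses the userspace <sys/stat.h> macros; the kernel's S_IRUGO-style macros expand to the same bit values.

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
            mode_t symbolic = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;

            printf("%o\n", (unsigned)symbolic);   /* prints 644 */
            printf("%d\n", symbolic == 0644);     /* prints 1   */
            return 0;
    }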
2018-06-15  ipc: use new return type vm_fault_t  (Souptick Joarder)  1 file, -1/+1
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Commit 1c8f422059ae ("mm: change return type to vm_fault_t") Link: http://lkml.kernel.org/r/20180425043413.GA21467@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  sysvipc/sem: mitigate semnum index against spectre v1  (Davidlohr Bueso)  1 file, -4/+14
Both smatch and coverity are reporting potential issues with spectre variant 1 with the 'semnum' index within the sma->sems array, ie:

  ipc/sem.c:388 sem_lock() warn: potential spectre issue 'sma->sems'
  ipc/sem.c:641 perform_atomic_semop_slow() warn: potential spectre issue 'sma->sems'
  ipc/sem.c:721 perform_atomic_semop() warn: potential spectre issue 'sma->sems'

Avoid any possible speculation by using array_index_nospec() thus ensuring the semnum value is bounded to [0, sma->sem_nsems). With the exception of sem_lock() all of these are slowpaths. Link: http://lkml.kernel.org/r/20180423171131.njs4rfm2yzyeg6do@linux-n805 Signed-off-by: Davidlohr Bueso <[email protected]> Reported-by: Dan Carpenter <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Gustavo A. R. Silva" <[email protected]> Cc: Manfred Spraul <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
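A kernel-style fragment sketching the mitigation pattern (not the exact sem.c diff; the variable names are taken from the warnings above): clamp the index with array_index_nospec() after the bounds check, so a mispredicted branch cannot speculatively index out of range.

    #include <linux/nospec.h>

    /* sketch: 'semnum' comes from userspace, sma->sem_nsems is the bound */
    if (semnum < 0 || semnum >= sma->sem_nsems)
            return -EINVAL;

    semnum = array_index_nospec(semnum, sma->sem_nsems);
    curr = &sma->sems[semnum];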
2018-06-15  fault-injection: reorder config entries  (Mikulas Patocka)  1 file, -18/+18
Reorder Kconfig entries, so that menuconfig displays proper indentation. Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1804251601160.30569@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Mikulas Patocka <[email protected]> Acked-by: Randy Dunlap <[email protected]> Tested-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  arm: port KCOV to arm  (Dmitry Vyukov)  4 files, -1/+16
KCOV is code coverage collection facility used, in particular, by syzkaller system call fuzzer. There is some interest in using syzkaller on arm devices. So port KCOV to arm. On implementation level this merely declares that KCOV is supported and disables instrumentation of 3 special cases. Reasons for disabling are commented in code. Tested with qemu-system-arm/vexpress-a15. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dmitry Vyukov <[email protected]> Acked-by: Mark Rutland <[email protected]> Cc: Russell King <[email protected]> Cc: Abbott Liu <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Koguchi Takuo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  sched/core / kcov: avoid kcov_area during task switch  (Mark Rutland)  4 files, -2/+20
During a context switch, we first switch_mm() to the next task's mm, then switch_to() that new task. This means that vmalloc'd regions which had previously been faulted in can transiently disappear in the context of the prev task. Functions instrumented by KCOV may try to access a vmalloc'd kcov_area during this window, and as the fault handling code is instrumented, this results in a recursive fault. We must avoid accessing any kcov_area during this window. We can do so with a new flag in kcov_mode, set prior to switching the mm, and cleared once the new task is live. Since task_struct::kcov_mode isn't always a specific enum kcov_mode value, this is made an unsigned int. The manipulation is hidden behind kcov_{prepare,finish}_switch() helpers, which are empty for !CONFIG_KCOV kernels. The code uses macros because I can't use static inline functions without a circular include dependency between <linux/sched.h> and <linux/kcov.h>, since the definition of task_struct uses things defined in <linux/kcov.h> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mark Rutland <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
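A sketch of what such helpers can look like. The helper names come from the text above; the KCOV_IN_CTXSW flag name is an assumption for the example, not necessarily the flag the patch introduces.

    #ifdef CONFIG_KCOV
    #define kcov_prepare_switch(t)                          \
    do {                                                    \
            (t)->kcov_mode |= KCOV_IN_CTXSW;                \
    } while (0)

    #define kcov_finish_switch(t)                           \
    do {                                                    \
            (t)->kcov_mode &= ~KCOV_IN_CTXSW;               \
    } while (0)
    #else
    #define kcov_prepare_switch(t) do { } while (0)
    #define kcov_finish_switch(t) do { } while (0)
    #endif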
2018-06-15  kcov: prefault the kcov_area  (Mark Rutland)  1 file, -0/+16
On many architectures the vmalloc area is lazily faulted in upon first access. This is problematic for KCOV, as __sanitizer_cov_trace_pc accesses the (vmalloc'd) kcov_area, and fault handling code may be instrumented. If an access to kcov_area faults, this will result in mutual recursion through the fault handling code and __sanitizer_cov_trace_pc(), eventually leading to stack corruption and/or overflow. We can avoid this by faulting in the kcov_area before __sanitizer_cov_trace_pc() is permitted to access it. Once it has been faulted in, it will remain present in the process page tables, and will not fault again. [[email protected]: code cleanup] [[email protected]: add comment explaining kcov_fault_in_area()] [[email protected]: fancier code comment from Mark] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mark Rutland <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
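A sketch of the prefaulting described above. The function name kcov_fault_in_area() is taken from the changelog notes; the struct field names here are assumptions for illustration.

    /* Touch every page of the vmalloc'd area once, so that later accesses
     * from __sanitizer_cov_trace_pc() can never fault. */
    static void kcov_fault_in_area(struct kcov *kcov)
    {
            unsigned long stride = PAGE_SIZE / sizeof(unsigned long);
            unsigned long *area = kcov->area;
            unsigned long offset;

            for (offset = 0; offset < kcov->size; offset += stride)
                    READ_ONCE(area[offset]);
    }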
2018-06-15  kcov: ensure irq code sees a valid area  (Mark Rutland)  1 file, -1/+2
Patch series "kcov: fix unexpected faults". These patches fix a few issues where KCOV code could trigger recursive faults, discovered while debugging a patch enabling KCOV for arch/arm:
* On CONFIG_PREEMPT kernels, there's a small race window where __sanitizer_cov_trace_pc() can see a bogus kcov_area.
* Lazy faulting of the vmalloc area can cause mutual recursion between fault handling code and __sanitizer_cov_trace_pc().
* During the context switch, switching the mm can cause the kcov_area to be transiently unmapped.
These are prerequisites for enabling KCOV on arm, but the issues themselves are generic -- we just happen to avoid them by chance rather than design on x86-64 and arm64.

This patch (of 3): For kernels built with CONFIG_PREEMPT, some C code may execute before or after the interrupt handler, while the hardirq count is zero. In these cases, in_task() can return true. A task can be interrupted in the middle of a KCOV_DISABLE ioctl while it resets the task's kcov data via kcov_task_init(). Instrumented code executed during this period will call __sanitizer_cov_trace_pc(), and as in_task() returns true, will inspect t->kcov_mode before trying to write to t->kcov_area. In kcov_task_init() we update t->kcov_{mode,area,size} with plain stores, which may be re-ordered, torn, etc. Thus __sanitizer_cov_trace_pc() may see bogus values for any of these fields, and may attempt to write to memory which is not mapped. Let's avoid this by using WRITE_ONCE() to set t->kcov_mode, with a barrier() to ensure this is ordered before we clear t->kcov_{area,size}. This ensures that any code executed while kcov_task_init() is preempted will either see valid values for t->kcov_{area,size}, or will see that t->kcov_mode is KCOV_MODE_DISABLED, and bail out without touching t->kcov_area. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mark Rutland <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  kernel/relay.c: change return type to vm_fault_t  (Souptick Joarder)  1 file, -1/+1
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. commit 1c8f422059ae ("mm: change return type to vm_fault_t") Link: http://lkml.kernel.org/r/20180510140335.GA25363@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Eric Biggers <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  exofs: avoid VLA in structures  (Kees Cook)  3 files, -67/+115
On the quest to remove all VLAs from the kernel[1] this adjusts several cases where allocation is made after an array of structures that points back into the allocation. The allocations are changed to perform explicit calculations instead of using a Variable Length Array in a structure. Additionally, this lets Clang compile this code now, since Clang does not support VLAIS[2]. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com [2] https://lkml.kernel.org/r/CA+55aFy6h1c3_rP_bXFedsTXzwW+9Q9MfJaW7GUmMBrAp-fJ9A@mail.gmail.com [[email protected]: v2] Link: http://lkml.kernel.org/r/20180418163546.GA45794@beast Link: http://lkml.kernel.org/r/20180327203904.GA1151@beast Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Cc: Boaz Harrosh <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  coredump: fix spam with zero VMA process  (Alexey Dobriyan)  1 file, -8/+9
Nobody ever tried to self destruct by unmapping whole address space at once:

  munmap((void *)0, (1ULL << 47) - 4096);

Doing this produces 2 warnings for zero-length vmalloc allocations:

  a.out[1353]: segfault at 7f80bcc4b757 ip 00007f80bcc4b757 sp 00007fff683939b8 error 14
  a.out: vmalloc: allocation failure: 0 bytes, mode:0xcc0(GFP_KERNEL), nodemask=(null)
  ...
  a.out: vmalloc: allocation failure: 0 bytes, mode:0xcc0(GFP_KERNEL), nodemask=(null)
  ...

Fix is to switch to kvmalloc(). Steps to reproduce:

  // vsyscall=none
  #include <sys/mman.h>
  #include <sys/resource.h>

  int main(void)
  {
          setrlimit(RLIMIT_CORE, &(struct rlimit){RLIM_INFINITY, RLIM_INFINITY});
          munmap((void *)0, (1ULL << 47) - 4096);
          return 0;
  }

Link: http://lkml.kernel.org/r/20180410180353.GA2515@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  fat: use fat_fs_error() instead of BUG_ON() in __fat_get_block()  (OGAWA Hirofumi)  1 file, -1/+7
If the file size and the FAT cluster chain do not match (a corrupted image), we can hit BUG_ON(!phys) in __fat_get_block(). So, use fat_fs_error() instead. [[email protected]: fix printk warning] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: OGAWA Hirofumi <[email protected]> Reported-by: Anatoly Trosinenko <[email protected]> Tested-by: Anatoly Trosinenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-15  proc: skip branch in /proc/*/* lookup  (Alexey Dobriyan)  1 file, -6/+3
The code is structured like this:

  for ( ... p < last; p++) {
          if (memcmp == 0)
                  break;
  }
  if (p >= last)
          ERROR
  OK

gcc doesn't see that, if the lookup succeeds, the post-loop branch will never be taken, and so it doesn't skip it. [[email protected]: proc_pident_instantiate() no longer takes an inode*] Link: http://lkml.kernel.org/r/20180423213954.GD9043@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
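A small runnable illustration of the restructuring (a generic example, not the proc code; the names[] table is invented): handle the hit inside the loop so there is no post-loop re-test for the compiler to emit.

    #include <stdio.h>
    #include <string.h>

    static const char *const names[] = { "status", "stat", "maps" };

    static int lookup(const char *name)
    {
            size_t i;

            for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
                    if (strcmp(names[i], name) == 0)
                            return (int)i;  /* act on the match right here */
            }
            return -1;                      /* only reached on a miss */
    }

    int main(void)
    {
            printf("%d %d\n", lookup("stat"), lookup("nope"));  /* prints "1 -1" */
            return 0;
    }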
2018-06-15  mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns  (Mel Gorman)  1 file, -4/+0
Commit 5d1904204c99 ("mremap: fix race between mremap() and page cleanning") fixed races between mremap and other operations for both file-backed and anonymous mappings. The file-backed case was the most critical, as it allowed the possibility that data could be changed on a physical page after page_mkclean returned, which could trigger data loss or data integrity issues. A customer reported that the cost of the TLB flushes for anonymous mappings was excessive, resulting in a 30-50% drop in performance overall since this commit on a microbenchmark. Unfortunately I neither have access to the test-case nor can I describe what it does other than saying that mremap operations dominate heavily. This patch removes the LATENCY_LIMIT to handle TLB flushes on a PMD boundary instead of every 64 pages, reducing the number of TLB shootdowns by a factor of 8 in the ideal case. LATENCY_LIMIT was almost certainly used originally to limit the PTL hold times, but the latency savings are likely offset by the cost of IPIs in many cases. This patch is not reported to completely restore performance but gets it within an acceptable percentage. The given metric here is simply described as "higher is better".

Baseline that was known good:
  002: Metric: 91.05
  004: Metric: 109.45
  008: Metric: 73.08
  016: Metric: 58.14
  032: Metric: 61.09
  064: Metric: 57.76
  128: Metric: 55.43

Current:
  001: Metric: 54.98
  002: Metric: 56.56
  004: Metric: 41.22
  008: Metric: 35.96
  016: Metric: 36.45
  032: Metric: 35.71
  064: Metric: 35.73
  128: Metric: 34.96

With patch:
  001: Metric: 61.43
  002: Metric: 81.64
  004: Metric: 67.92
  008: Metric: 51.67
  016: Metric: 50.47
  032: Metric: 52.29
  064: Metric: 50.01
  128: Metric: 49.04

So for low thread counts performance is not fully restored, but for larger numbers of threads it is closer to the "known good" baseline. Using a different mremap-intensive workload that is not representative of the real workload, there is little difference observed outside of noise in the headline metrics. However, the TLB shootdowns are reduced by 11% on average and, at the peak, TLB shootdowns were reduced by 21%. Interrupts were sampled every second while the workload ran to get those figures. It's known that the figures will vary as the non-representative load is non-deterministic. An alternative patch was posted that should have significantly reduced the TLB flushes, but unfortunately it does not perform as well as this version on the customer test case. If revisited, the two patches can stack on top of each other. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mel Gorman <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Aaron Lu <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>