2019-04-30Merge tag 'wireless-drivers-for-davem-2019-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers David S. Miller17-35/+88
Kalle Valo says: ==================== wireless-drivers fixes for 5.1 Third set of fixes for 5.1. iwlwifi * fix an oops when creating debugfs entries * fix bug when trying to capture debugging info while in rfkill * prevent potential uninitialized memory dumps into debugging logs * fix some initialization parameters for AX210 devices * fix an oops with non-MSIX devices * fix an oops when we receive a packet with bogus lengths * fix a bug that prevented 5350 devices from working * fix a small merge damage from the previous series mwifiex * fix regression with resume on SDIO ath10k * fix locking problem with crashdump * fix warnings during suspend and resume Also note that this pull conflicts with net-next. And I want to emphasise that it's really net-next, so when you pull this to the net tree it should go without conflicts. Stephen reported the conflict here: https://lkml.kernel.org/r/[email protected] In iwlwifi, oddly, commit 154d4899e411 adds the IS_ERR_OR_NULL() check in wireless-drivers but commit c9af7528c331 removes the whole check in wireless-drivers-next. The fix is easy: just drop the whole check for mvmvif->dbgfs_dir in iwlwifi/mvm/debugfs-vif.c, it's unneeded anyway. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-30Merge tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Linus Torvalds10-45/+46
Pull USB fixes from Greg KH: "Here are some small USB fixes for a bunch of warnings/errors that syzbot has been finding with its new-found ability to stress-test the USB layer. All of these are tiny, but fix real issues, and are marked for stable as well. All of these have had lots of testing in linux-next as well" * tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: w1 ds2490: Fix bug caused by improper use of altsetting array USB: yurex: Fix protection fault after device removal usb: usbip: fix isoc packet num validation in get_pipe USB: core: Fix bug caused by duplicate interface PM usage counter USB: dummy-hcd: Fix failure to give back unlinked URBs USB: core: Fix unterminated string returned by usb_string()
2019-04-30io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUPStefan Bühler1-2/+0
There is no operation to order with afterwards, and removing the flag is not critical in any way. There will always be a "race condition" where the application will trigger IORING_ENTER_SQ_WAKEUP when it isn't actually needed. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30io_uring: remove unnecessary barrier after incrementing dropped counterStefan Bühler1-2/+0
smp_store_release in io_commit_sqring already orders the store to dropped before the update to SQ head. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30io_uring: remove unnecessary barrier before reading SQ tailStefan Bühler1-2/+0
There is no operation before to order with. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30io_uring: remove unnecessary barrier after updating SQ headStefan Bühler1-6/+0
There is no operation afterwards to order with. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30io_uring: remove unnecessary barrier before reading cq headStefan Bühler1-2/+5
The memory operations before reading cq head are unrelated and we don't care about their order. Document that the control dependency in combination with READ_ONCE and WRITE_ONCE forms a barrier we need. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30io_uring: remove unnecessary barrier before wq_has_sleeperStefan Bühler1-7/+4
wq_has_sleeper has a full barrier internally. The smp_rmb barrier in io_uring_poll synchronizes with it. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30io_uring: fix notes on barriersStefan Bühler1-9/+110
The application reading the CQ ring needs a barrier to pair with the smp_store_release in io_commit_cqring, not the barrier after it. Also a write barrier *after* writing something (but not *before* writing anything interesting) doesn't order anything, so an smp_wmb() after writing SQ tail is not needed. Additionally consider reading SQ head and writing CQ tail in the notes. Also add some clarifications how the various other fields in the ring buffers are used. Signed-off-by: Stefan Bühler <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
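The userspace side of the pairing described here can be illustrated with a small, self-contained sketch using C11 atomics as stand-ins for the kernel's smp_load_acquire/smp_store_release; the structure and field names below are simplified, not the real io_uring ABI:

```c
#include <stdatomic.h>
#include <stdint.h>

struct cqe { uint64_t user_data; int32_t res; uint32_t flags; };   /* simplified CQE */

struct cq_ring {
	_Atomic uint32_t head;   /* written by the application */
	_Atomic uint32_t tail;   /* written by the kernel with a release store */
	uint32_t ring_mask;
	struct cqe *cqes;
};

/* Consume one completion; returns 1 if *out was filled, 0 if the ring is empty. */
static int cq_ring_pop(struct cq_ring *cq, struct cqe *out)
{
	uint32_t head = atomic_load_explicit(&cq->head, memory_order_relaxed);
	/* Acquire pairs with the kernel's release store of the tail, so the CQE
	 * contents written before the tail update are guaranteed to be visible. */
	uint32_t tail = atomic_load_explicit(&cq->tail, memory_order_acquire);

	if (head == tail)
		return 0;

	*out = cq->cqes[head & cq->ring_mask];

	/* Release: the kernel may only treat the entry as consumed after we
	 * are done reading it. */
	atomic_store_explicit(&cq->head, head + 1, memory_order_release);
	return 1;
}
```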
2019-04-30io_uring: fix handling SQEs requesting NOWAITStefan Bühler1-7/+10
Not all request types set REQ_F_FORCE_NONBLOCK when they needed async punting; reverse logic instead and set REQ_F_NOWAIT if request mustn't be punted. Signed-off-by: Stefan Bühler <[email protected]> Merged with my previous patch for this. Signed-off-by: Jens Axboe <[email protected]>
2019-04-30Merge tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Linus Torvalds3-2/+1
Pull selinux fix from Paul Moore: "One small patch for the stable folks to fix a problem when building against the latest glibc. I'll be honest and say that I'm not really thrilled with the idea of sending this up right now, but Greg is a little annoyed so here I figured I would at least send this" * tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: use kernel linux/socket.h for genheaders and mdp
2019-04-30l2tp: fix possible use-after-freeEric Dumazet1-4/+4
Before taking a refcount on a rcu protected structure, we need to make sure the refcount is not zero. syzbot reported : refcount_t: increment on 0; use-after-free. WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline] WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked+0x61/0x70 lib/refcount.c:154 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 23533 Comm: syz-executor.2 Not tainted 5.1.0-rc7+ #93 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x65c kernel/panic.c:214 __warn.cold+0x20/0x45 kernel/panic.c:571 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:179 [inline] fixup_bug arch/x86/kernel/traps.c:174 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline] RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154 Code: 1d 98 2b 2a 06 31 ff 89 de e8 db 2c 40 fe 84 db 75 dd e8 92 2b 40 fe 48 c7 c7 20 7a a1 87 c6 05 78 2b 2a 06 01 e8 7d d9 12 fe <0f> 0b eb c1 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 RSP: 0018:ffff888069f0fba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 000000000000f353 RSI: ffffffff815afcb6 RDI: ffffed100d3e1f67 RBP: ffff888069f0fbb8 R08: ffff88809b1845c0 R09: ffffed1015d23ef1 R10: ffffed1015d23ef0 R11: ffff8880ae91f787 R12: ffff8880a8f26968 R13: 0000000000000004 R14: dffffc0000000000 R15: ffff8880a49a6440 l2tp_tunnel_inc_refcount net/l2tp/l2tp_core.h:240 [inline] l2tp_tunnel_get+0x250/0x580 net/l2tp/l2tp_core.c:173 pppol2tp_connect+0xc00/0x1c70 net/l2tp/l2tp_ppp.c:702 __sys_connect+0x266/0x330 net/socket.c:1808 __do_sys_connect net/socket.c:1819 [inline] __se_sys_connect net/socket.c:1816 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1816 Fixes: 54652eb12c1b ("l2tp: hold tunnel while looking up sessions in l2tp_netlink") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Cc: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
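The underlying rule is the usual RCU lookup pattern: under rcu_read_lock(), a reference may only be taken with an increment-unless-zero primitive (refcount_inc_not_zero() in the kernel), because a zero count means the object is already on its way to being freed. A minimal userspace analogue of that primitive, with illustrative names:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of refcount_inc_not_zero(): only take a reference if
 * the object is not already being torn down. */
static bool ref_get_unless_zero(_Atomic unsigned int *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	while (old != 0) {
		if (atomic_compare_exchange_weak_explicit(refs, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;   /* reference taken */
	}
	return false;                  /* object is dying; caller must fail the lookup */
}
```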
2019-04-30clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable countersMarc Zyngier3-8/+70
Instead of always going via arch_counter_get_cntvct_stable to access the counter workaround, let's have arch_timer_read_counter point to the right method. For that, we need to track whether any CPU in the system has a workaround for the counter. This is done by having an atomic variable tracking this. Acked-by: Mark Rutland <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
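Schematically, the approach is an indirect call that is switched over once any CPU is found to need the workaround. The sketch below uses illustrative names and C11 atomics rather than the driver's actual code:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t read_counter_plain(void)      { return 0; /* e.g. a raw CNTVCT read */ }
static uint64_t read_counter_workaround(void) { return 0; /* erratum-safe variant   */ }

/* All callers go through this pointer; it defaults to the plain read. */
uint64_t (*arch_timer_read_counter)(void) = read_counter_plain;

/* Set once any CPU in the system is found to need the counter workaround. */
static atomic_bool counter_has_workaround;

void enable_counter_workaround(void)
{
	atomic_store(&counter_has_workaround, true);
	arch_timer_read_counter = read_counter_workaround;
}
```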
2019-04-30clocksource/arm_arch_timer: Remove use of workaround static keyMarc Zyngier2-22/+7
The use of a static key in a hotplug path has proved to be a real nightmare, and makes it impossible to have a scream-free lockdep kernel. Let's remove the static key altogether, and focus on something saner. Acked-by: Mark Rutland <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30clocksource/arm_arch_timer: Drop use of static key in arch_timer_reg_read_stableMarc Zyngier1-14/+28
Let's start with the removal of the arch_timer_read_ool_enabled static key in arch_timer_reg_read_stable. It is not a fast path, and we can simplify things a bit. Acked-by: Mark Rutland <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30clocksource/arm_arch_timer: Directly assign set_next_event workaroundMarc Zyngier3-38/+28
When a given timer is affected by an erratum and requires an alternative implementation of set_next_event, we do a rather complicated dance to detect and call the workaround on each set_next_event call. This is clearly idiotic, as we can perfectly detect whether this CPU requires a workaround while setting up the clock event device. This only requires the CPU-specific detection to be done a bit earlier, and we can then safely override the set_next_event pointer if we have a workaround associated with that CPU. Acked-by: Mark Rutland <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30arm64: Use arch_timer_read_counter instead of arch_counter_get_cntvctMarc Zyngier1-2/+2
Only arch_timer_read_counter will guarantee that workarounds are applied. So let's use this one instead of arch_counter_get_cntvct. Acked-by: Mark Rutland <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30appletalk: Set error code if register_snap_client failedYueHaibing1-0/+1
If register_snap_client fails in atalk_init, the error code should be set; otherwise it will trigger a NULL pointer dereference while unloading the module. Fixes: 9804501fa122 ("appletalk: Fix potential NULL pointer dereference in unregister_snap_client") Signed-off-by: YueHaibing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30watchdog/sbsa: Use arch_timer_read_counter instead of arch_counter_get_cntvctMarc Zyngier1-1/+1
Only arch_timer_read_counter will guarantee that workarounds are applied. So let's use this one instead of arch_counter_get_cntvct. Acked-by: Mark Rutland <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30ARM: vdso: Remove dependency with the arch_timer driver internalsMarc Zyngier2-2/+5
The VDSO code uses the kernel helper that was originally designed to abstract the access between 32 and 64bit systems. It worked so far because this function is declared as 'inline'. As we're about to revamp that part of the code, the VDSO would break. Let's fix it by doing what should have been done from the start, a proper system register access. Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfcDan Carpenter1-0/+6
The "fs->location" is a u32 that comes from the user in ethtool_set_rxnfc(). We can't pass unclamped values to test_bit() or it results in an out of bounds access beyond the end of the bitmap. Fixes: 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30rxrpc: Fix net namespace cleanupDavid Howells1-16/+16
In rxrpc_destroy_all_calls(), there are two phases: (1) make sure the ->calls list is empty, emitting error messages if not, and (2) wait for the RCU cleanup to happen on outstanding calls (ie. ->nr_calls becomes 0). To avoid taking the call_lock, the function prechecks ->calls and if empty, it returns to avoid taking the lock - this is wrong, however: it still needs to go and do the second phase and wait for ->nr_calls to become 0. Without this, the rxrpc_net struct may get deallocated before we get to the RCU cleanup for the last calls. This can lead to: Slab corruption (Not tainted): kmalloc-16k start=ffff88802b178000, len=16384 050: 6b 6b 6b 6b 6b 6b 6b 6b 61 6b 6b 6b 6b 6b 6b 6b kkkkkkkkakkkkkkk Note the "61" at offset 0x58. This corresponds to the ->nr_calls member of struct rxrpc_net (which is >9k in size, and thus allocated out of the 16k slab). Fix this by flipping the condition on the if-statement, putting the locked section inside the if-body and dropping the return from there. The function will then always go on to wait for the RCU cleanup on outstanding calls. Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing") Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
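The resulting shape of the function is roughly the following (a hedged outline based on the description above, not the exact rxrpc code; locking details are elided):

```c
void rxrpc_destroy_all_calls(struct rxrpc_net *rxnet)
{
	/* Phase 1: if calls are still on the list, take call_lock and report
	 * them -- but do not return early just because the list looks empty. */
	if (!list_empty(&rxnet->calls)) {
		/* lock rxnet->call_lock, log each leaked call, unlock */
	}

	/* Phase 2: always wait for the RCU cleanup of outstanding calls, so
	 * that rxrpc_net cannot be freed while ->nr_calls is still non-zero. */
	wait_var_event(&rxnet->nr_calls, !atomic_read(&rxnet->nr_calls));
}
```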
2019-04-30arm64: Apply ARM64_ERRATUM_1188873 to Neoverse-N1Marc Zyngier3-7/+18
Neoverse-N1 is also affected by ARM64_ERRATUM_1188873, so let's add it to the list of affected CPUs. Signed-off-by: Marc Zyngier <[email protected]> [will: Update silicon-errata.txt] Signed-off-by: Will Deacon <[email protected]>
2019-04-30arm64: Add part number for Neoverse N1Marc Zyngier1-0/+2
New CPU, new part number. You know the drill. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30arm64: Make ARM64_ERRATUM_1188873 depend on COMPATMarc Zyngier1-0/+1
Since ARM64_ERRATUM_1188873 only affects AArch32 EL0, it makes some sense that it should depend on COMPAT. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30arm64: Restrict ARM64_ERRATUM_1188873 mitigation to AArch32Marc Zyngier2-17/+17
We currently deal with ARM64_ERRATUM_1188873 by always trapping EL0 accesses for both instruction sets. Although nothing wrong comes out of that, people trying to squeeze the last drop of performance from buggy HW find this over the top. Oh well. Let's change the mitigation by flipping the counter enable bit on return to userspace. Non-broken HW gets an extra branch on the fast path, which is hopefully not the end of the world. The arch timer workaround is also removed. Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-30ALSA: hda/realtek - Apply the fixup for ASUS Q325UARTakashi Iwai1-0/+4
Some ASUS models like the Q325UAR with the ALC295 codec require the same fixup that has been applied to the ALC294 codec. Just copy the entry with the pin matching to cover ALC295 too. BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784485 Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-04-30Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec David S. Miller13-49/+89
Steffen Klassert says: ==================== pull request (net): ipsec 2019-04-30 1) Fix an out-of-bound array access in __xfrm_policy_unlink. From YueHaibing. 2) Reset the secpath on failure in the ESP GRO handlers to avoid dereferencing an invalid pointer on error. From Myungho Jung. 3) Add and revert a patch that tried to add rcu annotations to netns_xfrm. From Su Yanjun. 4) Wait for rcu callbacks before freeing xfrm6_tunnel_spi_kmem. From Su Yanjun. 5) Fix forgotten vti4 ipip tunnel deregistration. From Jeremy Sowden. 6) Remove some duplicated log messages in vti4. From Jeremy Sowden. 7) Don't use IPSEC_PROTO_ANY when flushing states because this will flush only IPsec protocol-specific states. IPPROTO_ROUTING states may remain in the lists when doing net exit. Fix this by replacing IPSEC_PROTO_ANY with zero. From Cong Wang. 8) Add a length check for UDP encapsulation to fix "Oversized IP packet" warnings on the receive side. From Sabrina Dubroca. 9) Fix xfrm interface lookup when the interface is associated to a vrf layer 3 master device. From Martin Willi. 10) Reload header pointers after pskb_may_pull() in _decode_session4(), otherwise we may read from uninitialized memory. 11) Update the documentation about xfrm[46]_gc_thresh; it is not used anymore after the flowcache removal. From Nicolas Dichtel. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-30Revert "drm/qxl: drop prime import/export callbacks"Gerd Hoffmann2-0/+16
This reverts commit f4c34b1e2a37d5676180901fa6ff188bcb6371f8. Similar to commit a0cecc23cfcb Revert "drm/virtio: drop prime import/export callbacks". We have to do the same with qxl, for the same reasons (it breaks DRI3). Drop the WARN_ON_ONCE(). Fixes: f4c34b1e2a37d5676 ("drm/qxl: drop prime import/export callbacks") Signed-off-by: Gerd Hoffmann <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Daniel Vetter <[email protected]>
2019-04-30arm64: mm: Remove pte_unmap_nested()Qian Cai1-2/+0
As of commit ece0e2b6406a ("mm: remove pte_*map_nested()"), pte_unmap_nested() is no longer used and can be removed from the arm64 code. Signed-off-by: Qian Cai <[email protected]> [will: also remove pte_offset_map_nested()] Signed-off-by: Will Deacon <[email protected]>
2019-04-30arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variableQian Cai1-1/+2
When building with -Wunused-but-set-variable, the compiler shouts about a number of pte_unmap() users, since this expands to an empty macro on arm64: | mm/gup.c: In function 'gup_pte_range': | mm/gup.c:1727:16: warning: variable 'ptem' set but not used | [-Wunused-but-set-variable] | mm/gup.c: At top level: | mm/memory.c: In function 'copy_pte_range': | mm/memory.c:821:24: warning: variable 'orig_dst_pte' set but not used | [-Wunused-but-set-variable] | mm/memory.c:821:9: warning: variable 'orig_src_pte' set but not used | [-Wunused-but-set-variable] | mm/swap_state.c: In function 'swap_ra_info': | mm/swap_state.c:641:15: warning: variable 'orig_pte' set but not used | [-Wunused-but-set-variable] | mm/madvise.c: In function 'madvise_free_pte_range': | mm/madvise.c:318:9: warning: variable 'orig_pte' set but not used | [-Wunused-but-set-variable] Rewrite pte_unmap() as a static inline function, which silences the warnings. Signed-off-by: Qian Cai <[email protected]> Signed-off-by: Will Deacon <[email protected]>
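The reason a static inline silences the warning while an empty macro does not is that a function call counts as a use of its argument. A standalone illustration (not the arm64 definitions themselves):

```c
/* With an empty macro, a variable whose only consumer is pte_unmap()
 * is "set but not used" as far as the compiler is concerned: */
#define pte_unmap_as_macro(pte) do { } while (0)

/* A static inline keeps the no-op behaviour but counts as a use: */
typedef struct { unsigned long val; } pte_t;

static inline void pte_unmap(pte_t *pte)
{
	(void)pte;		/* nothing to do in this configuration */
}

int main(void)
{
	pte_t table[1] = { { 0 } };
	pte_t *ptep = &table[0];	/* would warn with the macro version */

	pte_unmap(ptep);
	return 0;
}
```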
2019-04-30x86/alternatives: Add comment about module removal racesNadav Amit1-0/+5
Add a comment to clarify that users of text_poke() must ensure that no races with module removal take place. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/kprobes: Use vmalloc special flagRick Edgecombe1-6/+1
Use new flag VM_FLUSH_RESET_PERMS for handling freeing of special permissioned memory in vmalloc and remove places where memory was set NX and RW before freeing which is no longer needed. Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/ftrace: Use vmalloc special flagRick Edgecombe1-8/+6
Use new flag VM_FLUSH_RESET_PERMS for handling freeing of special permissioned memory in vmalloc and remove places where memory was set NX and RW before freeing which is no longer needed. Tested-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30bpf: Use vmalloc special flagRick Edgecombe2-15/+3
Use new flag VM_FLUSH_RESET_PERMS for handling freeing of special permissioned memory in vmalloc and remove places where memory was set RW before freeing which is no longer needed. Don't track if the memory is RO anymore because it is now tracked in vmalloc. Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30modules: Use vmalloc special flagRick Edgecombe1-38/+39
Use new flag for handling freeing of special permissioned memory in vmalloc and remove places where memory was set RW before freeing which is no longer needed. Since freeing of VM_FLUSH_RESET_PERMS memory is not supported in an interrupt by vmalloc, the freeing of init sections is moved to a work queue. Instead of call_rcu it now uses synchronize_rcu() in the work queue. Lastly, there is now a WARN_ON in module_memfree since it should not be called in an interrupt with special memory as is required for VM_FLUSH_RESET_PERMS. Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30mm/vmalloc: Add flag for freeing of special permissionsRick Edgecombe2-19/+109
Add a new flag VM_FLUSH_RESET_PERMS, for enabling vfree operations to immediately clear executable TLB entries before freeing pages, and handle resetting permissions on the directmap. This flag is useful for any kind of memory with elevated permissions, or where there can be related permissions changes on the directmap. Today this is RO+X and RO memory. Although this enables directly vfreeing non-writeable memory now, non-writable memory cannot be freed in an interrupt because the allocation itself is used as a node on deferred free list. So when RO memory needs to be freed in an interrupt the code doing the vfree needs to have its own work queue, as was the case before the deferred vfree list was added to vmalloc. For architectures with set_direct_map_ implementations this whole operation can be done with one TLB flush when centralized like this. For others with directmap permissions, currently only arm64, a backup method using set_memory functions is used to reset the directmap. When arm64 adds set_direct_map_ functions, this backup can be removed. When the TLB is flushed to both remove TLB entries for the vmalloc range mapping and the direct map permissions, the lazy purge operation could be done to try to save a TLB flush later. However today vm_unmap_aliases could flush a TLB range that does not include the directmap. So a helper is added with extra parameters that can allow both the vmalloc address and the direct mapping to be flushed during this operation. The behavior of the normal vm_unmap_aliases function is unchanged. Suggested-by: Dave Hansen <[email protected]> Suggested-by: Andy Lutomirski <[email protected]> Suggested-by: Will Deacon <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
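A hedged sketch of how a user of the new flag is expected to allocate and free an executable region (kernel-context fragment; the helper names alloc_exec_region/free_exec_region are illustrative, and error handling is minimal):

```c
#include <linux/moduleloader.h>
#include <linux/vmalloc.h>

static void *alloc_exec_region(unsigned long size)
{
	void *region = module_alloc(size);	/* vmalloc-backed allocation */

	if (!region)
		return NULL;

	/* Mark the vm_struct so that vfree() flushes the executable TLB
	 * entries and resets the direct-map permissions before the pages
	 * can be reused -- no manual set-back-to-RW/NX step is needed. */
	set_vm_flush_reset_perms(region);

	/* ... copy code in, then seal it with set_memory_ro()/set_memory_x() ... */
	return region;
}

static void free_exec_region(void *region)
{
	/* Must not be called from interrupt context for such regions. */
	vfree(region);
}
```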
2019-04-30mm/hibernation: Make hibernation handle unmapped pagesRick Edgecombe4-20/+14
Make hibernate handle unmapped pages on the direct map when CONFIG_ARCH_HAS_SET_ALIAS=y is set. These functions allow for setting pages to invalid configurations, so now hibernate should check if the pages have valid mappings and handle if they are unmapped when doing a hibernate save operation. Previously this checking was already done when CONFIG_DEBUG_PAGEALLOC=y was configured. It does not appear to have a big hibernating performance impact. The speed of the saving operation before this change was measured as 819.02 MB/s, and after was measured at 813.32 MB/s. Before: [ 4.670938] PM: Wrote 171996 kbytes in 0.21 seconds (819.02 MB/s) After: [ 4.504714] PM: Wrote 178932 kbytes in 0.22 seconds (813.32 MB/s) Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/mm/cpa: Add set_direct_map_*() functionsRick Edgecombe5-3/+30
Add two new functions set_direct_map_default_noflush() and set_direct_map_invalid_noflush() for setting the direct map alias for the page to its default valid permissions and to an invalid state that cannot be cached in a TLB, respectively. These functions do not flush the TLB. Note, __kernel_map_pages() does something similar but flushes the TLB and doesn't reset the permission bits to default on all architectures. Also add an ARCH config ARCH_HAS_SET_DIRECT_MAP for specifying whether these have an actual implementation or a default empty one. Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/alternatives: Remove the return value of text_poke_*()Nadav Amit2-9/+6
The return value of text_poke_early() and text_poke_bp() is useless. Remove it. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/jump-label: Remove support for custom text pokerNadav Amit1-16/+10
There are only two types of text poking: early and breakpoint based. The use of a function pointer to perform text poking complicates the code and is probably inefficient due to the use of indirect branches. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/modules: Avoid breaking W^X while loading modulesNadav Amit4-8/+28
When modules and BPF filters are loaded, there is a time window in which some memory is both writable and executable. An attacker that has already found another vulnerability (e.g., a dangling pointer) might be able to exploit this behavior to overwrite kernel code. Prevent having writable executable PTEs in this stage. In addition, avoiding having W+X mappings can also slightly simplify the patching of modules code on initialization (e.g., by alternatives and static-key), as would be done in the next patch. This was actually the main motivation for this patch. To avoid having W+X mappings, set them initially as RW (NX) and after they are set as RO set them as X as well. Setting them as executable is done as a separate step to avoid one core in which the old PTE is cached (hence writable), and another which sees the updated PTE (executable), which would break the W^X protection. Suggested-by: Thomas Gleixner <[email protected]> Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Rik van Riel <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
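The ordering constraint described above can be summarised in a short kernel-style fragment (illustrative helper, assuming the usual set_memory_*() API; not the actual module loader code):

```c
#include <asm/set_memory.h>

/* Called after relocations, alternatives and static keys have been applied
 * while the module text is still mapped RW+NX. */
static void seal_module_text(unsigned long addr, int numpages)
{
	/* Step 1: drop write access (and flush stale writable TLB entries),
	 * so no CPU can still write to the pages. */
	set_memory_ro(addr, numpages);

	/* Step 2: only now grant execute. Granting X in the same step as
	 * dropping W could leave one CPU with a cached writable PTE while
	 * another already executes from the page, breaking W^X. */
	set_memory_x(addr, numpages);
}
```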
2019-04-30x86/kprobes: Set instruction page as executableNadav Amit1-4/+20
Set the page as executable after allocation. This patch is a preparatory patch for a following patch that makes module allocated pages non-executable. While at it, do some small cleanup of what appears to be unnecessary masking. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/ftrace: Set trampoline pages as executableNadav Amit1-0/+8
Since module_alloc() will not set the pages as executable soon, set ftrace trampoline pages as executable after they are allocated. For the time being, do not change ftrace to use the text_poke() interface. As a result, ftrace still breaks W^X. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/kgdb: Avoid redundant comparison of patched codeNadav Amit1-13/+1
text_poke() already ensures that the written value is the correct one and fails if that is not the case. There is no need for an additional comparison. Remove it. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/alternatives: Use temporary mm for text pokingNadav Amit3-26/+86
text_poke() can potentially compromise security as it sets temporary PTEs in the fixmap. These PTEs might be used to rewrite the kernel code from other cores accidentally or maliciously, if an attacker gains the ability to write onto kernel memory. Moreover, since remote TLBs are not flushed after the temporary PTEs are removed, the time-window in which the code is writable is not limited if the fixmap PTEs - maliciously or accidentally - are cached in the TLB. To address these potential security hazards, use a temporary mm for patching the code. Finally, text_poke() is also not conservative enough when mapping pages, as it always tries to map 2 pages, even when a single one is sufficient. So try to be more conservative, and do not map more than needed. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
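A simplified, hedged outline of the new poking sequence for the single-page case; poking_mm and poking_addr are the names used by this series, while the map/switch/unmap helpers below are hypothetical stand-ins for the arch-internal details, and error handling is omitted:

```c
static void poke_via_temporary_mm(void *addr, const void *opcode, size_t len)
{
	struct page *page = virt_to_page(addr);
	struct mm_struct *prev;
	unsigned long flags;

	local_irq_save(flags);

	/* Map the target page writable at poking_addr, but only inside
	 * poking_mm, so no other CPU ever sees a writable alias. */
	map_page_at(poking_mm, poking_addr, page);		/* hypothetical */

	prev = switch_to_mm(poking_mm);				/* hypothetical */
	memcpy((void *)(poking_addr + offset_in_page(addr)), opcode, len);
	switch_to_mm(prev);					/* hypothetical */

	/* Tear down the mapping and flush only the local TLB; remote CPUs
	 * never had the translation, so no global flush is required. */
	unmap_poking_addr(poking_mm, poking_addr);		/* hypothetical */

	local_irq_restore(flags);
}
```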
2019-04-30x86/alternatives: Initialize temporary mm for patchingNadav Amit5-0/+48
To prevent improper use of the PTEs that are used for text patching, the next patches will use a temporary mm struct. Initialize it by copying the init mm. The address that will be used for patching is taken from the lower area that is usually used for the task memory. Doing so prevents the need to frequently synchronize the temporary-mm (e.g., when BPF programs are installed), since different PGDs are used for the task memory. Finally, randomize the address of the PTEs to harden against exploits that use these PTEs. Suggested-by: Andy Lutomirski <[email protected]> Tested-by: Masami Hiramatsu <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30fork: Provide a function for copying init_mmNadav Amit2-6/+19
Provide a function for copying init_mm. This function will be later used for setting a temporary mm. Tested-by: Masami Hiramatsu <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
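A minimal sketch of the intended use in the later patches (poking_init() is the consumer added later in this series; treat the exact shape as illustrative):

```c
static struct mm_struct *poking_mm;

void __init poking_init(void)
{
	poking_mm = copy_init_mm();
	BUG_ON(!poking_mm);

	/* ... choose and record a randomized poking_addr inside poking_mm ... */
}
```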
2019-04-30uprobes: Initialize uprobes earlierNadav Amit3-5/+9
In order to have a separate address space for text poking, we need to duplicate init_mm early during start_kernel(). This, however, introduces a problem since uprobes functions are called from dup_mmap(), but uprobes is still not initialized in this early stage. Since uprobes initialization is necessary for fork, and since all the dependent initialization has been done when fork is initialized (percpu and vmalloc), move uprobes initialization to fork_init(). It does not seem uprobes introduces any security problem for the poking_mm. Crash and burn if uprobes initialization fails, similarly to other early initializations. Change the init_probes() name to probes_init() to match the naming convention of other early initialization functions. Reported-by: kernel test robot <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rick Edgecombe <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/mm: Save debug registers when loading a temporary mmNadav Amit1-0/+23
Prevent user watchpoints from mistakenly firing while the temporary mm is being used. As the addresses of the temporary mm might overlap those of the user-process, this is necessary to prevent wrong signals or worse things from happening. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>