aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2019-05-01devlink: Change devlink health locking mechanismMoshe Shemesh1-0/+1
The devlink health reporters create/destroy and user commands currently use the devlink->lock as a locking mechanism. Different reporters have different rules in the driver and are being created/destroyed during different stages of driver load/unload/running. So during execution of a reporter recover the flow can go through another reporter's destroy and create. Such flow leads to deadlock trying to lock a mutex already held. With the new locking mechanism the different reporters share mutex lock only to protect access to shared reporters list. Added refcount per reporter, to protect the reporters from destroy while being used. Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-01iomap: Add a page_prepare callbackAndreas Gruenbacher1-5/+17
Move the page_done callback into a separate iomap_page_ops structure and add a page_prepare calback to be called before the next page is written to. In gfs2, we'll want to start a transaction in page_prepare and end it in page_done. Other filesystems that implement data journaling will require the same kind of mechanism. Signed-off-by: Andreas Gruenbacher <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2019-05-01iov_iter: fix iov_iter_typeMing Lei1-1/+1
Commit 875f1d0769cd ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag") introduces one extra flag of ITER_BVEC_FLAG_NO_REF, and this flag is stored into iter->type. However, iov_iter_type() doesn't consider the new added flag, fix it by masking this flag in iov_iter_type(). Fixes: 875f1d0769cd ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag") Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-05-01Merge branch 'for-next/mitigations' of ↵Will Deacon1-0/+24
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-next/core
2019-05-01sctp: avoid running the sctp state machine recursivelyXin Long1-1/+0
Ying triggered a call trace when doing an asconf testing: BUG: scheduling while atomic: swapper/12/0/0x10000100 Call Trace: <IRQ> [<ffffffffa4375904>] dump_stack+0x19/0x1b [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72 [<ffffffffa437b93a>] __schedule+0x9ba/0xa00 [<ffffffffa3cd5326>] __cond_resched+0x26/0x30 [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50 [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200 [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0 [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp] [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp] [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp] [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp] [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp] [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp] [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp] [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp] [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp] [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp] [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp] As it shows, the first sctp_do_sm() running under atomic context (NET_RX softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later, and this flag is supposed to be used in non-atomic context only. Besides, sctp_do_sm() was called recursively, which is not expected. Vlad tried to fix this recursive call in Commit c0786693404c ("sctp: Fix oops when sending queued ASCONF chunks") by introducing a new command SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will be called in this command again. To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st sctp_do_sm() directly. Reported-by: Ying Xu <[email protected]> Signed-off-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-01soc: ti: Add MSI domain bus support for Interrupt AggregatorLokesh Vutla3-0/+34
With the system coprocessor managing the range allocation of the inputs to Interrupt Aggregator, it is difficult to represent the device IRQs from DT. The suggestion is to use MSI in such cases where devices wants to allocate and group interrupts dynamically. Create a MSI domain bus layer that allocates and frees MSIs for a device. APIs that are implemented: - ti_sci_inta_msi_create_irq_domain() that creates a MSI domain - ti_sci_inta_msi_domain_alloc_irqs() that creates MSIs for the specified device and resource. - ti_sci_inta_msi_domain_free_irqs() frees the irqs attached to the device. - ti_sci_inta_msi_get_virq() for getting the virq attached to a specific event. Signed-off-by: Lokesh Vutla <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-01genirq: Introduce irq_chip_{request,release}_resource_parent() apisLokesh Vutla1-0/+2
Introduce irq_chip_{request,release}_resource_parent() apis so that these can be used in hierarchical irqchips. Signed-off-by: Lokesh Vutla <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-01firmware: ti_sci: Add helper apis to manage resourcesLokesh Vutla1-0/+54
Each resource with in the device can be uniquely identified as defined by TISCI. Since this is generic across the devices, resource allocation also can be made generic instead of each client driver handling the resource. So add helper apis to manage the resource. Signed-off-by: Lokesh Vutla <[email protected]> Acked-by: Nishanth Menon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-01firmware: ti_sci: Add support for IRQ managementLokesh Vutla1-0/+26
TISCI abstracts the handling of IRQ routes where interrupt sources are not directly connected to host interrupt controller. Add support for the set of TISCI commands for requesting and releasing IRQs. Signed-off-by: Lokesh Vutla <[email protected]> Acked-by: Nishanth Menon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-01firmware: ti_sci: Add support for RM core opsLokesh Vutla1-0/+27
TISCI provides support for getting the resources(IRQ, RING etc..) assigned to a specific device. These resources can be handled by the client and in turn sends TISCI cmd to configure the resources. It is very important that client should keep track on usage of these resources. Add support for TISCI commands to get resource ranges. Signed-off-by: Lokesh Vutla <[email protected]> Signed-off-by: Peter Ujfalusi <[email protected]> Acked-by: Nishanth Menon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-01firmware: ti_sci: Add support to get TISCI handle using of_phandleGrygorii Strashko1-0/+17
TISCI has been updated to have support for Resource management(like interrupts etc..). And there can be multiple device instances of a resource type in a SoC. So every driver corresponding to a resource type should get a TISCI handle so that it can make TISCI calls. And each DT node corresponding to a device should exist under its corresponding bus node as per the SoC architecture. But existing apis in TISCI library assumes that all TISCI users are child nodes of TISCI. Which is not true in the above case. So introduce (devm_)ti_sci_get_by_phandle() apis that can be used by TISCI users to get TISCI handle using of phandle property. Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: Lokesh Vutla <[email protected]> Acked-by: Nishanth Menon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-04-30net: dsa: Remove legacy probing supportAndrew Lunn1-23/+0
Now that all drivers can be probed using more traditional methods, remove the legacy probe code. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30net: dsa: Add helper function to retrieve VLAN awareness settingVladimir Oltean1-0/+10
Since different types of hardware may or may not support this setting per-port, DSA keeps it either in dsa_switch or in dsa_port. While drivers may know the characteristics of their hardware and retrieve it from the correct place without the need of helpers, it is cumbersone to find out an unambigous answer from generic DSA code. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30net: dsa: Keep the vlan_filtering setting in dsa_switch if it's globalVladimir Oltean1-0/+5
The current behavior is not as obvious as one would assume (which is that, if the driver set vlan_filtering_is_global = 1, then checking any dp->vlan_filtering would yield the same result). Only the ports which are actively enslaved into a bridge would have vlan_filtering set. This makes it tricky for drivers to check what the global state is. So fix this and make the struct dsa_switch hold this global setting. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30net: dsa: Be aware of switches where VLAN filtering is a global settingVladimir Oltean1-0/+5
On some switches, the action of whether to parse VLAN frame headers and use that information for ingress admission is configurable, but not per port. Such is the case for the Broadcom BCM53xx and the NXP SJA1105 families, for example. In that case, DSA can prevent the bridge core from trying to apply different VLAN filtering settings on net devices that belong to the same switch. Signed-off-by: Vladimir Oltean <[email protected]> Suggested-by: Florian Fainelli <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30net: dsa: Store vlan_filtering as a property of dsa_portVladimir Oltean1-0/+1
This allows drivers to query the VLAN setting imposed by the bridge driver directly from DSA, instead of keeping their own state based on the .port_vlan_filtering callback. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30block: remove the unused blk_queue_dma_pad functionChristoph Hellwig1-1/+0
Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30block: add a SPDX tag to blk-mq-rdma.hChristoph Hellwig1-0/+1
This file has no copyright notice, but was added as part of a commit adding another file using the default kernel GPLv2 license. Add a matching SPDX tag. Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30sed-opal.h: remove redundant licence boilerplateChristoph Hellwig1-9/+0
The file already has the correct SPDX header. Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30block: switch all files cleared marked as GPLv2 or later to SPDX tagsChristoph Hellwig1-15/+1
All these files have some form of the usual GPLv2 or later boilerplate. Switch them to use SPDX tags instead. Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30block: switch all files cleared marked as GPLv2 to SPDX tagsChristoph Hellwig3-37/+3
All these files have some form of the usual GPLv2 boilerplate. Switch them to use SPDX tags instead. Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30KVM: Introduce a new guest mapping APIKarimAllah Ahmed1-0/+28
In KVM, specially for nested guests, there is a dominant pattern of: => map guest memory -> do_something -> unmap guest memory In addition to all this unnecessarily noise in the code due to boiler plate code, most of the time the mapping function does not properly handle memory that is not backed by "struct page". This new guest mapping API encapsulate most of this boiler plate code and also handles guest memory that is not backed by "struct page". The current implementation of this API is using memremap for memory that is not backed by a "struct page" which would lead to a huge slow-down if it was used for high-frequency mapping operations. The API does not have any effect on current setups where guest memory is backed by a "struct page". Further patches are going to also introduce a pfn-cache which would significantly improve the performance of the memremap case. Signed-off-by: KarimAllah Ahmed <[email protected]> Reviewed-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-30KVM: x86: Inject PMI for KVM guestLuwei Kang1-0/+1
Inject a PMI for KVM guest when Intel PT working in Host-Guest mode and Guest ToPA entry memory buffer was completely filled. Signed-off-by: Luwei Kang <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-30Merge tag 'kvm-s390-next-5.2-1' of ↵Paolo Bonzini1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Features and fixes for 5.2 - VSIE crypto fixes - new guest features for gen15 - disable halt polling for nested virtualization with overcommit
2019-04-30Merge tag 'usb-5.1-rc8' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for a bunch of warnings/errors that the syzbot has been finding with it's new-found ability to stress-test the USB layer. All of these are tiny, but fix real issues, and are marked for stable as well. All of these have had lots of testing in linux-next as well" * tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: w1 ds2490: Fix bug caused by improper use of altsetting array USB: yurex: Fix protection fault after device removal usb: usbip: fix isoc packet num validation in get_pipe USB: core: Fix bug caused by duplicate interface PM usage counter USB: dummy-hcd: Fix failure to give back unlinked URBs USB: core: Fix unterminated string returned by usb_string()
2019-04-30block: remove the __bio_add_pc_page exportChristoph Hellwig1-3/+0
The same page optimization is a rather odd corner case, which is not used outside bio.c and which really should not be used outside of bio.c either - we have better highlevel helpers like the rq/bio mapping helpers. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30block: remove the i argument to bio_for_each_segment_allChristoph Hellwig1-3/+2
We only have two callers that need the integer loop iterator, and they can easily maintain it themselves. Suggested-by: Matthew Wilcox <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Acked-by: David Sterba <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Acked-by: Coly Li <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30Merge branch 'master' of ↵David S. Miller1-93/+23
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2019-04-30 1) A lot of work to remove indirections from the xfrm code. From Florian Westphal. 2) Support ESP offload in combination with gso partial. From Boris Pismenny. 3) Remove some duplicated code from vti4. From Jeremy Sowden. Please note that there is merge conflict between commit: 8742dc86d0c7 ("xfrm4: Fix uninitialized memory read in _decode_session4") from the ipsec tree and commit: c53ac41e3720 ("xfrm: remove decode_session indirection from afinfo_policy") from the ipsec-next tree. The merge conflict will appear when those trees get merged during the merge window. The conflict can be solved as it is done in linux-next: https://lkml.org/lkml/2019/4/25/1207 Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-30Merge branch 'master' of ↵David S. Miller1-1/+19
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-04-30 1) Fix an out-of-bound array accesses in __xfrm_policy_unlink. From YueHaibing. 2) Reset the secpath on failure in the ESP GRO handlers to avoid dereferencing an invalid pointer on error. From Myungho Jung. 3) Add and revert a patch that tried to add rcu annotations to netns_xfrm. From Su Yanjun. 4) Wait for rcu callbacks before freeing xfrm6_tunnel_spi_kmem. From Su Yanjun. 5) Fix forgotten vti4 ipip tunnel deregistration. From Jeremy Sowden: 6) Remove some duplicated log messages in vti4. From Jeremy Sowden. 7) Don't use IPSEC_PROTO_ANY when flushing states because this will flush only IPsec portocol speciffic states. IPPROTO_ROUTING states may remain in the lists when doing net exit. Fix this by replacing IPSEC_PROTO_ANY with zero. From Cong Wang. 8) Add length check for UDP encapsulation to fix "Oversized IP packet" warnings on receive side. From Sabrina Dubroca. 9) Fix xfrm interface lookup when the interface is associated to a vrf layer 3 master device. From Martin Willi. 10) Reload header pointers after pskb_may_pull() in _decode_session4(), otherwise we may read from uninitialized memory. 11) Update the documentation about xfrm[46]_gc_thresh, it is not used anymore after the flowcache removal. From Nicolas Dichtel. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-30netfilter: nft_ct: Add ct id supportBrett Mastbergen1-0/+2
The 'id' key returns the unique id of the conntrack entry as returned by nf_ct_get_id(). Signed-off-by: Brett Mastbergen <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30netfilter: add API to manage NAT helpers.Flavio Leitner1-1/+21
The API allows a conntrack helper to indicate its corresponding NAT helper which then can be loaded and reference counted. Signed-off-by: Flavio Leitner <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30netfilter: use macros to create module aliases.Flavio Leitner1-0/+4
Each NAT helper creates a module alias which follows a pattern. Use macros for consistency. Signed-off-by: Flavio Leitner <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30netfilter: conntrack: limit sysctl setting for boolean optionsTonghao Zhang1-3/+3
We use the zero and one to limit the boolean options setting. After this patch we only set 0 or 1 to boolean options for nf conntrack sysctl. Signed-off-by: Tonghao Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30netfilter: nf_tables: drop include of module.h from nf_tables.hPaul Gortmaker1-1/+2
Ideally, header files under include/linux shouldn't be adding includes of other headers, in anticipation of their consumers, but just the headers needed for the header itself to pass parsing with CPP. The module.h is particularly bad in this sense, as it itself does include a whole bunch of other headers, due to the complexity of module support. Since nf_tables.h is not going into a module struct looking for specific fields, we can just let it know that module is a struct, just like about 60 other include/linux headers already do. Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30netfilter: nf_tables: relocate header content to consumerPaul Gortmaker1-17/+0
The nf_tables.h header is used in a lot of files, but it turns out that there is only one actual user of nft_expr_clone(). Hence we relocate that function to be with the one consumer of it and avoid having to process it with CPP for all the other files. This will also enable a reduction in the other headers that the nf_tables.h itself has to include just to be stand-alone, hence a pending further significant reduction in the CPP content that needs to get processed for each netfilter file. Note that the explicit "inline" has been dropped as part of this relocation. In similar changes to this, I believe Dave has asked this be done, so we free up gcc to make the choice of whether to inline or not. Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30bpf: Use vmalloc special flagRick Edgecombe1-14/+3
Use new flag VM_FLUSH_RESET_PERMS for handling freeing of special permissioned memory in vmalloc and remove places where memory was set RW before freeing which is no longer needed. Don't track if the memory is RO anymore because it is now tracked in vmalloc. Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30mm/vmalloc: Add flag for freeing of special permsissionsRick Edgecombe1-0/+15
Add a new flag VM_FLUSH_RESET_PERMS, for enabling vfree operations to immediately clear executable TLB entries before freeing pages, and handle resetting permissions on the directmap. This flag is useful for any kind of memory with elevated permissions, or where there can be related permissions changes on the directmap. Today this is RO+X and RO memory. Although this enables directly vfreeing non-writeable memory now, non-writable memory cannot be freed in an interrupt because the allocation itself is used as a node on deferred free list. So when RO memory needs to be freed in an interrupt the code doing the vfree needs to have its own work queue, as was the case before the deferred vfree list was added to vmalloc. For architectures with set_direct_map_ implementations this whole operation can be done with one TLB flush when centralized like this. For others with directmap permissions, currently only arm64, a backup method using set_memory functions is used to reset the directmap. When arm64 adds set_direct_map_ functions, this backup can be removed. When the TLB is flushed to both remove TLB entries for the vmalloc range mapping and the direct map permissions, the lazy purge operation could be done to try to save a TLB flush later. However today vm_unmap_aliases could flush a TLB range that does not include the directmap. So a helper is added with extra parameters that can allow both the vmalloc address and the direct mapping to be flushed during this operation. The behavior of the normal vm_unmap_aliases function is unchanged. Suggested-by: Dave Hansen <[email protected]> Suggested-by: Andy Lutomirski <[email protected]> Suggested-by: Will Deacon <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30mm/hibernation: Make hibernation handle unmapped pagesRick Edgecombe1-12/+6
Make hibernate handle unmapped pages on the direct map when CONFIG_ARCH_HAS_SET_ALIAS=y is set. These functions allow for setting pages to invalid configurations, so now hibernate should check if the pages have valid mappings and handle if they are unmapped when doing a hibernate save operation. Previously this checking was already done when CONFIG_DEBUG_PAGEALLOC=y was configured. It does not appear to have a big hibernating performance impact. The speed of the saving operation before this change was measured as 819.02 MB/s, and after was measured at 813.32 MB/s. Before: [ 4.670938] PM: Wrote 171996 kbytes in 0.21 seconds (819.02 MB/s) After: [ 4.504714] PM: Wrote 178932 kbytes in 0.22 seconds (813.32 MB/s) Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/mm/cpa: Add set_direct_map_*() functionsRick Edgecombe1-0/+11
Add two new functions set_direct_map_default_noflush() and set_direct_map_invalid_noflush() for setting the direct map alias for the page to its default valid permissions and to an invalid state that cannot be cached in a TLB, respectively. These functions do not flush the TLB. Note, __kernel_map_pages() does something similar but flushes the TLB and doesn't reset the permission bits to default on all architectures. Also add an ARCH config ARCH_HAS_SET_DIRECT_MAP for specifying whether these have an actual implementation or a default empty one. Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30x86/modules: Avoid breaking W^X while loading modulesNadav Amit1-0/+1
When modules and BPF filters are loaded, there is a time window in which some memory is both writable and executable. An attacker that has already found another vulnerability (e.g., a dangling pointer) might be able to exploit this behavior to overwrite kernel code. Prevent having writable executable PTEs in this stage. In addition, avoiding having W+X mappings can also slightly simplify the patching of modules code on initialization (e.g., by alternatives and static-key), as would be done in the next patch. This was actually the main motivation for this patch. To avoid having W+X mappings, set them initially as RW (NX) and after they are set as RO set them as X as well. Setting them as executable is done as a separate step to avoid one core in which the old PTE is cached (hence writable), and another which sees the updated PTE (executable), which would break the W^X protection. Suggested-by: Thomas Gleixner <[email protected]> Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Rik van Riel <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30fork: Provide a function for copying init_mmNadav Amit1-0/+1
Provide a function for copying init_mm. This function will be later used for setting a temporary mm. Tested-by: Masami Hiramatsu <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30uprobes: Initialize uprobes earlierNadav Amit1-0/+5
In order to have a separate address space for text poking, we need to duplicate init_mm early during start_kernel(). This, however, introduces a problem since uprobes functions are called from dup_mmap(), but uprobes is still not initialized in this early stage. Since uprobes initialization is necassary for fork, and since all the dependant initialization has been done when fork is initialized (percpu and vmalloc), move uprobes initialization to fork_init(). It does not seem uprobes introduces any security problem for the poking_mm. Crash and burn if uprobes initialization fails, similarly to other early initializations. Change the init_probes() name to probes_init() to match other early initialization functions name convention. Reported-by: kernel test robot <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rick Edgecombe <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30mm/tlb: Provide default nmi_uaccess_okay()Nadav Amit1-0/+9
x86 has an nmi_uaccess_okay(), but other architectures do not. Arch-independent code might need to know whether access to user addresses is ok in an NMI context or in other code whose execution context is unknown. Specifically, this function is needed for bpf_probe_write_user(). Add a default implementation of nmi_uaccess_okay() for architectures that do not have such a function. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-30KVM: Introduce a 'release' method for KVM devicesCédric Le Goater1-0/+9
When a P9 sPAPR VM boots, the CAS negotiation process determines which interrupt mode to use (XICS legacy or XIVE native) and invokes a machine reset to activate the chosen mode. To be able to switch from one interrupt mode to another, we introduce the capability to release a KVM device without destroying the VM. The KVM device interface is extended with a new 'release' method which is called when the file descriptor of the device is closed. Once 'release' is called, the 'destroy' method will not be called anymore as the device is removed from the device list of the VM. Cc: Paolo Bonzini <[email protected]> Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: Introduce a 'mmap' method for KVM devicesCédric Le Goater1-0/+1
Some KVM devices will want to handle special mappings related to the underlying HW. For instance, the XIVE interrupt controller of the POWER9 processor has MMIO pages for thread interrupt management and for interrupt source control that need to be exposed to the guest when the OS has the required support. Cc: Paolo Bonzini <[email protected]> Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: PPC: Book3S HV: XIVE: Introduce a new capability KVM_CAP_PPC_IRQ_XIVECédric Le Goater1-0/+1
The user interface exposes a new capability KVM_CAP_PPC_IRQ_XIVE to let QEMU connect the vCPU presenters to the XIVE KVM device if required. The capability is not advertised for now as the full support for the XIVE native exploitation mode is not yet available. When this is case, the capability will be advertised on PowerNV Hypervisors only. Nested guests (pseries KVM Hypervisor) are not supported. Internally, the interface to the new KVM device is protected with a new interrupt mode: KVMPPC_IRQ_XIVE. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: PPC: Book3S HV: Add a new KVM device for the XIVE native exploitation modeCédric Le Goater1-0/+2
This is the basic framework for the new KVM device supporting the XIVE native exploitation mode. The user interface exposes a new KVM device to be created by QEMU, only available when running on a L0 hypervisor. Support for nested guests is not available yet. The XIVE device reuses the device structure of the XICS-on-XIVE device as they have a lot in common. That could possibly change in the future if the need arise. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30Merge branch 'opp/linux-next' of ↵Rafael J. Wysocki1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp Pull operating performance points (OPP) framework changes for v5.2 from Viresh Kumar: "This pull request contains: - New helper in OPP core to find best matching frequency for a voltage value." * 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: OPP: Introduce dev_pm_opp_find_freq_ceil_by_volt()
2019-04-30USB: serial: drop unused iflag macroJohan Hovold1-3/+0
Drop the RELEVANT_IFLAG() macro which essentially hasn't been used for over a decade except in some remnant debug printks that were recently removed. Reviewed-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Johan Hovold <[email protected]>
2019-04-30USB: serial: clean up throttle handlingJohan Hovold1-4/+1
Clean up the throttle implementation by dropping the redundant throttle_req flag which was a remnant from back when there was only a single read URB. Also convert the throttled flag to an atomic bit flag. Signed-off-by: Johan Hovold <[email protected]>