aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-10-20arm64: link with -z norelro regardless of CONFIG_RELOCATABLENick Desaulniers1-2/+2
With CONFIG_EXPERT=y, CONFIG_KASAN=y, CONFIG_RANDOMIZE_BASE=n, CONFIG_RELOCATABLE=n, we observe the following failure when trying to link the kernel image with LD=ld.lld: error: section: .exit.data is not contiguous with other relro sections ld.lld defaults to -z relro while ld.bfd defaults to -z norelro. This was previously fixed, but only for CONFIG_RELOCATABLE=y. Fixes: 3bbd3db86470 ("arm64: relocatable: fix inconsistencies in linker script and options") Signed-off-by: Nick Desaulniers <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-10-20arm64: Fix a broken copyright header in gen_vdso_offsets.shPalmer Dabbelt1-1/+1
I was going to copy this but I didn't want to chase around the build system stuff so I did it a different way. Signed-off-by: Palmer Dabbelt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-10-20PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()Hou Zhiqiang1-0/+11
NXP Layerscape (ls1028a, ls2088a), dra7xxx and imx6 platforms are either programmed or statically configured to forward the error triggered by a link-down state (eg no connected endpoint device) on the system bus for PCI configuration transactions; these errors are reported as an SError at system level, which is fatal. Enumerating a PCI tree when the PCIe link is down is not sensible either, so even if the link-up check is racy (link can go down after map_bus() is called) add a link-up check in map_bus() to prevent issuing configuration transactions when the link is down. SError report: SError Interrupt on CPU2, code 0xbf000002 -- SError CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc5-next-20200914-00001-gf965d3ec86fa #67 Hardware name: LS1046A RDB Board (DT) pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--) pc : pci_generic_config_read+0x3c/0xe0 lr : pci_generic_config_read+0x24/0xe0 sp : ffff80001003b7b0 x29: ffff80001003b7b0 x28: ffff80001003ba74 x27: ffff000971d96800 x26: ffff00096e77e0a8 x25: ffff80001003b874 x24: ffff80001003b924 x23: 0000000000000004 x22: 0000000000000000 x21: 0000000000000000 x20: ffff80001003b874 x19: 0000000000000004 x18: ffffffffffffffff x17: 00000000000000c0 x16: fffffe0025981840 x15: ffffb94c75b69948 x14: 62203a383634203a x13: 666e6f635f726568 x12: 202c31203d207265 x11: 626d756e3e2d7375 x10: 656877202c307830 x9 : 203d206e66766564 x8 : 0000000000000908 x7 : 0000000000000908 x6 : ffff800010900000 x5 : ffff00096e77e080 x4 : 0000000000000000 x3 : 0000000000000003 x2 : 84fa3440ff7e7000 x1 : 0000000000000000 x0 : ffff800010034000 Kernel panic - not syncing: Asynchronous SError Interrupt CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc5-next-20200914-00001-gf965d3ec86fa #67 Hardware name: LS1046A RDB Board (DT) Call trace: dump_backtrace+0x0/0x1c0 show_stack+0x18/0x28 dump_stack+0xd8/0x134 panic+0x180/0x398 add_taint+0x0/0xb0 arm64_serror_panic+0x78/0x88 do_serror+0x68/0x180 el1_error+0x84/0x100 pci_generic_config_read+0x3c/0xe0 dw_pcie_rd_other_conf+0x78/0x110 pci_bus_read_config_dword+0x88/0xe8 pci_bus_generic_read_dev_vendor_id+0x30/0x1b0 pci_bus_read_dev_vendor_id+0x4c/0x78 pci_scan_single_device+0x80/0x100 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hou Zhiqiang <[email protected]> [[email protected]: rewrote the commit log, remove Fixes tag] Signed-off-by: Lorenzo Pieralisi <[email protected]>
2020-10-20dma-mapping: move more functions to dma-map-ops.hChristoph Hellwig3-25/+24
Due to a mismerge a bunch of prototypes that should have moved to dma-map-ops.h are still in dma-mapping.h, fix that up. Signed-off-by: Christoph Hellwig <[email protected]>
2020-10-20xen/events: block rogue events for some timeJuergen Gross2-6/+24
In order to avoid high dom0 load due to rogue guests sending events at high frequency, block those events in case there was no action needed in dom0 to handle the events. This is done by adding a per-event counter, which set to zero in case an EOI without the XEN_EOI_FLAG_SPURIOUS is received from a backend driver, and incremented when this flag has been set. In case the counter is 2 or higher delay the EOI by 1 << (cnt - 2) jiffies, but not more than 1 second. In order not to waste memory shorten the per-event refcnt to two bytes (it should normally never exceed a value of 2). Add an overflow check to evtchn_get() to make sure the 2 bytes really won't overflow. This is part of XSA-332. Cc: [email protected] Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/events: defer eoi in case of excessive number of eventsJuergen Gross5-32/+216
In case rogue guests are sending events at high frequency it might happen that xen_evtchn_do_upcall() won't stop processing events in dom0. As this is done in irq handling a crash might be the result. In order to avoid that, delay further inter-domain events after some time in xen_evtchn_do_upcall() by forcing eoi processing into a worker on the same cpu, thus inhibiting new events coming in. The time after which eoi processing is to be delayed is configurable via a new module parameter "event_loop_timeout" which specifies the maximum event loop time in jiffies (default: 2, the value was chosen after some tests showing that a value of 2 was the lowest with an only slight drop of dom0 network throughput while multiple guests performed an event storm). How long eoi processing will be delayed can be specified via another parameter "event_eoi_delay" (again in jiffies, default 10, again the value was chosen after testing with different delay values). This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/events: use a common cpu hotplug hook for event channelsJuergen Gross3-21/+47
Today only fifo event channels have a cpu hotplug callback. In order to prepare for more percpu (de)init work move that callback into events_base.c and add percpu_init() and percpu_deinit() hooks to struct evtchn_ops. This is part of XSA-332. Cc: [email protected] Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/events: switch user event channels to lateeoi modelJuergen Gross1-4/+3
Instead of disabling the irq when an event is received and enabling it again when handled by the user process use the lateeoi model. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Tested-by: Stefano Stabellini <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/pciback: use lateeoi irq bindingJuergen Gross4-19/+56
In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving pcifront use the lateeoi irq binding for pciback and unmask the event channel only just before leaving the event handling function. Restructure the handling to support that scheme. Basically an event can come in for two reasons: either a normal request for a pciback action, which is handled in a worker, or in case the guest has finished an AER request which was requested by pciback. When an AER request is issued to the guest and a normal pciback action is currently active issue an EOI early in order to be able to receive another event when the AER request has been finished by the guest. Let the worker processing the normal requests run until no further request is pending, instead of starting a new worker ion that case. Issue the EOI only just before leaving the worker. This scheme allows to drop calling the generic function xen_pcibk_test_and_schedule_op() after processing of any request as the handling of both request types is now separated more cleanly. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/pvcallsback: use lateeoi irq bindingJuergen Gross1-30/+46
In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving pvcallsfront use the lateeoi irq binding for pvcallsback and unmask the event channel only after handling all write requests, which are the ones coming in via an irq. This requires modifying the logic a little bit to not require an event for each write request, but to keep the ioworker running until no further data is found on the ring page to be processed. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/scsiback: use lateeoi irq bindingJuergen Gross1-10/+13
In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving scsifront use the lateeoi irq binding for scsiback and unmask the event channel only just before leaving the event handling function. In case of a ring protocol error don't issue an EOI in order to avoid the possibility to use that for producing an event storm. This at once will result in no further call of scsiback_irq_fn(), so the ring_error struct member can be dropped and scsiback_do_cmd_fn() can signal the protocol error via a negative return value. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/netback: use lateeoi irq bindingJuergen Gross4-14/+86
In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving netfront use the lateeoi irq binding for netback and unmask the event channel only just before going to sleep waiting for new events. Make sure not to issue an EOI when none is pending by introducing an eoi_pending element to struct xenvif_queue. When no request has been consumed set the spurious flag when sending the EOI for an interrupt. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/blkback: use lateeoi irq bindingJuergen Gross2-8/+19
In order to reduce the chance for the system becoming unresponsive due to event storms triggered by a misbehaving blkfront use the lateeoi irq binding for blkback and unmask the event channel only after processing all pending requests. As the thread processing requests is used to do purging work in regular intervals an EOI may be sent only after having received an event. If there was no pending I/O request flag the EOI as spurious. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/events: add a new "late EOI" evtchn frameworkJuergen Gross2-17/+155
In order to avoid tight event channel related IRQ loops add a new framework of "late EOI" handling: the IRQ the event channel is bound to will be masked until the event has been handled and the related driver is capable to handle another event. The driver is responsible for unmasking the event channel via the new function xen_irq_lateeoi(). This is similar to binding an event channel to a threaded IRQ, but without having to structure the driver accordingly. In order to support a future special handling in case a rogue guest is sending lots of unsolicited events, add a flag to xen_irq_lateeoi() which can be set by the caller to indicate the event was a spurious one. This is part of XSA-332. Cc: [email protected] Reported-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/events: fix race in evtchn_fifo_unmask()Juergen Gross1-4/+9
Unmasking a fifo event channel can result in unmasking it twice, once directly in the kernel and once via a hypercall in case the event was pending. Fix that by doing the local unmask only if the event is not pending. This is part of XSA-332. Cc: [email protected] Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]>
2020-10-20xen/events: add a proper barrier to 2-level uevent unmaskingJuergen Gross1-0/+2
A follow-up patch will require certain write to happen before an event channel is unmasked. While the memory barrier is not strictly necessary for all the callers, the main one will need it. In order to avoid an extra memory barrier when using fifo event channels, mandate evtchn_unmask() to provide write ordering. The 2-level event handling unmask operation is missing an appropriate barrier, so add it. Fifo event channels are fine in this regard due to using sync_cmpxchg(). This is part of XSA-332. Cc: [email protected] Suggested-by: Julien Grall <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Julien Grall <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20xen/events: avoid removing an event channel while handling itJuergen Gross1-5/+36
Today it can happen that an event channel is being removed from the system while the event handling loop is active. This can lead to a race resulting in crashes or WARN() splats when trying to access the irq_info structure related to the event channel. Fix this problem by using a rwlock taken as reader in the event handling loop and as writer when deallocating the irq_info structure. As the observed problem was a NULL dereference in evtchn_from_irq() make this function more robust against races by testing the irq_info pointer to be not NULL before dereferencing it. And finally make all accesses to evtchn_to_irq[row][col] atomic ones in order to avoid seeing partial updates of an array element in irq handling. Note that irq handling can be entered only for event channels which have been valid before, so any not populated row isn't a problem in this regard, as rows are only ever added and never removed. This is XSA-331. Cc: [email protected] Reported-by: Marek Marczykowski-Górecki <[email protected]> Reported-by: Jinoh Kang <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Reviewed-by: Wei Liu <[email protected]>
2020-10-20ARM/sa1111: add a missing include of dma-map-ops.hChristoph Hellwig1-1/+1
Ensure the dmabounce functions are available for all Kconfig permutations. Fixes: 0a0f0d8be76d ("dma-mapping: split <linux/dma-mapping.h>") Reported-by: kernel test robot <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2020-10-20smb3.1.1: do not fail if no encryption required but server doesn't support itSteve French1-3/+13
There are cases where the server can return a cipher type of 0 and it not be an error. For example server supported no encryption types (e.g. server completely disabled encryption), or the server and client didn't support any encryption types in common (e.g. if a server only supported AES256_CCM). In those cases encryption would not be supported, but that can be ok if the client did not require encryption on mount and it should not return an error. In the case in which mount requested encryption ("seal" on mount) then checks later on during tree connection will return the proper rc, but if seal was not requested by client, since server is allowed to return 0 to indicate no supported cipher, we should not fail mount. Reported-by: Pavel Shilovsky <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2020-10-19nexthop: Fix performance regression in nexthop deletionIdo Schimmel1-1/+1
While insertion of 16k nexthops all using the same netdev ('dummy10') takes less than a second, deletion takes about 130 seconds: # time -p ip -b nexthop.batch real 0.29 user 0.01 sys 0.15 # time -p ip link set dev dummy10 down real 131.03 user 0.06 sys 0.52 This is because of repeated calls to synchronize_rcu() whenever a nexthop is removed from a nexthop group: # /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K ... b'finish_task_switch' b'schedule' b'schedule_timeout' b'wait_for_completion' b'__wait_rcu_gp' b'synchronize_rcu.part.0' b'synchronize_rcu' b'__remove_nexthop' b'remove_nexthop' b'nexthop_flush_dev' b'nh_netdev_event' b'raw_notifier_call_chain' b'call_netdevice_notifiers_info' b'__dev_notify_flags' b'dev_change_flags' b'do_setlink' b'__rtnl_newlink' b'rtnl_newlink' b'rtnetlink_rcv_msg' b'netlink_rcv_skb' b'rtnetlink_rcv' b'netlink_unicast' b'netlink_sendmsg' b'____sys_sendmsg' b'___sys_sendmsg' b'__sys_sendmsg' b'__x64_sys_sendmsg' b'do_syscall_64' b'entry_SYSCALL_64_after_hwframe' - ip (277) 126554955 Since nexthops are always deleted under RTNL, synchronize_net() can be used instead. It will call synchronize_rcu_expedited() which only blocks for several microseconds as opposed to multiple milliseconds like synchronize_rcu(). With this patch deletion of 16k nexthops takes less than a second: # time -p ip link set dev dummy10 down real 0.12 user 0.00 sys 0.04 Tested with fib_nexthops.sh which includes torture tests that prompted the initial change: # ./fib_nexthops.sh ... Tests passed: 134 Tests failed: 0 Fixes: 90f33bffa382 ("nexthops: don't modify published nexthop groups") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Reviewed-by: David Ahern <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-19Merge tag 'riscv-for-linus-5.10-mw0' of ↵Linus Torvalds31-241/+1212
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of cleanups and new features: - A handful of cleanups for our page fault handling - Improvements to how we fill out cacheinfo - Support for EFI-based systems" * tag 'riscv-for-linus-5.10-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (22 commits) RISC-V: Add page table dump support for uefi RISC-V: Add EFI runtime services RISC-V: Add EFI stub support. RISC-V: Add PE/COFF header for EFI stub RISC-V: Implement late mapping page table allocation functions RISC-V: Add early ioremap support RISC-V: Move DT mapping outof fixmap RISC-V: Fix duplicate included thread_info.h riscv/mm/fault: Set FAULT_FLAG_INSTRUCTION flag in do_page_fault() riscv/mm/fault: Fix inline placement in vmalloc_fault() declaration riscv: Add cache information in AUX vector riscv: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO riscv: Set more data to cacheinfo riscv/mm/fault: Move access error check to function riscv/mm/fault: Move FAULT_FLAG_WRITE handling in do_page_fault() riscv/mm/fault: Simplify mm_fault_error() riscv/mm/fault: Move fault error handling to mm_fault_error() riscv/mm/fault: Simplify fault error handling riscv/mm/fault: Move vmalloc fault handling to vmalloc_fault() riscv/mm/fault: Move bad area handling to bad_area() ...
2020-10-19Merge tag 'm68knommu-for-v5.10' of ↵Linus Torvalds6-559/+402
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu updates from Greg Ungerer: "A collection of fixes for 5.10: - switch to using asm-generic uaccess code - fix sparse warnings in signal code - fix compilation of ColdFire MMC support - support sysrq in ColdFire serial driver" * tag 'm68knommu-for-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: serial: mcf: add sysrq capability m68knommu: include SDHC support only when hardware has it m68knommu: fix sparse warnings in signal code m68knommu: switch to using asm-generic/uaccess.h
2020-10-19net: dsa: seville: the packet buffer is 2 megabits, not megabytesMaxim Kochetkov1-1/+1
The VSC9953 Seville switch has 2 megabits of buffer split into 4360 words of 60 bytes each. 2048 * 1024 is 2 megabytes instead of 2 megabits. 2 megabits is (2048 / 8) * 1024 = 256 * 1024. Signed-off-by: Maxim Kochetkov <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Fixes: a63ed92d217f ("net: dsa: seville: fix buffer size of the queue system") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-19selftests: rtnetlink: load fou module for kci_test_encap_fou() testPo-Hsu Lin2-0/+6
The kci_test_encap_fou() test from kci_test_encap() in rtnetlink.sh needs the fou module to work. Otherwise it will fail with: $ ip netns exec "$testns" ip fou add port 7777 ipproto 47 RTNETLINK answers: No such file or directory Error talking to the kernel Add the CONFIG_NET_FOU into the config file as well. Which needs at least to be set as a loadable module. Fixes: 6227efc1a20b ("selftests: rtnetlink.sh: add vxlan and fou test cases") Signed-off-by: Po-Hsu Lin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-19net: dsa: tag_ksz: KSZ8795 and KSZ9477 also use tail tagsChristian Eggers1-0/+2
The Marvell 88E6060 uses tag_trailer.c and the KSZ8795, KSZ9477 and KSZ9893 switches also use tail tags. Fixes: 7a6ffe764be3 ("net: dsa: point out the tail taggers") Signed-off-by: Christian Eggers <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-19net: korina: cast KSEG0 address to pointer in kfreeValentin Vidic1-2/+2
Fixes gcc warning: passing argument 1 of 'kfree' makes pointer from integer without a cast Fixes: 3af5f0f5c74e ("net: korina: fix kfree of rx/tx descriptor array") Reported-by: kernel test robot <[email protected]> Signed-off-by: Valentin Vidic <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-19r8169: fix operation under forced interrupt threadingHeiner Kallweit1-4/+4
For several network drivers it was reported that using __napi_schedule_irqoff() is unsafe with forced threading. One way to fix this is switching back to __napi_schedule, but then we lose the benefit of the irqoff version in general. As stated by Eric it doesn't make sense to make the minimal hard irq handlers in drivers using NAPI a thread. Therefore ensure that the hard irq handler is never thread-ified. Fixes: 9a899a35b0d6 ("r8169: switch to napi_schedule_irqoff") Link: https://lkml.org/lkml/2020/10/18/19 Signed-off-by: Heiner Kallweit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-19bpf: selftest: Ensure the return value of the bpf_per_cpu_ptr() must be checkedMartin KaFai Lau2-18/+70
This patch tests all pointers returned by bpf_per_cpu_ptr() must be tested for NULL first before it can be accessed. This patch adds a subtest "null_check", so it moves the ".data..percpu" existence check to the very beginning and before doing any subtest. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-10-19bpf: selftest: Ensure the return value of bpf_skc_to helpers must be checkedMartin KaFai Lau1-0/+25
This patch tests: int bpf_cls(struct __sk_buff *skb) { /* REG_6: sk * REG_7: tp * REG_8: req_sk */ sk = skb->sk; if (!sk) return 0; tp = bpf_skc_to_tcp_sock(sk); req_sk = bpf_skc_to_tcp_request_sock(sk); if (!req_sk) return 0; /* !tp has not been tested, so verifier should reject. */ return *(__u8 *)tp; } Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-10-19bpf: Enforce id generation for all may-be-null register typeMartin KaFai Lau1-6/+5
The commit af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper") introduces RET_PTR_TO_BTF_ID_OR_NULL and the commit eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()") introduces RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL. Note that for RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, the reg0->type could become PTR_TO_MEM_OR_NULL which is not covered by BPF_PROBE_MEM. The BPF_REG_0 will then hold a _OR_NULL pointer type. This _OR_NULL pointer type requires the bpf program to explicitly do a NULL check first. After NULL check, the verifier will mark all registers having the same reg->id as safe to use. However, the reg->id is not set for those new _OR_NULL return types. One of the ways that may be wrong is, checking NULL for one btf_id typed pointer will end up validating all other btf_id typed pointers because all of them have id == 0. The later tests will exercise this path. To fix it and also avoid similar issue in the future, this patch moves the id generation logic out of each individual RET type test in check_helper_call(). Instead, it does one reg_type_may_be_null() test and then do the id generation if needed. This patch also adds a WARN_ON_ONCE in mark_ptr_or_null_reg() to catch future breakage. The _OR_NULL pointer usage in the bpf_iter_reg.ctx_arg_info is fine because it just happens that the existing id generation after check_ctx_access() has covered it. It is also using the reg_type_may_be_null() to decide if id generation is needed or not. Fixes: af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper") Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()") Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-10-19Merge tag 'xfs-5.10-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds37-509/+1023
Pull more xfs updates from Darrick Wong: "The second large pile of new stuff for 5.10, with changes even more monumental than last week! We are formally announcing the deprecation of the V4 filesystem format in 2030. All users must upgrade to the V5 format, which contains design improvements that greatly strengthen metadata validation, supports reflink and online fsck, and is the intended vehicle for handling timestamps past 2038. We're also deprecating the old Irix behavioral tweaks in September 2025. Coming along for the ride are two design changes to the deferred metadata ops subsystem. One of the improvements is to retain correct logical ordering of tasks and subtasks, which is a more logical design for upper layers of XFS and will become necessary when we add atomic file range swaps and commits. The second improvement to deferred ops improves the scalability of the log by helping the log tail to move forward during long-running operations. This reduces log contention when there are a large number of threads trying to run transactions. In addition to that, this fixes numerous small bugs in log recovery; refactors logical intent log item recovery to remove the last remaining place in XFS where we could have nested transactions; fixes a couple of ways that intent log item recovery could fail in ways that wouldn't have happened in the regular commit paths; fixes a deadlock vector in the GETFSMAP implementation (which improves its performance by 20%); and fixes serious bugs in the realtime growfs, fallocate, and bitmap handling code. Summary: - Deprecate the V4 filesystem format, some disused mount options, and some legacy sysctl knobs now that we can support dates into the 25th century. Note that removal of V4 support will not happen until the early 2030s. - Fix some probles with inode realtime flag propagation. - Fix some buffer handling issues when growing a rt filesystem. - Fix a problem where a BMAP_REMAP unmap call would free rt extents even though the purpose of BMAP_REMAP is to avoid freeing the blocks. - Strengthen the dabtree online scrubber to check hash values on child dabtree blocks. - Actually log new intent items created as part of recovering log intent items. - Fix a bug where quotas weren't attached to an inode undergoing bmap intent item recovery. - Fix a buffer overrun problem with specially crafted log buffer headers. - Various cleanups to type usage and slightly inaccurate comments. - More cleanups to the xattr, log, and quota code. - Don't run the (slower) shared-rmap operations on attr fork mappings. - Fix a bug where we failed to check the LSN of finobt blocks during replay and could therefore overwrite newer data with older data. - Clean up the ugly nested transaction mess that log recovery uses to stage intent item recovery in the correct order by creating a proper data structure to capture recovered chains. - Use the capture structure to resume intent item chains with the same log space and block reservations as when they were captured. - Fix a UAF bug in bmap intent item recovery where we failed to maintain our reference to the incore inode if the bmap operation needed to relog itself to continue. - Rearrange the defer ops mechanism to finish newly created subtasks of a parent task before moving on to the next parent task. - Automatically relog intent items in deferred ops chains if doing so would help us avoid pinning the log tail. This will help fix some log scaling problems now and will facilitate atomic file updates later. - Fix a deadlock in the GETFSMAP implementation by using an internal memory buffer to reduce indirect calls and copies to userspace, thereby improving its performance by ~20%. - Fix various problems when calling growfs on a realtime volume would not fully update the filesystem metadata. - Fix broken Kconfig asking about deprecated XFS when XFS is disabled" * tag 'xfs-5.10-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (48 commits) xfs: fix Kconfig asking about XFS_SUPPORT_V4 when XFS_FS=n xfs: fix high key handling in the rt allocator's query_range function xfs: annotate grabbing the realtime bitmap/summary locks in growfs xfs: make xfs_growfs_rt update secondary superblocks xfs: fix realtime bitmap/summary file truncation when growing rt volume xfs: fix the indent in xfs_trans_mod_dquot xfs: do the ASSERT for the arguments O_{u,g,p}dqpp xfs: fix deadlock and streamline xfs_getfsmap performance xfs: limit entries returned when counting fsmap records xfs: only relog deferred intent items if free space in the log gets low xfs: expose the log push threshold xfs: periodically relog deferred intent items xfs: change the order in which child and parent defer ops are finished xfs: fix an incore inode UAF in xfs_bui_recover xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering xfs: clean up bmap intent item recovery checking xfs: xfs_defer_capture should absorb remaining transaction reservation xfs: xfs_defer_capture should absorb remaining block reservations xfs: proper replay of deferred ops queued during log recovery xfs: remove XFS_LI_RECOVERED ...
2020-10-19Merge tag 'fuse-update-5.10' of ↵Linus Torvalds20-496/+2689
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Support directly accessing host page cache from virtiofs. This can improve I/O performance for various workloads, as well as reducing the memory requirement by eliminating double caching. Thanks to Vivek Goyal for doing most of the work on this. - Allow automatic submounting inside virtiofs. This allows unique st_dev/ st_ino values to be assigned inside the guest to files residing on different filesystems on the host. Thanks to Max Reitz for the patches. - Fix an old use after free bug found by Pradeep P V K. * tag 'fuse-update-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits) virtiofs: calculate number of scatter-gather elements accurately fuse: connection remove fix fuse: implement crossmounts fuse: Allow fuse_fill_super_common() for submounts fuse: split fuse_mount off of fuse_conn fuse: drop fuse_conn parameter where possible fuse: store fuse_conn in fuse_req fuse: add submount support to <uapi/linux/fuse.h> fuse: fix page dereference after free virtiofs: add logic to free up a memory range virtiofs: maintain a list of busy elements virtiofs: serialize truncate/punch_hole and dax fault path virtiofs: define dax address space operations virtiofs: add DAX mmap support virtiofs: implement dax read/write operations virtiofs: introduce setupmapping/removemapping commands virtiofs: implement FUSE_INIT map_alignment field virtiofs: keep a list of free dax memory ranges virtiofs: add a mount option to enable dax virtiofs: set up virtio_fs dax_device ...
2020-10-20selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load ↵Michael Neuling1-2/+6
workaround alignment_handler currently only tests the unaligned cases but it can also be useful for testing the workaround for the P9N DD2.1 vector CI load issue fixed by p9_hmi_special_emu(). This workaround was introduced in 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue"). This changes the loop to start from offset 0 rather than 1 so that we test the kernel emulation in p9_hmi_special_emu(). Signed-off-by: Michael Neuling <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-20powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulationMichael Neuling1-1/+1
__get_user_atomic_128_aligned() stores to kaddr using stvx which is a VMX store instruction, hence kaddr must be 16 byte aligned otherwise the store won't occur as expected. Unfortunately when we call __get_user_atomic_128_aligned() in p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't guaranteed to be 16B aligned. This means that the write to vbuf in __get_user_atomic_128_aligned() has the bottom bits of the address truncated. This results in other local variables being overwritten. Also vbuf will not contain the correct data which results in the userspace emulation being wrong and hence undetected user data corruption. In the past we've been mostly lucky as vbuf has ended up aligned but this is fragile and isn't always true. CONFIG_STACKPROTECTOR in particular can change the stack arrangement enough that our luck runs out. This issue only occurs on POWER9 Nimbus <= DD2.1 bare metal. The fix is to align vbuf to a 16 byte boundary. Fixes: 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue") Cc: [email protected] # v4.15+ Signed-off-by: Michael Neuling <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19Merge tag 'zonefs-5.10-rc1' of ↵Linus Torvalds3-13/+233
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs updates from Damien Le Moal: "Add an 'explicit-open' mount option to automatically issue a REQ_OP_ZONE_OPEN command to the device whenever a sequential zone file is open for writing for the first time. This avoids 'insufficient zone resources' errors for write operations on some drives with limited zone resources or on ZNS drives with a limited number of active zones. From Johannes" * tag 'zonefs-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: document the explicit-open mount option zonefs: open/close zone on file open/close zonefs: provide no-lock zonefs_io_error variant zonefs: introduce helper for zone management
2020-10-19rtc: r9701: set rangeAlexandre Belloni1-5/+3
Set range and remove the set_time check. This is a classic BCD RTC. Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19rtc: r9701: convert to devm_rtc_allocate_deviceAlexandre Belloni1-3/+3
This allows further improvement of the driver. Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19rtc: r9701: stop setting RWKCNTAlexandre Belloni1-1/+0
tm_wday is never checked for validity and it is not read back in r9701_get_datetime. Avoid setting it to stop tripping static checkers: drivers/rtc/rtc-r9701.c:109 r9701_set_datetime() error: undefined (user controlled) shift '1 << dt->tm_wday' Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19rtc: r9701: remove useless memsetAlexandre Belloni1-2/+0
The RTC core already sets to zero the struct rtc_tie it passes to the driver, avoid doing it a second time. Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19rtc: r9701: stop setting a default timeAlexandre Belloni1-22/+0
It doesn't make sense to set the RTC to a default value at probe time. Let the core handle invalid date and time. Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19rtc: r9701: remove leftover commentAlexandre Belloni1-4/+0
Commit 22652ba72453 ("rtc: stop validating rtc_time in .read_time") removed the code but not the associated comment. Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19rtc: rv3032: Add a driver for Microcrystal RV-3032Alexandre Belloni3-0/+936
New driver for the Microcrystal RV-3032, including support for: - Date/time - Alarms - Low voltage detection - Trickle charge - Trimming - Clkout - RAM - EEPROM - Temperature sensor Signed-off-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19dt-bindings: rtc: rv3032: add RV-3032 bindingsAlexandre Belloni1-0/+64
Document the Microcrystal RV-3032 device tree bindings Signed-off-by: Alexandre Belloni <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19dt-bindings: rtc: add trickle-voltage-millivoltAlexandre Belloni1-0/+6
Some RTCs have a trickle charge that is able to output different voltages depending on the type of the connected auxiliary power (battery, supercap, ...). Add a property allowing to specify the necessary voltage. Signed-off-by: Alexandre Belloni <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19cifs: Return the error from crypt_message when enc/dec key not found.Shyam Prasad N1-1/+1
In crypt_message, when smb2_get_enc_key returns error, we need to return the error back to the caller. If not, we end up processing the message further, causing a kernel oops due to unwarranted access of memory. Call Trace: smb3_receive_transform+0x120/0x870 [cifs] cifs_demultiplex_thread+0xb53/0xc20 [cifs] ? cifs_handle_standard+0x190/0x190 [cifs] kthread+0x116/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x1f/0x30 Signed-off-by: Shyam Prasad N <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]> CC: Stable <[email protected]> Signed-off-by: Steve French <[email protected]>
2020-10-19smb3.1.1: set gcm256 when requestedSteve French4-6/+17
update smb encryption code to set 32 byte key length and to set gcm256 when requested on mount. Signed-off-by: Steve French <[email protected]>
2020-10-19smb3.1.1: rename nonces used for GCM and CCM encryptionSteve French2-6/+6
Now that 256 bit encryption can be negotiated, update names of the nonces to match the updated official protocol documentation (e.g. AES_GCM_NONCE instead of AES_128GCM_NONCE) since they apply to both 128 bit and 256 bit encryption. Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2020-10-19smb3.1.1: print warning if server does not support requested encryption typeSteve French1-2/+13
If server does not support AES-256-GCM and it was required on mount, print warning message. Also log and return a different error message (EOPNOTSUPP) when encryption mechanism is not supported vs the case when an unknown unrequested encryption mechanism could be returned (EINVAL). Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2020-10-19io_uring: fix racy REQ_F_LINK_TIMEOUT clearingPavel Begunkov1-4/+13
io_link_timeout_fn() removes REQ_F_LINK_TIMEOUT from the link head's flags, it's not atomic and may race with what the head is doing. If io_link_timeout_fn() doesn't clear the flag, as forced by this patch, then it may happen that for "req -> link_timeout1 -> link_timeout2", __io_kill_linked_timeout() would find link_timeout2 and try to cancel it, so miscounting references. Teach it to ignore such double timeouts by marking the active one with a new flag in io_prep_linked_timeout(). Signed-off-by: Pavel Begunkov <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-10-19io_uring: do poll's hash_node init in common codePavel Begunkov1-2/+1
Move INIT_HLIST_NODE(&req->hash_node) into __io_arm_poll_handler(), so that it doesn't duplicated and common poll code would be responsible for it. Signed-off-by: Pavel Begunkov <[email protected]> Signed-off-by: Jens Axboe <[email protected]>