aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2020-05-27RDMA/cma: Connect ECE to rdma_acceptLeon Romanovsky2-0/+4
The rdma_accept() is called by both passive and active sides of CMID connection to mark readiness to start data transfer. For passive side, this is called explicitly, for active side, it is called implicitly while receiving REP message. Provide ECE data to rdma_accept function needed for passive side to send that REP message. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27RDMA/cm: Send and receive ECE parameter over the wireLeon Romanovsky1-1/+8
ECE parameters are exchanged through REQ->REP/SIDR_REP messages, this patch adds the data to provide to other side of CMID communication channel. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27RDMA/ucma: Deliver ECE parameters through UCMA eventsLeon Romanovsky2-0/+2
Passive side of CMID connection receives ECE request through REQ message and needs to respond with relevant REP message which will be forwarded to active side. The UCMA events interface is responsible for such communication with the user space (librdmacm). Extend it to provide ECE wire data. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27RDMA/ucma: Extend ucma_connect to receive ECE parametersLeon Romanovsky2-0/+9
Active side of CMID initiates connection through librdmacm's rdma_connect() and kernel's ucma_connect(). Extend UCMA interface to handle those new parameters. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27RDMA/cm: Add Enhanced Connection Establishment (ECE) bitsLeon Romanovsky1-0/+6
Extend REQ (request for communications), REP (reply to request for communication), rejected reason and SIDR_REP (service ID resolution response) structures with hardware vendor ID bits according to IBTA v1.4. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27Merge branch 'mellanox/mlx5-next' into rdma.git for/nextJason Gunthorpe2-23/+17
From the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Required for dependencies in following patches * branch 'mellanox/mlx5-next': net/mlx5: Add ability to read and write ECE options net/mlx5: Add support for RDMA TX FT headers modifying net/mlx5: Move iseg access helper routines close to mlx5_core driver net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27RDMA/core: Use sizeof_field() helperGustavo A. R. Silva1-6/+6
Make use of the sizeof_field() helper instead of an open-coded version. Link: https://lore.kernel.org/r/20200527144152.GA22605@embeddedor Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-27net/mlx5: Add ability to read and write ECE optionsLeon Romanovsky1-10/+16
The end result of RDMA-CM ECE handshake is ECE options, which is needed to be used while configuring data QPs. Such options can come in any QP state, so add in/out fields to set and query ECE options. OUT fields: * create_qp() - default ECE options for that type of QP. * modify_qp() - enabled ECE options after QP state transition. IN fields: * create_qp() - create QP with this ECE option. * modify_qp() - requested options. For unconnected QPs, the FW will return an error if ECE is already configured with any options that not equal to previously set. Reviewed-by: Mark Zhang <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2020-05-21IB/uverbs: Introduce create/destroy QP commands over ioctlYishai Hadas2-0/+37
Introduce create/destroy QP commands over the ioctl interface to let it be extended to get an asynchronous event FD. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/uverbs: Introduce create/destroy WQ commands over ioctlYishai Hadas1-0/+25
Introduce create/destroy WQ commands over the ioctl interface to let it be extended to get an asynchronous event FD. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/uverbs: Introduce create/destroy SRQ commands over ioctlYishai Hadas1-0/+27
Introduce create/destroy SRQ commands over the ioctl interface to let it be extended to get an asynchronous event FD. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/uverbs: Move QP, SRQ, WQ type and flags to UAPIYishai Hadas2-19/+58
These constants are going to be used in the ioctl interface in coming patches so they are part of the UAPI, place them in the correct header for clarity. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/uverbs: Extend CQ to get its own asynchronous event FDYishai Hadas1-0/+1
Extend CQ to get its own asynchronous event FD. The event FD is an optional attribute, in case wasn't given the ufile event FD will be used. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21RDMA/core: Allow the ioctl layer to abort a fully created uobjectJason Gunthorpe4-2/+11
While creating a uobject every create reaches a point where the uobject is fully initialized. For ioctls that go on to copy_to_user this means they need to open code the destruction of a fully created uobject - ie the RDMA_REMOVE_DESTROY sort of flow. Open coding this creates bugs, eg the CQ does not properly flush the events list when it does its error unwind. Provide a uverbs_finalize_uobj_create() function which indicates that the uobject is fully initialized and that abort should call to destroy_hw to destroy the uobj->object and related. Methods can call this function if they go on to have error cases after setting uobj->object. Once done those error cases can simply do return, without an error unwind. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21Merge tag 'v5.7-rc6' into rdma.git for-nextJason Gunthorpe112-286/+522
Linux 5.7-rc6 Conflict in drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c resolved by deleting dr_cq_event, matching how netdev resolved it. Required for dependencies in the following patches. Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/ipoib: Increase ipoib Datagram mode MTU's upper limitKaike Wan2-9/+78
Currently the ipoib UD mtu is restricted to 4K bytes. Remove this limitation so that the IPOIB module can potentially use an MTU (in UD mode) that is bounded by the MTU of the underlying device. A field is added to the ib_port_attr structure to indicate the maximum physical MTU the underlying device supports. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Dennis Dalessandro <[email protected]> Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Sadanand Warrier <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/{rdmavt, hfi1}: Implement creation of accelerated UD QPsGary Leshner2-4/+4
Adds capability to create a qpn to be recognized as an accelerated UD QP for ipoib. This is accomplished by reserving 0x81 in byte[0] of the qpn as the prefix for these qp types and reserving qpns between 0x810000 and 0x81ffff. The hfi1 capability mask already contained a flag for the VNIC netdev. This has been renamed and extended to include both VNIC and ipoib. The rvt code to allocate qps now recognizes this flag and sets 0x81 into byte[0] of the qpn. The code to allocate qpns is modified to reset the qpn numbering when it is detected that a value is located in byte[0] for a UD QP and it is a qpn being requested for net dev use. If it is a regular UD QP then it is allowable to have bits set in byte[0] of the qpn and provide the previously normal behavior. The code to free the qpn now checks for the AIP prefix value of 0x81 and removes it from the qpn before being freed so that the lower 16 bit number can be reused. This patch requires minor changes in the IB core and ipoib to facilitate the creation of accelerated UP QPs. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Dennis Dalessandro <[email protected]> Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Gary Leshner <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/hfi1: Remove module parameter for KDETH qpnsGary Leshner1-1/+28
The module parameter for KDETH qpns is being removed in favor of always using the default value of 0x80 as the qpn prefix. Defines have been added for various KDETH values including the prefix of 0x80. The reserved range now starts at the base value for KDETH qpns (0x80) and extends up to and including the last qpn for other reserved QP prefixed types. Adjust other QP prefixed define names to match KDETH defined names. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Dennis Dalessandro <[email protected]> Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Gary Leshner <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/hfi1: Add functions to transmit datagram ipoib packetsGary Leshner1-0/+1
This patch implements the mechanism to accelerate the transmit side of a multiple transmit queue RDMA netdev by submitting the packets to the SDMA engine directly instead of sending through the verbs layer. This patch also changes the UD/SEND_ONLY op to output the entropy value in byte 0 of deth[1]. UD/SEND_ONLY_WITH_IMMEDIATE uses the previous behavior with no entropy value being output. The code in the ipoib rdma netdev which submits tx requests upon successful submission will call trace_sdma_output_ibhdr to output the ibhdr to the trace buffer. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Marciniszyn <[email protected]> Reviewed-by: Dennis Dalessandro <[email protected]> Signed-off-by: Gary Leshner <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21IB/hfi1: Add accelerated IP capability bitKaike Wan1-1/+2
The accelerated IP capability bit is added to allow users to control which feature is enabled and disabled. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Dennis Dalessandro <[email protected]> Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Kaike Wan <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-18net/mlx5: Move iseg access helper routines close to mlx5_core driverParav Pandit1-10/+0
Only mlx5_core driver handles fw initialization check and command interface revision check. Hence move them inside the mlx5_core driver where it is used. This avoid exposing these helpers to all mlx5 drivers. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-05-18net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bitsRaed Salem1-3/+1
Remove the "metadata_reg_b" field and all uses of this field in code to match the device specification. As this field is not in use in SW steering it is safe to remove it. Signed-off-by: Raed Salem <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-05-17RDMA/core: Consolidate ib_create_srq flowsJason Gunthorpe1-15/+12
The uverbs layer largely duplicate the code in ib_create_srq(), with the slight difference that it passes in a udata. Move all the code together into ib_create_srq_user() and provide an inline for kernel users, similar to other create calls. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-17Merge tag 'usb-5.7-rc6' of ↵Linus Torvalds1-13/+95
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of USB fixes for 5.7-rc6 The "largest" in here is a bunch of raw-gadget fixes and api changes as the driver just showed up in -rc1 and work has been done to fix up some uapi issues found with the original submission, before it shows up in a -final release. Other than that, a bunch of other small USB gadget fixes, xhci fixes, some quirks, andother tiny fixes for reported issues. All of these have been in linux-next with no reported issues" * tag 'usb-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits) USB: gadget: fix illegal array access in binding with UDC usb: core: hub: limit HUB_QUIRK_DISABLE_AUTOSUSPEND to USB5534B USB: usbfs: fix mmap dma mismatch usb: host: xhci-plat: keep runtime active when removing host usb: xhci: Fix NULL pointer dereference when enqueuing trbs from urb sg list usb: cdns3: gadget: make a bunch of functions static usb: mtu3: constify struct debugfs_reg32 usb: gadget: udc: atmel: Make some symbols static usb: raw-gadget: fix null-ptr-deref when reenabling endpoints usb: raw-gadget: documentation updates usb: raw-gadget: support stalling/halting/wedging endpoints usb: raw-gadget: fix gadget endpoint selection usb: raw-gadget: improve uapi headers comments usb: typec: mux: intel: Fix DP_HPD_LVL bit field usb: raw-gadget: fix return value of ep read ioctls usb: dwc3: select USB_ROLE_SWITCH usb: gadget: legacy: fix error return code in gncm_bind() usb: gadget: legacy: fix error return code in cdc_bind() usb: gadget: legacy: fix redundant initialization warnings usb: gadget: tegra-xudc: Fix idle suspend/resume ...
2020-05-17Merge tag 'x86_urgent_for_v5.7-rc7' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: "A single fix for early boot crashes of kernels built with gcc10 and stack protector enabled" * tag 'x86_urgent_for_v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Fix early boot crash on gcc-10, third try
2020-05-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+3
Pull kvm fixes from Paolo Bonzini: "A new testcase for guest debugging (gdbstub) that exposed a bunch of bugs, mostly for AMD processors. And a few other x86 fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c KVM: SVM: Disable AVIC before setting V_IRQ KVM: Introduce kvm_make_all_cpus_request_except() KVM: VMX: pass correct DR6 for GD userspace exit KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6 KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6 KVM: nSVM: trap #DB and #BP to userspace if guest debugging is on KVM: selftests: Add KVM_SET_GUEST_DEBUG test KVM: X86: Fix single-step with KVM_SET_GUEST_DEBUG KVM: X86: Set RTM for DB_VECTOR too for KVM_EXIT_DEBUG KVM: x86: fix DR6 delivery for various cases of #DB injection KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
2020-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds8-9/+22
Pull networking fixes from David Miller: 1) Fix sk_psock reference count leak on receive, from Xiyu Yang. 2) CONFIG_HNS should be invisible, from Geert Uytterhoeven. 3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this, from Maciej Żenczykowski. 4) ipv4 route redirect backoff wasn't actually enforced, from Paolo Abeni. 5) Fix netprio cgroup v2 leak, from Zefan Li. 6) Fix infinite loop on rmmod in conntrack, from Florian Westphal. 7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet. 8) Various bpf probe handling fixes, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits) selftests: mptcp: pm: rm the right tmp file dpaa2-eth: properly handle buffer size restrictions bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range bpf: Restrict bpf_probe_read{, str}() only to archs where they work MAINTAINERS: Mark networking drivers as Maintained. ipmr: Add lockdep expression to ipmr_for_each_table macro ipmr: Fix RCU list debugging warning drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810 tcp: fix error recovery in tcp_zerocopy_receive() MAINTAINERS: Add Jakub to networking drivers. MAINTAINERS: another add of Karsten Graul for S390 networking drivers: ipa: fix typos for ipa_smp2p structure doc pppoe: only process PADT targeted at local interfaces selftests/bpf: Enforce returning 0 for fentry/fexit programs bpf: Enforce returning 0 for fentry/fexit progs net: stmmac: fix num_por initialization security: Fix the default value of secid_to_secctx hook libbpf: Fix register naming in PT_REGS s390 macros ...
2020-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-1/+1
Alexei Starovoitov says: ==================== pull-request: bpf 2020-05-15 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 2 day(s) which contain a total of 14 files changed, 137 insertions(+), 43 deletions(-). The main changes are: 1) Fix secid_to_secctx LSM hook default value, from Anders. 2) Fix bug in mmap of bpf array, from Andrii. 3) Restrict bpf_probe_read to archs where they work, from Daniel. 4) Enforce returning 0 for fentry/fexit progs, from Yonghong. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-15Merge tag 'sound-5.7-rc6' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Things look good and calming down; the only change to ALSA core is the fix for racy rawmidi buffer accesses spotted by syzkaller, and the rest are all small device-specific quirks for HD-audio and USB-audio devices" * tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - Limit int mic boost for Thinkpad T530 ALSA: hda/realtek - Add COEF workaround for ASUS ZenBook UX431DA ALSA: hda/realtek: Enable headset mic of ASUS UX581LV with ALC295 ALSA: hda/realtek - Enable headset mic of ASUS UX550GE with ALC295 ALSA: hda/realtek - Enable headset mic of ASUS GL503VM with ALC295 ALSA: hda/realtek: Add quirk for Samsung Notebook ALSA: rawmidi: Fix racy buffer resize under concurrent accesses ALSA: usb-audio: add mapping for ASRock TRX40 Creator ALSA: hda/realtek - Fix S3 pop noise on Dell Wyse Revert "ALSA: hda/realtek: Fix pop noise on ALC225" ALSA: firewire-lib: fix 'function sizeof not defined' error of tracepoints format ALSA: usb-audio: Add control message quirk delay for Kingston HyperX headset
2020-05-15Merge tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-0/+3
Pull drm fixes from Dave Airlie: "As mentioned last week an i915 PR came in late, but I left it, so the i915 bits of this cover 2 weeks, which is why it's likely a bit larger than usual. Otherwise it's mostly amdgpu fixes, one tegra fix, one meson fix. i915: - Handle idling during i915_gem_evict_something busy loops (Chris) - Mark current submissions with a weak-dependency (Chris) - Propagate error from completed fences (Chris) - Fixes on execlist to avoid GPU hang situation (Chris) - Fixes couple deadlocks (Chris) - Timeslice preemption fixes (Chris) - Fix Display Port interrupt handling on Tiger Lake (Imre) - Reduce debug noise around Frame Buffer Compression (Peter) - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan) - Avoid dereferencing a dead context (Chris) tegra: - tegra120/4 smmu fixes amdgpu: - Clockgating fixes - Fix fbdev with scatter/gather display - S4 fix for navi - Soft recovery for gfx10 - Freesync fixes - Atomic check cursor fix - Add a gfxoff quirk - MST fix amdkfd: - Fix GEM reference counting meson: - error code propogation fix" * tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm: (29 commits) drm/i915: Handle idling during i915_gem_evict_something busy loops drm/meson: pm resume add return errno branch drm/amd/amdgpu: Update update_config() logic drm/amd/amdgpu: add raven1 part to the gfxoff quirk list drm/i915: Mark concurrent submissions with a weak-dependency drm/i915: Propagate error from completed fences drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest drm/i915/gvt: Init DPLL/DDI vreg for virtual display instead of inheritance. drm/amd/display: add basic atomic check for cursor plane drm/amd/display: Fix vblank and pageflip event handling for FreeSync drm/amdgpu: implement soft_recovery for gfx10 drm/amdgpu: enable hibernate support on Navi1X drm/amdgpu: Use GEM obj reference for KFD BOs drm/amdgpu: force fbdev into vram drm/amd/powerplay: perform PG ungate prior to CG ungate drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate drm/i915/execlists: Track inflight CCID drm/i915/execlists: Avoid reusing the same logical CCID drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane ...
2020-05-15x86: Fix early boot crash on gcc-10, third tryBorislav Petkov1-0/+6
... or the odyssey of trying to disable the stack protector for the function which generates the stack canary value. The whole story started with Sergei reporting a boot crash with a kernel built with gcc-10: Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013 Call Trace: dump_stack panic ? start_secondary __stack_chk_fail start_secondary secondary_startup_64 -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary This happens because gcc-10 tail-call optimizes the last function call in start_secondary() - cpu_startup_entry() - and thus emits a stack canary check which fails because the canary value changes after the boot_init_stack_canary() call. To fix that, the initial attempt was to mark the one function which generates the stack canary with: __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused) however, using the optimize attribute doesn't work cumulatively as the attribute does not add to but rather replaces previously supplied optimization options - roughly all -fxxx options. The key one among them being -fno-omit-frame-pointer and thus leading to not present frame pointer - frame pointer which the kernel needs. The next attempt to prevent compilers from tail-call optimizing the last function call cpu_startup_entry(), shy of carving out start_secondary() into a separate compilation unit and building it with -fno-stack-protector, was to add an empty asm(""). This current solution was short and sweet, and reportedly, is supported by both compilers but we didn't get very far this time: future (LTO?) optimization passes could potentially eliminate this, which leads us to the third attempt: having an actual memory barrier there which the compiler cannot ignore or move around etc. That should hold for a long time, but hey we said that about the other two solutions too so... Reported-by: Sergei Trofimovich <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Kalle Valo <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-05-14net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810Kevin Lo1-0/+1
Set the correct bit when checking for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54810 PHY. Fixes: 0ececcfc9267 ("net: phy: broadcom: Allow BCM54810 to use bcm54xx_adjust_rxrefclk()") Signed-off-by: Kevin Lo <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2-1/+2
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix gcc-10 compilation warning in nf_conntrack, from Arnd Bergmann. 2) Add NF_FLOW_HW_PENDING to avoid races between stats and deletion commands, from Paul Blakey. 3) Remove WQ_MEM_RECLAIM from the offload workqueue, from Roi Dayan. 4) Infinite loop when removing nf_conntrack module, from Florian Westphal. 5) Set NF_FLOW_TEARDOWN bit on expiration to avoid races when refreshing the timeout from the software path. 6) Missing nft_set_elem_expired() check in the rbtree, from Phil Sutter. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-14security: Fix the default value of secid_to_secctx hookAnders Roxell1-1/+1
security_secid_to_secctx is called by the bpf_lsm hook and a successful return value (i.e 0) implies that the parameter will be consumed by the LSM framework. The current behaviour return success when the pointer isn't initialized when CONFIG_BPF_LSM is enabled, with the default return from kernel/bpf/bpf_lsm.c. This is the internal error: [ 1229.341488][ T2659] usercopy: Kernel memory exposure attempt detected from null address (offset 0, size 280)! [ 1229.374977][ T2659] ------------[ cut here ]------------ [ 1229.376813][ T2659] kernel BUG at mm/usercopy.c:99! [ 1229.378398][ T2659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 1229.380348][ T2659] Modules linked in: [ 1229.381654][ T2659] CPU: 0 PID: 2659 Comm: systemd-journal Tainted: G B W 5.7.0-rc5-next-20200511-00019-g864e0c6319b8-dirty #13 [ 1229.385429][ T2659] Hardware name: linux,dummy-virt (DT) [ 1229.387143][ T2659] pstate: 80400005 (Nzcv daif +PAN -UAO BTYPE=--) [ 1229.389165][ T2659] pc : usercopy_abort+0xc8/0xcc [ 1229.390705][ T2659] lr : usercopy_abort+0xc8/0xcc [ 1229.392225][ T2659] sp : ffff000064247450 [ 1229.393533][ T2659] x29: ffff000064247460 x28: 0000000000000000 [ 1229.395449][ T2659] x27: 0000000000000118 x26: 0000000000000000 [ 1229.397384][ T2659] x25: ffffa000127049e0 x24: ffffa000127049e0 [ 1229.399306][ T2659] x23: ffffa000127048e0 x22: ffffa000127048a0 [ 1229.401241][ T2659] x21: ffffa00012704b80 x20: ffffa000127049e0 [ 1229.403163][ T2659] x19: ffffa00012704820 x18: 0000000000000000 [ 1229.405094][ T2659] x17: 0000000000000000 x16: 0000000000000000 [ 1229.407008][ T2659] x15: 0000000000000000 x14: 003d090000000000 [ 1229.408942][ T2659] x13: ffff80000d5b25b2 x12: 1fffe0000d5b25b1 [ 1229.410859][ T2659] x11: 1fffe0000d5b25b1 x10: ffff80000d5b25b1 [ 1229.412791][ T2659] x9 : ffffa0001034bee0 x8 : ffff00006ad92d8f [ 1229.414707][ T2659] x7 : 0000000000000000 x6 : ffffa00015eacb20 [ 1229.416642][ T2659] x5 : ffff0000693c8040 x4 : 0000000000000000 [ 1229.418558][ T2659] x3 : ffffa0001034befc x2 : d57a7483a01c6300 [ 1229.420610][ T2659] x1 : 0000000000000000 x0 : 0000000000000059 [ 1229.422526][ T2659] Call trace: [ 1229.423631][ T2659] usercopy_abort+0xc8/0xcc [ 1229.425091][ T2659] __check_object_size+0xdc/0x7d4 [ 1229.426729][ T2659] put_cmsg+0xa30/0xa90 [ 1229.428132][ T2659] unix_dgram_recvmsg+0x80c/0x930 [ 1229.429731][ T2659] sock_recvmsg+0x9c/0xc0 [ 1229.431123][ T2659] ____sys_recvmsg+0x1cc/0x5f8 [ 1229.432663][ T2659] ___sys_recvmsg+0x100/0x160 [ 1229.434151][ T2659] __sys_recvmsg+0x110/0x1a8 [ 1229.435623][ T2659] __arm64_sys_recvmsg+0x58/0x70 [ 1229.437218][ T2659] el0_svc_common.constprop.1+0x29c/0x340 [ 1229.438994][ T2659] do_el0_svc+0xe8/0x108 [ 1229.440587][ T2659] el0_svc+0x74/0x88 [ 1229.441917][ T2659] el0_sync_handler+0xe4/0x8b4 [ 1229.443464][ T2659] el0_sync+0x17c/0x180 [ 1229.444920][ T2659] Code: aa1703e2 aa1603e1 910a8260 97ecc860 (d4210000) [ 1229.447070][ T2659] ---[ end trace 400497d91baeaf51 ]--- [ 1229.448791][ T2659] Kernel panic - not syncing: Fatal exception [ 1229.450692][ T2659] Kernel Offset: disabled [ 1229.452061][ T2659] CPU features: 0x240002,20002004 [ 1229.453647][ T2659] Memory Limit: none [ 1229.455015][ T2659] ---[ end Kernel panic - not syncing: Fatal exception ]--- Rework the so the default return value is -EOPNOTSUPP. There are likely other callbacks such as security_inode_getsecctx() that may have the same problem, and that someone that understand the code better needs to audit them. Thank you Arnd for helping me figure out what went wrong. Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Signed-off-by: Anders Roxell <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: James Morris <[email protected]> Cc: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-05-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-0/+2
Merge misc fixes from Andrew Morton: "7 fixes" * emailed patches from Andrew Morton <[email protected]>: kasan: add missing functions declarations to kasan.h kasan: consistently disable debugging features ipc/util.c: sysvipc_find_ipc() incorrectly updates position index userfaultfd: fix remap event with MREMAP_DONTUNMAP mm/gup: fix fixup_user_fault() on multiple retries epoll: call final ep_events_available() check under the lock mm, memcg: fix inconsistent oom event behavior
2020-05-14Merge tag 'trace-v5.7-rc5' of ↵Linus Torvalds1-0/+23
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing fixes from Steven Rostedt: "Various tracing fixes: - Fix a crash when having function tracing and function stack tracing on the command line. The ftrace trampolines are created as executable and read only. But the stack tracer tries to modify them with text_poke() which expects all kernel text to still be writable at boot. Keep the trampolines writable at boot, and convert them to read-only with the rest of the kernel. - A selftest was triggering in the ring buffer iterator code, that is no longer valid with the update of keeping the ring buffer writable while a iterator is reading. Just bail after three failed attempts to get an event and remove the warning and disabling of the ring buffer. - While modifying the ring buffer code, decided to remove all the unnecessary BUG() calls" * tag 'trace-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Remove all BUG() calls ring-buffer: Don't deactivate the ring buffer on failed iterator reads x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up
2020-05-14mm, memcg: fix inconsistent oom event behaviorYafang Shao1-0/+2
A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events") changed the behavior of memcg events, which will now consider subtrees in memory.events. But oom_kill event is a special one as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed in memory.oom_control. The file memory.oom_control is in both root memcg and non root memcg, that is different with memory.event as it only in non-root memcg. That commit is okay for cgroup2, but it is not okay for cgroup1 as it will cause inconsistent behavior between root memcg and non-root memcg. Here's an example on why this behavior is inconsistent in cgroup1. root memcg / memcg foo / memcg bar Suppose there's an oom_kill in memcg bar, then the oon_kill will be root memcg : memory.oom_control(oom_kill) 0 / memcg foo : memory.oom_control(oom_kill) 1 / memcg bar : memory.oom_control(oom_kill) 1 For the non-root memcg, its memory.oom_control(oom_kill) includes its descendants' oom_kill, but for root memcg, it doesn't include its descendants' oom_kill. That means, memory.oom_control(oom_kill) has different meanings in different memcgs. That is inconsistent. Then the user has to know whether the memcg is root or not. If we can't fully support it in cgroup1, for example by adding memory.events.local into cgroup1 as well, then let's don't touch its original behavior. Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events") Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Chris Down <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-05-14usb: raw-gadget: support stalling/halting/wedging endpointsAndrey Konovalov1-0/+15
Raw Gadget is currently unable to stall/halt/wedge gadget endpoints, which is required for proper emulation of certain USB classes. This patch adds a few more ioctls: - USB_RAW_IOCTL_EP0_STALL allows to stall control endpoint #0 when there's a pending setup request for it. - USB_RAW_IOCTL_SET/CLEAR_HALT/WEDGE allow to set/clear halt/wedge status on non-control non-isochronous endpoints. Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface") Signed-off-by: Andrey Konovalov <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>
2020-05-14usb: raw-gadget: fix gadget endpoint selectionAndrey Konovalov1-3/+69
Currently automatic gadget endpoint selection based on required features doesn't work. Raw Gadget tries iterating over the list of available endpoints and finding one that has the right direction and transfer type. Unfortunately selecting arbitrary gadget endpoints (even if they satisfy feature requirements) doesn't work, as (depending on the UDC driver) they might have fixed addresses, and one also needs to provide matching endpoint addresses in the descriptors sent to the host. The composite framework deals with this by assigning endpoint addresses in usb_ep_autoconfig() before enumeration starts. This approach won't work with Raw Gadget as the endpoints are supposed to be enabled after a set_configuration/set_interface request from the host, so it's too late to patch the endpoint descriptors that had already been sent to the host. For Raw Gadget we take another approach. Similarly to GadgetFS, we allow the user to make the decision as to which gadget endpoints to use. This patch adds another Raw Gadget ioctl USB_RAW_IOCTL_EPS_INFO that exposes information about all non-control endpoints that a currently connected UDC has. This information includes endpoints addresses, as well as their capabilities and limits to allow the user to choose the most fitting gadget endpoint. The USB_RAW_IOCTL_EP_ENABLE ioctl is updated to use the proper endpoint validation routine usb_gadget_ep_match_desc(). These changes affect the portability of the gadgets that use Raw Gadget when running on different UDCs. Nevertheless, as long as the user relies on the information provided by USB_RAW_IOCTL_EPS_INFO to dynamically choose endpoint addresses, UDC-agnostic gadgets can still be written with Raw Gadget. Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface") Signed-off-by: Andrey Konovalov <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>
2020-05-14usb: raw-gadget: improve uapi headers commentsAndrey Konovalov1-10/+11
Fix typo "trasferred" => "transferred". Don't call USB requests URBs. Fix comment style. Signed-off-by: Andrey Konovalov <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>
2020-05-14Merge tag 'drm/tegra/for-5.7-fixes' of ↵Dave Airlie1-0/+3
git://anongit.freedesktop.org/tegra/linux into drm-fixes drm/tegra: Fixes for v5.7 This contains a pair of patches which fix SMMU support on Tegra124 and Tegra210 for host1x and the Tegra DRM driver. Signed-off-by: Dave Airlie <[email protected]> From: Thierry Reding <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-05-13RDMA/mlx5: Add support for drop action in DV steeringDaria Velikovsky1-0/+1
When drop action is used the matching packet will stop processing in steering and will be dropped. This functionality will allow users to drop matching packets. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Daria Velikovsky <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-13RDMA/mlx5: Add support in steering default missMaor Gottlieb1-0/+5
User can configure default miss rule in order to skip matching in the user domain and forward the packet to the kernel steering domain. When user requests a default miss rule, we add steering rule to forward the traffic to the next namespace. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maor Gottlieb <[email protected]> Reviewed-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-13Merge branch 'mellanox/mlx5-next' into rdma.git for/nextJason Gunthorpe4-35/+36
From the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Required for dependencies in following patches * branch 'mellanox/mlx5-next': net/mlx5: Add support in forward to namespace {IB/net}/mlx5: Simplify don't trap code net/mlx5: Replace zero-length array with flexible-array Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-13net/mlx5: Add support in forward to namespaceMaor Gottlieb1-0/+1
Currently, fs_core supports rule of forward the traffic to continue matching in the next priority, now we add support to forward the traffic matching in the next namespace. Signed-off-by: Maor Gottlieb <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Reviewed-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2020-05-13IB/rdmavt: Replace zero-length array with flexible-arrayGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Link: https://lore.kernel.org/r/20200507185342.GA14476@embeddedor Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-12RDMA/ucma: Return stable IB device index as identifierLeon Romanovsky1-0/+4
The librdmacm uses node_guid as identifier to correlate between IB devices and CMA devices. However FW resets cause to such "connection" to be lost and require from the user to restart its application. Extend UCMA to return IB device index, which is stable identifier. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-12x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot upSteven Rostedt (VMware)1-0/+23
Booting one of my machines, it triggered the following crash: Kernel/User page tables isolation: enabled ftrace: allocating 36577 entries in 143 pages Starting tracer 'function' BUG: unable to handle page fault for address: ffffffffa000005c #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 2014067 P4D 2014067 PUD 2015063 PMD 7b253067 PTE 7b252061 Oops: 0003 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-test+ #24 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007 RIP: 0010:text_poke_early+0x4a/0x58 Code: 34 24 48 89 54 24 08 e8 bf 72 0b 00 48 8b 34 24 48 8b 4c 24 08 84 c0 74 0b 48 89 df f3 a4 48 83 c4 10 5b c3 9c 58 fa 48 89 df <f3> a4 50 9d 48 83 c4 10 5b e9 d6 f9 ff ff 0 41 57 49 RSP: 0000:ffffffff82003d38 EFLAGS: 00010046 RAX: 0000000000000046 RBX: ffffffffa000005c RCX: 0000000000000005 RDX: 0000000000000005 RSI: ffffffff825b9a90 RDI: ffffffffa000005c RBP: ffffffffa000005c R08: 0000000000000000 R09: ffffffff8206e6e0 R10: ffff88807b01f4c0 R11: ffffffff8176c106 R12: ffffffff8206e6e0 R13: ffffffff824f2440 R14: 0000000000000000 R15: ffffffff8206eac0 FS: 0000000000000000(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa000005c CR3: 0000000002012000 CR4: 00000000000006b0 Call Trace: text_poke_bp+0x27/0x64 ? mutex_lock+0x36/0x5d arch_ftrace_update_trampoline+0x287/0x2d5 ? ftrace_replace_code+0x14b/0x160 ? ftrace_update_ftrace_func+0x65/0x6c __register_ftrace_function+0x6d/0x81 ftrace_startup+0x23/0xc1 register_ftrace_function+0x20/0x37 func_set_flag+0x59/0x77 __set_tracer_option.isra.19+0x20/0x3e trace_set_options+0xd6/0x13e apply_trace_boot_options+0x44/0x6d register_tracer+0x19e/0x1ac early_trace_init+0x21b/0x2c9 start_kernel+0x241/0x518 ? load_ucode_intel_bsp+0x21/0x52 secondary_startup_64+0xa4/0xb0 I was able to trigger it on other machines, when I added to the kernel command line of both "ftrace=function" and "trace_options=func_stack_trace". The cause is the "ftrace=function" would register the function tracer and create a trampoline, and it will set it as executable and read-only. Then the "trace_options=func_stack_trace" would then update the same trampoline to include the stack tracer version of the function tracer. But since the trampoline already exists, it updates it with text_poke_bp(). The problem is that text_poke_bp() called while system_state == SYSTEM_BOOTING, it will simply do a memcpy() and not the page mapping, as it would think that the text is still read-write. But in this case it is not, and we take a fault and crash. Instead, lets keep the ftrace trampolines read-write during boot up, and then when the kernel executable text is set to read-only, the ftrace trampolines get set to read-only as well. Link: https://lkml.kernel.org/r/[email protected] Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Fixes: 768ae4406a5c ("x86/ftrace: Use text_poke()") Acked-by: Peter Zijlstra <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-05-12tcp: fix SO_RCVLOWAT hangs with fat skbsEric Dumazet1-0/+13
We autotune rcvbuf whenever SO_RCVLOWAT is set to account for 100% overhead in tcp_set_rcvlowat() This works well when skb->len/skb->truesize ratio is bigger than 0.5 But if we receive packets with small MSS, we can end up in a situation where not enough bytes are available in the receive queue to satisfy RCVLOWAT setting. As our sk_rcvbuf limit is hit, we send zero windows in ACK packets, preventing remote peer from sending more data. Even autotuning does not help, because it only triggers at the time user process drains the queue. If no EPOLLIN is generated, this can not happen. Note poll() has a similar issue, after commit c7004482e8dc ("tcp: Respect SO_RCVLOWAT in tcp_poll().") Fixes: 03f45c883c6f ("tcp: avoid extra wakeups for SO_RCVLOWAT users") Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-12ptp: fix struct member comment for do_aux_workJacob Keller1-4/+4
The do_aux_work callback had documentation in the structure comment which referred to it as "do_work". Signed-off-by: Jacob Keller <[email protected]> Cc: Richard Cochran <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: David S. Miller <[email protected]>