|
Pull nfsd fix from Bruce Fields:
"One fix for a problem introduced in the most recent merge window and
found by Dave Jones and KASAN"
* tag 'nfsd-4.13-1' of git://linux-nfs.org/~bfields/linux:
nfsd: Fix a memory scribble in the callback channel
|
|
When a new directory 'DIR1' is created in a directory 'DIR0' with the SGID bit
set, DIR1 is expected to have the SGID bit set (and its owning group equal to
the owning group of 'DIR0'). However, when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in the SGID bit on
'DIR1' being cleared if the user is not a member of the owning group.
Fix the problem by creating a __hfsplus_set_posix_acl() function that does
not call posix_acl_update_mode() and using it when inheriting ACLs. That
prevents the SGID bit from being cleared, and the mode has been properly set by
posix_acl_create() anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: [email protected]
Signed-off-by: Jan Kara <[email protected]>
|
|
This patch fixes an issue where unexpected behavior occurs when
the interrupt handler and renesas_usb3_ep_enable() are both called.
In this case, usb3_start_pipen() checks usb3_ep->started,
but the flag was not protected. So, this patch protects the flag
with usb3->lock. Since renesas_usb3_ep_enable() will not be called for EP0,
this patch doesn't take care of usb3_start_pipe0().
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
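As a rough userspace illustration of the locking described in the commit above (not the driver code itself), the sketch below models a "started" flag that is checked and set under a single lock. The `Endpoint` class, its `starts` counter, and `start_pipe()` are hypothetical names invented for this example.

```python
import threading


class Endpoint:
    """Hypothetical model of an endpoint with a 'started' flag."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for usb3->lock
        self.started = False
        self.starts = 0  # how many times the pipe was actually started

    def start_pipe(self):
        # Check and update the flag while holding the lock, so two
        # callers cannot both observe started == False.
        with self.lock:
            if self.started:
                return False
            self.started = True
            self.starts += 1
            return True
```

With the check and the update inside one critical section, only one of several concurrent callers starts the pipe.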
|
|
The dedicated dmac can transfer a zero-length-packet (zlp) if some bits
of the USB_COM_CON register are set to 1. However, the commit 2d4aa21a73ba ("usb:
gadget: udc: renesas_usb3: add support for dedicated DMAC") didn't set
the bits to 1. So, this patch fixes it.
Fixes: 2d4aa21a73b ("usb: gadget: udc: renesas_usb3: add support for dedicated DMAC")
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
The commit 2d4aa21a73ba ("usb: gadget: udc: renesas_usb3: add support
for dedicated DMAC") has a bug in renesas_usb3_dma_free_prd().
The size passed to dma_free_coherent() should be the same as the size
passed to dma_alloc_coherent(). Otherwise, this code causes a WARNING in
mm/page_alloc.c when renesas_usb3_dma_free_prd() is called. So, this patch fixes it.
Fixes: 2d4aa21a73ba ("usb: gadget: udc: renesas_usb3: add support for dedicated DMAC")
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
There's a bug in the PEBS event enabling code that prevents PEBS
freq events from working properly after a non-freq PEBS event has run.
  freq events - perf_event_attr::freq set
                -F <freq> option of perf record
  PEBS events - perf_event_attr::precise_ip > 0
                default for perf record
In the following example, with CPU 0 busy, we expect ~10000 samples
from this perf tool run:
# perf record -F 10000 -C 0 sleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.640 MB perf.data (10031 samples) ]
Everything's fine, but once we run a non-freq PEBS event like:
# perf record -c 10000 -C 0 sleep 1
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 1.053 MB perf.data (20061 samples) ]
the freq events start to fail like this:
# perf record -F 10000 -C 0 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.185 MB perf.data (40 samples) ]
The issue is in the non-freq PEBS event initialization of the debug_store reset
field, whose value is used to auto-reload the counter value after a PEBS
event drain. This value is not used for PEBS freq events, but once
we run a non-freq event it stays in the debug_store data and skews the
sample_freq counting for PEBS freq events.
Fix this by setting the reset field to 0 for freq events.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Add perf core PMU support for Intel Goldmont Plus CPU cores:
- The init code is based on Goldmont.
- There is a new cache event list, based on the Goldmont cache event
list.
- All four general-purpose performance counters support PEBS.
- The first general-purpose performance counter supports the reduced-skid
PEBS mechanism. Use :ppp to mark an event that should use
reduced-skid PEBS.
- Goldmont Plus has a 4-wide pipeline for Topdown
Signed-off-by: Kan Liang <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The Goldmont microarchitecture supports C1/C3/C6, PC2/PC3/PC6/PC10 state
residency counters; this patch enables them for the Apollo Lake platform.
The MSR information is based on Intel Software Developers' Manual,
Vol. 4, Order No. 335592, Table 2-6 and 2-12.
Signed-off-by: Harry Pan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
According to the ECMA-130 standard, the maximum valid track number is 99. Since
the 'session' mount option starts indexing at 0 (and we add 1 to the passed
number), we should refuse the value 99. Also, the condition in
isofs_get_last_session() unnecessarily repeats the check, so remove it.
Reported-by: David Howells <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
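The bound described in the isofs commit above can be illustrated with a small userspace sketch. It is not the kernel code: `parse_session_option()` and `MAX_TRACK` are hypothetical names, and the only facts taken from the commit message are the maximum track number of 99 and the "add 1 to a 0-based value" rule.

```python
MAX_TRACK = 99  # maximum valid track number, per the commit message


def parse_session_option(n: int) -> int:
    """Map a 0-based 'session=' value to a 1-based session number.

    Hypothetical helper: rejects values whose 1-based number would
    exceed MAX_TRACK, so 98 is the largest accepted input.
    """
    if n < 0 or n + 1 > MAX_TRACK:
        raise ValueError(f"invalid session number {n}")
    return n + 1
```

Under these assumptions the value 98 maps to session 99, while 99 is refused because it would map to 100.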
|
|
Currently even with STRICT_KERNEL_RWX we leave the __init text marked
executable after init, which is bad.
Add a hook to mark it NX (no-execute) before we free it, and implement
it for radix and hash.
Note that we use __init_end as the end address, not _einittext,
because overlaps_kernel_text() uses __init_end, because there are
additional executable sections other than .init.text between
__init_begin and __init_end.
Tested on radix and hash with:
0:mon> p $__init_begin
*** 400 exception occurred
Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs")
Signed-off-by: Michael Ellerman <[email protected]>
|
|
When changing a file's acl mask, reiserfs_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
Signed-off-by: Ernesto A. Fernández <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
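The ordering problem described in the reiserfs commit above can be modelled in a few lines. This is a simplified sketch, not filesystem code: the `Inode` class, the `NoSpace` exception, and both `set_acl_*` functions are hypothetical, and the xattr write is reduced to a call that may fail.

```python
class NoSpace(Exception):
    """Stands in for a failed xattr write (for example, lack of space)."""


class Inode:
    def __init__(self, mode, xattr_fails=False):
        self.mode = mode
        self.acl = None
        self.xattr_fails = xattr_fails

    def write_acl_xattr(self, acl):
        if self.xattr_fails:
            raise NoSpace()
        self.acl = acl


def set_acl_mode_first(inode, acl, new_mode):
    # Old order: the mode is changed before the xattr is stored, so a
    # failed write leaves the new mode without a matching ACL.
    inode.mode = new_mode
    inode.write_acl_xattr(acl)


def set_acl_xattr_first(inode, acl, new_mode):
    # Fixed order: if the write raises, the mode is still untouched.
    inode.write_acl_xattr(acl)
    inode.mode = new_mode
```

In this model, a failing xattr write leaves the inode with the new mode and no ACL under the old ordering, and leaves it fully unchanged under the new one.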
|
|
When changing a file's acl mask, ext2_set_acl() will first set the group
bits of i_mode to the value of the mask, and only then set the actual
extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the file
had no acl attribute to begin with, the system will from now on assume
that the mask permission bits are actual group permission bits, potentially
granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
[JK: Rebased on top of "ext2: Don't clear SGID when inheriting ACLs"]
Signed-off-by: Ernesto A. Fernández <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
|
|
Move the core logic into a helper, so we can use it for changing other
permissions.
We also change the logic to align start down, and end up. This means
calling the function with a range will expand that range to be at
least 1 mmu_linear_psize page in size. We need that so we can use it
on __init_begin ... __init_end which is not a full page in size.
This should always work for _stext/__init_begin, because we align
__init_begin to _stext + 16M in the linker script.
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
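The "align start down, and end up" rule from the commit above can be sketched as plain arithmetic. The 16M page size is an assumption chosen for illustration (the real value depends on mmu_linear_psize), and `align_down()`, `align_up()`, and `expand_range()` are hypothetical helper names.

```python
PAGE_SIZE = 16 * 1024 * 1024  # assumed linear mapping page size (16M)


def align_down(addr, size=PAGE_SIZE):
    """Round addr down to a multiple of size (size must be a power of two)."""
    return addr & ~(size - 1)


def align_up(addr, size=PAGE_SIZE):
    """Round addr up to a multiple of size (size must be a power of two)."""
    return (addr + size - 1) & ~(size - 1)


def expand_range(start, end, size=PAGE_SIZE):
    """Expand [start, end) outwards to page boundaries."""
    return align_down(start, size), align_up(end, size)
```

A sub-page range therefore grows to cover the whole page that contains it, while an already aligned range is returned unchanged.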
|
|
Move the core logic into a helper, so we can use it for changing permissions
other than _PAGE_WRITE.
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
A recent commit:
d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant")
introduced a VM_WARN_ON(!in_atomic()) which generates false positives
on every VM entry on !CONFIG_PREEMPT_COUNT kernels.
Replace it with a test for preemptible(), which appears to match the
original intent and works across different CONFIG_PREEMPT* variations.
Signed-off-by: Roman Kagan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant")
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Since commit d59f6617eef0f ("genirq: Allow fwnode to carry name
information only") the irqdomain core sets the names of irq domains.
When the name is allocated the new IRQ_DOMAIN_NAME_ALLOCATED flag is
set. Replacing the allocated name with a constant one is not a good
idea, since calling the new irq_domain_update_bus_token() API, added to
the MIPS GIC driver by commit 96f0d93a487e1 ("irqchip/MSI: Use
irq_domain_update_bus_token instead of an open coded access") will
attempt to kfree the pointer, and result in a kernel OOPS.
Fix this by removing the names, now that they are set by the irqdomain
core. This effectively reverts commit 21c57fd13589 ("irqchip/mips-gic:
Populate irq_domain names").
Fixes: d59f6617eef0f ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Matt Redfearn <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
|
|
Add support for USB Device TP-Link TL-WN722N v2.
VendorID: 0x2357, ProductID: 0x010c
Signed-off-by: Michael Gugino <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch makes use of functions added in the previous patch. It
registers ldisc during init of main speakup module and unregisters it
during exit. It also removes the code to register ldisc every time a
synth module is loaded. This way we only register the ldisc once when
main speakup module is loaded. Since main speakup module is required by
all synth modules, it is only unloaded when all synths have been
unloaded. Therefore we unregister the ldisc once, when all speakup
related references to the ldisc have returned. In the unlikely scenario of
something outside speakup using the ldisc, the ldisc refcount check in
tty_unregister_ldisc will ensure that it is not unregistered while in
use.
The function to register ldisc doesn't cause speakup init function to
fail. That is different from current behaviour where failure to register
ldisc results in failure to load the specific synth module. This is
because speakup module is also required by those synths which don't use
tty and ldisc. We don't want to prevent those modules from loading when
ldisc fails to register. The synth modules will correctly fail when
trying to set N_SPEAKUP to tty, if ldisc registration had failed.
Signed-off-by: Okash Khawaja <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the above two functions and makes them available to
main.c where they will be called during init and exit functions of
main speakup module. Following patch will make use of them.
Signed-off-by: Okash Khawaja <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Speakup opens tty using tty_open_by_driver. When closing, it calls
tty_ldisc_release but doesn't close and remove the tty itself. As a
result, that tty cannot be opened from user space. This patch calls
tty_release_struct which ensures that tty is safely removed and freed
up. It also calls tty_ldisc_release, so speakup doesn't need to call it.
Signed-off-by: Okash Khawaja <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
If vesafb is enabled in the config then /dev/fb0 is created by vesa
and this sm750 driver gets fb1, fb2. But we need to be fb0 and fb1 to
effectively work with xorg.
So if it has been allotted fb1, then try to remove the other fb0.
In the previous submission, it was asked why #ifdef is used.
https://lkml.org/lkml/2017/6/25/57
Answered at: https://lkml.org/lkml/2017/6/25/69
Also pasting here for reference.
'Did a quick research into "why".
The patch d8801e4df91e ("x86/PCI: Set IORESOURCE_ROM_SHADOW only for the
default VGA device") has started setting IORESOURCE_ROM_SHADOW in flags
for a default VGA device and that is being done only for x86.
And so, we will need that #ifdef to check IORESOURCE_ROM_SHADOW as that
needs to be checked only for a x86 and not for other arch.'
Cc: <[email protected]> # v4.4+
Signed-off-by: Teddy Wang <[email protected]>
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
A previous optimisation incorrectly assumed the PAPR hcall does
not use r12, and clobbers it upon entry. In fact it is used as
an input. This can result in KVM guests crashing (observed with
PR KVM).
Instead of using r12 to save r13, this patch saves r13 in ctr.
This is more costly, but not as slow as using the SPRG.
Fixes: acd7d8cef0153 ("powerpc/64s: Optimize hypercall/syscall entry")
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
We now get a helpful warning for code that calls copy_{from,to}_iter
without checking the return value, introduced by commit aa28de275a24
("iov_iter/hardening: move object size checks to inlined part").
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c: In function 'kiblnd_send':
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c:1643:2: error: ignoring return value of 'copy_from_iter', declared with attribute warn_unused_result [-Werror=unused-result]
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c: In function 'kiblnd_recv':
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c:1744:3: error: ignoring return value of 'copy_to_iter', declared with attribute warn_unused_result [-Werror=unused-result]
In case we get short copies here, we may get incorrect behavior.
I've added failure handling for both rx and tx now, returning
-EFAULT as expected.
Cc: [email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: James Simmons <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
As per USB spec, multiple-bytes fields are stored
in little-endian order. Use CPU<->LE helpers for
such fields.
Signed-off-by: Ruslan Bilovol <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
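To make the byte-order point in the commit above concrete, here is a small userspace sketch of storing and reading a 16-bit field in little-endian order. In the kernel this is what helpers such as cpu_to_le16() and le16_to_cpu() are for; `pack_le16()` and `unpack_le16()` below are hypothetical stand-ins for illustration only.

```python
import struct


def pack_le16(value):
    """Encode a 16-bit unsigned value as two little-endian bytes."""
    return struct.pack("<H", value)


def unpack_le16(data):
    """Decode two little-endian bytes into a 16-bit unsigned value."""
    return struct.unpack("<H", data)[0]
```

The least significant byte comes first, so 0x0100 is stored as the bytes 00 01 regardless of the host CPU's native byte order.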
|
|
As per USB spec, multiple-bytes fields are stored
in little-endian order. Use CPU<->LE helpers for
such fields.
Signed-off-by: Ruslan Bilovol <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
USB spec says that multiple byte fields are stored in
little-endian order (see chapter 8.1 of USB2.0 spec and
chapter 7.1 of USB3.0 spec), thus mark such fields as LE
for UAC1 and UAC2 headers
Signed-off-by: Ruslan Bilovol <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
Fixes the following Sparse warnings:
>> drivers/usb/gadget/udc/snps_udc_plat.c:31:6: sparse: symbol 'start_udc' was not declared. Should it be static?
>> drivers/usb/gadget/udc/snps_udc_plat.c:41:6: sparse: symbol 'stop_udc' was not declared. Should it be static?
>> drivers/usb/gadget/udc/snps_udc_plat.c:79:6: sparse: symbol 'udc_drd_work' was not declared. Should it be static?
Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
Reset the DEVADDR field in DCFG to zero on USB RESET.
The device address in the DCFG register is not otherwise reset to zero,
which is required to pass enumeration after disconnect and
reconnect.
Acked-by: John Youn <[email protected]>
Signed-off-by: Minas Harutyunyan <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
dxfer_len is an unsigned int and we always assign a value > 0 to it, so
it doesn't make any sense to check if it is < 0. We can't really check
dxferp either, as we have both NULL and non-NULL cases in the possible
call paths.
So just return true for SG_DXFER_FROM_DEV transfer in
sg_is_valid_dxfer().
Signed-off-by: Johannes Thumshirn <[email protected]>
Reported-by: Colin Ian King <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Cc: Douglas Gilbert <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The smartpqi firmware will bypass the cache for any request larger than
1MB, so we should cap the request size to avoid any performance
degradation in kernels later than v4.3.
This degradation is caused by commit d2be537c3ba3568acd79cd178327b842e60d035e,
which changed max_sectors_kb to 1280k. The hardware is able to
work fine with that value, so the true fix belongs in the smartpqi driver.
Signed-off-by: Yadan Fan <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Acked-by: Don Brace <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
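The arithmetic behind the 1MB cap in the commit above can be sketched as follows. The 512-byte sector unit is the usual block-layer convention; `max_sectors_for()` and `bypasses_cache()` are hypothetical helpers, and the sketch does not claim to reproduce the exact value the driver sets.

```python
SECTOR_SIZE = 512                  # bytes per sector in block-layer units
CACHE_BYPASS_LIMIT = 1024 * 1024   # requests larger than 1MB bypass the cache


def max_sectors_for(limit_bytes):
    """Largest sector count whose size does not exceed limit_bytes."""
    return limit_bytes // SECTOR_SIZE


def bypasses_cache(request_bytes):
    """True if a request of this size would bypass the controller cache."""
    return request_bytes > CACHE_BYPASS_LIMIT
```

A 1280k request exceeds the limit and would bypass the cache, whereas capping requests at 2048 sectors keeps them at exactly 1MB.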
|
|
The hpsa firmware will bypass the cache for any request larger than 1MB,
so we should cap the request size to avoid any performance degradation
in kernels later than v4.3.
This degradation is caused by commit d2be537c3ba3568acd79cd178327b842e60d035e,
which changed max_sectors_kb to 1280k. The hardware is able to work
fine with that value, so the true fix belongs in the hpsa driver.
Signed-off-by: Yadan Fan <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Acked-by: Don Brace <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Before retrying to flush data or dentry pages, we need to release the CPU in order
to avoid triggering the watchdog.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch includes seq_file.h to avoid compile error.
Signed-off-by: Eric Biggers <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
POWER9 DD2 can see spurious PMU interrupts after state-loss idle in
some conditions.
A solution is to save and reload MMCR0 over state-loss idle.
Signed-off-by: Nicholas Piggin <[email protected]>
Acked-by: Madhavan Srinivasan <[email protected]>
Tested-by: Anton Blanchard <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Playing with an IPoIB interface can cause the warning message
"ib0: Failed to modify QP to ERROR state" to be logged.
This happens when the QP is in IB_QPS_RESET state and the stack
is trying to transition it to IB_QPS_ERR state in ipoib_ib_dev_stop().
According to the IB spec, Table 91 - "QP State Transition Properties"
it looks like the transition from reset to error is valid:
Transition: Any State to Error
Required Attributes: None
Optional Attributes: None allowed
Actions: Queue processing is stopped. Work Requests pending or in
process are completed in error, when possible.
This patch allows the transition and quiets the message.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Tadeusz Struk <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch corrects the comment style warnings caught by
the checkpatch.pl script.
Signed-off-by: Lijun Ou <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
When the MAC address is modified using the hns_roce_mac function, the reserved
QP is released and created again. It is not necessary to use spin_lock_bh and
spin_unlock_bh in handle_en_event; otherwise, an error will occur. This patch fixes it.
Signed-off-by: Lijun Ou <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
When the opcode of a work request is RDMA read or write, rdma_wr
should be used to get remote_addr and rkey. This
patch fixes it.
Signed-off-by: Lijun Ou <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
When an RC QP is destroyed, hr_qp is used after it has been freed. This patch
fixes it.
Signed-off-by: Lijun Ou <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
On the hip06 SoC, the RoCE driver creates 8 reserved loopback QPs to
ensure zero WQEs when freeing an MR. However, if the number of enabled
phy ports is less than 6, polling CQEs with
8 reserved loopback QPs will fail.
To solve this problem, the number of loopback QPs
is adjusted based on the number of enabled phy ports.
Signed-off-by: Shaobo Xu <[email protected]>
Signed-off-by: Lijun Ou <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
RXE coupled with a dummy device causes the kernel panic shown
below. The panic happens when ib_register_device tries to set dma_mask
by accessing a NULLed parent device.
RXE does not actually use DMA, so we can set the dma_mask
to the architecture value.
[16240.199689] RIP: 0010:ib_register_device+0x468/0x5a0 [ib_core]
[16240.205289] RSP: 0018:ffffc9000220fc10 EFLAGS: 00010246
[16240.209909] RAX: 0000000000000024 RBX: ffff880220d1a2a8 RCX: 0000000000000000
[16240.212244] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
[16240.214385] RBP: ffffc9000220fcb0 R08: 0000000000000000 R09: 000000000000023f
[16240.254465] R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000000
[16240.259467] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880220d1a2a8
[16240.263314] FS: 00007fd8ecca0740(0000) GS:ffff8802364c0000(0000) knlGS:0000000000000000
[16240.267292] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16240.273503] CR2: 0000000000000218 CR3: 00000002253ba000 CR4: 00000000000006e0
[16240.277066] Call Trace:
[16240.281836] ? __kmalloc+0x26f/0x280
[16240.286596] rxe_register_device+0x297/0x300 [rdma_rxe]
[16240.291377] rxe_add+0x535/0x5b0 [rdma_rxe]
[16240.297586] rxe_net_add+0x3e/0xc0 [rdma_rxe]
[16240.302375] rxe_param_set_add+0x65/0x144 [rdma_rxe]
[16240.307769] param_attr_store+0x68/0xd0
[16240.311640] module_attr_store+0x1d/0x30
[16240.316421] sysfs_kf_write+0x3a/0x50
[16240.317802] kernfs_fop_write+0xff/0x180
[16240.322989] __vfs_write+0x37/0x140
[16240.328164] ? handle_mm_fault+0xce/0x240
[16240.333340] vfs_write+0xb2/0x1b0
[16240.335013] SyS_write+0x55/0xc0
[16240.340632] entry_SYSCALL_64_fastpath+0x1a/0xa9
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <[email protected]>
Reviewed-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
In the time between rxe_send finishing and the skb destructor
being called, the QP's ref count might drop to 0, leading to a possible
QP destruction. This will lead to a kernel panic when the destructor
dereferences the QP.
Incrementing the QP ref count in rxe_send and decrementing it
in the skb destructor prevents this crash.
BUG: unable to handle kernel NULL pointer dereference at 000000000000072c
IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
PGD 0 [16240.211178]
Oops: 0002 [#1] SMP
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 4.9.0-mlnx #1
Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
task: ffff88042d6b1480 task.stack: ffffc90001904000
RIP: 0010:[<ffffffffa05df765>] [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
RSP: 0018:ffff88043fcc3df0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200
RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700
RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6
R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700
R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200
FS: 0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0
Stack:
ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2
ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700
ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96
Call Trace:
<IRQ> [16240.336345] [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0
[<ffffffff817b42c2>] skb_release_all+0x12/0x30
[<ffffffff817b4332>] kfree_skb+0x32/0x90
[<ffffffff81893f96>] ndisc_error_report+0x36/0x40
[<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0
[<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0
[<ffffffff81109295>] call_timer_fn+0x35/0x120
[<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460
[<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30
[<ffffffff810366b9>] ? sched_clock+0x9/0x10
[<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0
[<ffffffff818dd537>] __do_softirq+0xd7/0x289
[<ffffffff810a6c95>] irq_exit+0xb5/0xc0
[<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50
[<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90
<EOI> [16240.395776] [<ffffffff818da156>] ? native_safe_halt+0x6/0x10
[<ffffffff818d9e6e>] default_idle+0x1e/0xd0
[<ffffffff8103797f>] arch_cpu_idle+0xf/0x20
[<ffffffff818da2c5>] default_idle_call+0x35/0x40
[<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210
[<ffffffff81050433>] start_secondary+0x103/0x130
RIP [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <[email protected]>
Reviewed-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
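The reference-counting idea in the rxe commit above can be modelled in userspace as follows. This is only a sketch: the `QP` class, `send()`, and the returned destructor callable are hypothetical, and object destruction is reduced to setting a flag.

```python
class QP:
    """Hypothetical queue pair that is destroyed when its refcount hits 0."""

    def __init__(self):
        self.refcount = 1       # reference held by the owner
        self.destroyed = False

    def get(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True


def send(qp):
    """Take a reference for the in-flight packet; return its destructor."""
    qp.get()

    def destructor():
        # Runs later, when the packet buffer is freed.
        qp.put()

    return destructor
```

Because the in-flight packet holds its own reference, the QP in this model outlives the owner's final put until the destructor has run.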
|
|
The driver checks if the lower level driver supports get_stats, and if
so calls it to get the updated statistics, otherwise takes from the
current netdevice stats object.
Signed-off-by: Erez Shitrit <[email protected]>
Reviewed-by: Alex Vesker <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Yuval Shaia <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Currently the RoCE GID management uses the ib_wq to add and delete GIDs
according to the netdev events.
The ib_wq isn't an ordered workqueue, so two work elements can be executed
concurrently, which results in unexpected behavior and inconsistency of the
GID cache content.
Example:
ifconfig eth1 11.11.11.11/16 up
This command will invoke the following netdev events in the following order:
1. NETDEV_UP
2. NETDEV_DOWN
3. NETDEV_UP
If (2) and (3) are executed concurrently or in reverse order, instead of
having a new GID with the 11.11.11.11 IP, we will end up without any new GIDs.
Signed-off-by: Majd Dibbiny <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Yuval Shaia <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
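The ordering hazard in the example above can be reproduced with a toy model of the GID cache. This sketch is not the RDMA core code: `apply_events()` is a hypothetical function that treats NETDEV_UP as "add the GID" and NETDEV_DOWN as "remove it".

```python
def apply_events(events):
    """Apply (event, ip) pairs in the given order; return the GID set."""
    gids = set()
    for kind, ip in events:
        if kind == "NETDEV_UP":
            gids.add(ip)
        elif kind == "NETDEV_DOWN":
            gids.discard(ip)
    return gids


IP = "11.11.11.11"
in_order = [("NETDEV_UP", IP), ("NETDEV_DOWN", IP), ("NETDEV_UP", IP)]
reordered = [("NETDEV_UP", IP), ("NETDEV_UP", IP), ("NETDEV_DOWN", IP)]
```

Processing the events strictly in arrival order leaves the GID in the cache, while swapping the last two leaves the cache empty, which is why an ordered queue matters here.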
|
|
A failure while creating debugfs entries for mr_cache left behind the
entries that had already been created.
This caused a mismatch and was misleading for end users. The solution
is to clean the mr_cache debugfs root, so that no leftovers remain in the
system. In addition, let's document why the error does not need to be
forwarded to the user in case of failure.
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Matan Barak <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
There are no users for IB_QP_CREATE_USE_GFP_NOIO flag,
so let's remove it.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The caller of the driver now marks GFP_NOIO allocations with the help
of memalloc_noio-* calls. This makes it redundant to pass gfp flags
down to the driver, since they can only be GFP_KERNEL.
The patch removes the gfp flags argument and updates all driver paths.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The caller of the driver now marks GFP_NOIO allocations with the help
of memalloc_noio-* calls. This makes it redundant to pass gfp flags
down to the driver, since they can only be GFP_KERNEL.
The patch removes the gfp flags argument and updates all driver paths.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Acked-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
during memory allocation") added the memalloc_noio_(save|restore) functions
to enable people to modify the MM behavior by disabling I/O during memory
allocation. This was further extended in commit 934f3072c17c ("mm: clear
__GFP_FS when PF_MEMALLOC_NOIO is set"). The memalloc_noio_* functions prevent
allocation paths from recursing back into the filesystem without explicitly
changing the flags for every allocation site.
However, IPoIB hasn't kept up with these changes and completely missed
the memalloc_noio_* calls. This led to allocation sites being updated
with a special QP creation flag, see commit 09b93088d750
("IB: Add a QP creation flag to use GFP_NOIO allocations"), although this
flag is supported by only a small number of drivers in the IB stack.
Let's change that by switching to the memalloc_noio_* calls, allowing
every driver underneath to enjoy NOIO allocations.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch checks if there is a driver below that
needs to be updated on the new MTU and calls it
accordingly.
Signed-off-by: Erez Shitrit <[email protected]>
Reviewed-by: Alex Vesker <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Yuval Shaia <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|