Age | Commit message (Collapse) | Author | Files | Lines |
|
OS boot during Boot from SAN was stuck at dracut emergency shell after
enabling NVMe driver parameter. For non-MQ support the driver was enabling
MQ. Add a check to confirm if FW supports MQ.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
qla_nvme_register_hba() puts out a warning when there are not enough queue
pairs available for FC-NVME. Just fail the NVME registration rather than a
WARNING + call Trace.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Arun Easi <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
anytime
ql2xextended_error_logging can now be set to 1 to get the default mask
value, as opposed to at module load time only.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Arun Easi <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Update debug level and message for ELS IOCB done.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Multipath errors were seen during failback due to login timeout. The
remote device sent LOGO, the local host tore down the session and did
relogin. The RSCN arrived indicates remote device is going through failover
after which the relogin is in a 20s timeout phase. At this point the
driver is stuck in the relogin process. Add a fix to delete the session as
part of abort/flush the login.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Correct the supported speeds for 16G Mezz card.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Perform implicit logout to flush I/O on zone disable.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
On Zone Disable, certain switches would ignore all commands. This causes
timeout for both switch scan command and abort of that command. On
detection of this condition, all sessions will be shutdown.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Improves readability of qla_mbx.c.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Reviewed-by: Roman Bolshakov <[email protected]>
Signed-off-by: Enzo Matsumiya <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were
mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The
problem was tracked down to a missing critical section on a "short circuit"
path. Namely, the time to process the current command so far has already
exceeded the requested command duration (i.e. the number of nanoseconds in
the ndelay parameter).
The random=1 parameter setting was pivotal in finding this error. The
failure scenario involved first taking that "short circuit" path (due to a
very short command duration) and then taking the more likely
hrtimer_start() path (due to a longer command duration). With random=1 each
command's duration is taken from the uniformly distributed [0..ndelay)
interval. The fio utility also helped by reliably generating the error
scenario at about once per minute on a RPi 4 (64 bit OS).
Link: https://lore.kernel.org/r/[email protected]
Reported-by: John Garry <[email protected]>
Reviewed-by: Lee Duncan <[email protected]>
Signed-off-by: Douglas Gilbert <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use
timer_setup()"), we intentionally only passed zfcp_adapter as context
argument to zfcp_fsf_request_timeout_handler(). Since we only trigger
adapter recovery, it was unnecessary to sync against races between timeout
and (late) completion. Likewise, we only passed zfcp_erp_action as context
argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action,
it was unnecessary to sync against races between timeout and (late)
completion.
Meanwhile the timeout handlers get timer_list as context argument and do a
timer-specific container-of to zfcp_fsf_req which can have been freed.
Fix it by making sure that any request timeout handlers, that might just
have started before del_timer(), are completed by using del_timer_sync()
instead. This ensures the request free happens afterwards.
Space time diagram of potential use-after-free:
Basic idea is to have 2 or more pending requests whose timeouts run out at
almost the same time.
req 1 timeout ERP thread req 2 timeout
---------------- ---------------- ---------------------------------------
zfcp_fsf_request_timeout_handler
fsf_req = from_timer(fsf_req, t, timer)
adapter = fsf_req->adapter
zfcp_qdio_siosl(adapter)
zfcp_erp_adapter_reopen(adapter,...)
zfcp_erp_strategy
...
zfcp_fsf_req_dismiss_all
list_for_each_entry_safe
zfcp_fsf_req_complete 1
del_timer 1
zfcp_fsf_req_free 1
zfcp_fsf_req_complete 2
zfcp_fsf_request_timeout_handler
del_timer 2
fsf_req = from_timer(fsf_req, t, timer)
zfcp_fsf_req_free 2
adapter = fsf_req->adapter
^^^^^^^ already freed
Link: https://lore.kernel.org/r/[email protected]
Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()")
Cc: <[email protected]> #4.15+
Suggested-by: Julian Wiedmann <[email protected]>
Reviewed-by: Julian Wiedmann <[email protected]>
Signed-off-by: Steffen Maier <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
If the bit corresponding to a task in the Doorbell register has been
cleared, no need to poll the status of the task on the device side and to
send an Abort Task TM. Instead, let it directly goto cleanup.
In addition, to keep original debug output, move the goto below the debug
print.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Stanley Chu <[email protected]>
Reviewed-by: Can Guo <[email protected]>
Signed-off-by: Bean Huo <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
If somehow no interrupt notification is raised for a completed request and
its doorbell bit is cleared by host, UFS driver needs to cleanup its
outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally
in the following scenario:
After ufshcd_abort() returns, this request will be requeued by SCSI layer
with its outstanding bit set. Any future completed request will trigger
ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At
this time the "abnormal outstanding bit" will be detected and the "requeued
request" will be chosen to execute request post-processing flow. This is
wrong because this request is still "alive".
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Can Guo <[email protected]>
Acked-by: Avri Altman <[email protected]>
Signed-off-by: Stanley Chu <[email protected]>
Signed-off-by: Bean Huo <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
For shared interrupts, the interrupt status might be zero, so check that
first.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The interrupt might be shared, in which case it is not an error for the
interrupt handler to be called when the interrupt status is zero, so don't
print the message unless there was enabled interrupt status.
Link: https://lore.kernel.org/r/[email protected]
Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code")
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Intel EHL UFS host controller advertises auto-hibernate capability but it
does not work correctly. Add a quirk for that.
[mkp: checkpatch fix]
Link: https://lore.kernel.org/r/[email protected]
Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL")
Acked-by: Stanley Chu <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Fix incorrect calculation of "ms" based waiting time in function
ufs_mtk_setup_clocks().
Link: https://lore.kernel.org/r/[email protected]
Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet")
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Stanley Chu <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
In ufshcd_suspend(), after clk-gating is suspended and link is set
as Hibern8 state, ufshcd_hold() is still possibly invoked before
ufshcd_suspend() returns. For example, MediaTek's suspend vops may
issue UIC commands which would call ufshcd_hold() during the command
issuing flow.
Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
then ufshcd_hold() may enter infinite loops because there is no
clk-ungating work scheduled or pending. In this case, ufshcd_hold()
shall just bypass, and keep the link as Hibern8 state.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Avri Altman <[email protected]>
Co-developed-by: Andy Teng <[email protected]>
Signed-off-by: Andy Teng <[email protected]>
Signed-off-by: Stanley Chu <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
ixgbe_fcoe_ddp_setup() can be called from the main I/O path and is called
with a spin_lock held, so we have to use GFP_ATOMIC allocation instead of
GFP_KERNEL.
Link: https://lore.kernel.org/r/[email protected]
cc: Hannes Reinecke <[email protected]>
Reviewed-by: Lee Duncan <[email protected]>
Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Fix to return error code PTR_ERR() from the error handling case instead of
0.
Link: https://lore.kernel.org/r/[email protected]
Fixes: 22617e216331 ("scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes")
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Jing Xiangfeng <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull mailmap update from Kees Cook:
"This was originally part of my pstore tree, but when I realized that
mailmap needed re-alphabetizing, I decided to wait until -rc1 to send
this, as I saw a lot of mailmap additions pending in -next for the
merge window.
It's a programmatic reordering and the addition of a pstore
contributor's preferred email address"
* tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
mailmap: Add WeiXiong Liao
mailmap: Restore dictionary sorting
|
|
Pull networking fixes from David Miller:
"Another batch of fixes:
1) Remove nft_compat counter flush optimization, it generates warnings
from the refcount infrastructure. From Florian Westphal.
2) Fix BPF to search for build id more robustly, from Jiri Olsa.
3) Handle bogus getopt lengths in ebtables, from Florian Westphal.
4) Infoleak and other fixes to j1939 CAN driver, from Eric Dumazet and
Oleksij Rempel.
5) Reset iter properly on mptcp sendmsg() error, from Florian
Westphal.
6) Show a saner speed in bonding broadcast mode, from Jarod Wilson.
7) Various kerneldoc fixes in bonding and elsewhere, from Lee Jones.
8) Fix double unregister in bonding during namespace tear down, from
Cong Wang.
9) Disable RP filter during icmp_redirect selftest, from David Ahern"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
otx2_common: Use devm_kcalloc() in otx2_config_npa()
net: qrtr: fix usage of idr in port assignment to socket
selftests: disable rp_filter for icmp_redirect.sh
Revert "net: xdp: pull ethernet header off packet after computing skb->protocol"
phylink: <linux/phylink.h>: fix function prototype kernel-doc warning
mptcp: sendmsg: reset iter on error redux
net: devlink: Remove overzealous WARN_ON with snapshots
tipc: not enable tipc when ipv6 works as a module
tipc: fix uninit skb->data in tipc_nl_compat_dumpit()
net: Fix potential wrong skb->protocol in skb_vlan_untag()
net: xdp: pull ethernet header off packet after computing skb->protocol
ipvlan: fix device features
bonding: fix a potential double-unregister
can: j1939: add rxtimer for multipacket broadcast session
can: j1939: abort multipacket broadcast session when timeout occurs
can: j1939: cancel rxtimer on multipacket broadcast session complete
can: j1939: fix support for multipacket broadcast message
net: fddi: skfp: cfm: Remove seemingly unused variable 'ID_sccs'
net: fddi: skfp: cfm: Remove set but unused variable 'oldstate'
net: fddi: skfp: smt: Remove seemingly unused variable 'ID_sccs'
...
|
|
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
Signed-off-by: Xu Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
My commit to make DMA ops support optional missed the reference in
the p2pdma code. And while the build bot didn't manage to find a config
where this can happen, Matthew did. Fix this by replacing two IS_ENABLED
checks with ifdefs.
Fixes: 2f9237d4f6df ("dma-mapping: make support for dma ops optional")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: Matthew Wilcox <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Logan Gunthorpe <[email protected]>
|
|
Passing large uint32 sockaddr_qrtr.port numbers for port allocation
triggers a warning within idr_alloc() since the port number is cast
to int, and thus interpreted as a negative number. This leads to
the rejection of such valid port numbers in qrtr_port_assign() as
idr_alloc() fails.
To avoid the problem, switch to idr_alloc_u32() instead.
Fixes: bdabad3e363d ("net: Add Qualcomm IPC router")
Reported-by: [email protected]
Signed-off-by: Necip Fazil Yildiran <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
h1 is initially configured to reach h2 via r1 rather than the
more direct path through r2. If rp_filter is set and inherited
for r2, forwarding fails since the source address of h1 is
reachable from eth0 vs the packet coming to it via r1 and eth1.
Since rp_filter setting affects the test, explicitly reset it.
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
With latest `bpftool prog` command, we observed the following kernel
panic.
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD dfe894067 P4D dfe894067 PUD deb663067 PMD 0
Oops: 0010 [#1] SMP
CPU: 9 PID: 6023 ...
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0000:ffffc900002b8f18 EFLAGS: 00010286
RAX: ffff8883a405f400 RBX: ffff888e46a6bf00 RCX: 000000008020000c
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8883a405f400
RBP: ffff888e46a6bf50 R08: 0000000000000000 R09: ffffffff81129600
R10: ffff8883a405f300 R11: 0000160000000000 R12: 0000000000002710
R13: 000000e9494b690c R14: 0000000000000202 R15: 0000000000000009
FS: 00007fd9187fe700(0000) GS:ffff888e46a40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000de5d33002 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
rcu_core+0x1a4/0x440
__do_softirq+0xd3/0x2c8
irq_exit+0x9d/0xa0
smp_apic_timer_interrupt+0x68/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0033:0x47ce80
Code: Bad RIP value.
RSP: 002b:00007fd9187fba40 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000002 RBX: 00007fd931789160 RCX: 000000000000010c
RDX: 00007fd9308cdfb4 RSI: 00007fd9308cdfb4 RDI: 00007ffedd1ea0a8
RBP: 00007fd9187fbab0 R08: 000000000000000e R09: 000000000000002a
R10: 0000000000480210 R11: 00007fd9187fc570 R12: 00007fd9316cc400
R13: 0000000000000118 R14: 00007fd9308cdfb4 R15: 00007fd9317a9380
After further analysis, the bug is triggered by
Commit eaaacd23910f ("bpf: Add task and task/file iterator targets")
which introduced task_file bpf iterator, which traverses all open file
descriptors for all tasks in the current namespace.
The latest `bpftool prog` calls a task_file bpf program to traverse
all files in the system in order to associate processes with progs/maps, etc.
When traversing files for a given task, rcu read_lock is taken to
access all files in a file_struct. But it used get_file() to grab
a file, which is not right. It is possible file->f_count is 0 and
get_file() will unconditionally increase it.
Later put_file() may cause all kind of issues with the above
as one of sympotoms.
The failure can be reproduced with the following steps in a few seconds:
$ cat t.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#define N 10000
int fd[N];
int main() {
int i;
for (i = 0; i < N; i++) {
fd[i] = open("./note.txt", 'r');
if (fd[i] < 0) {
fprintf(stderr, "failed\n");
return -1;
}
}
for (i = 0; i < N; i++)
close(fd[i]);
return 0;
}
$ gcc -O2 t.c
$ cat run.sh
#/bin/bash
for i in {1..100}
do
while true; do ./a.out; done &
done
$ ./run.sh
$ while true; do bpftool prog >& /dev/null; done
This patch used get_file_rcu() which only grabs a file if the
file->f_count is not zero. This is to ensure the file pointer
is always valid. The above reproducer did not fail for more
than 30 minutes.
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Suggested-by: Josef Bacik <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
WeiXiong Liao noted to me offlist that his preference for email address
had changed and that he'd like it updated in the mailmap so people
discussing pstore/blk would be able to reach him.
Cc: WeiXiong Liao <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
|
|
Several names had been recently appended (instead of inserted). While
git-shortlog doesn't need this file to be sorted, it helps humans to
keep it organized this way. Sort the entire file (which includes some
minor shuffling for dictionary order).
Done with the following commands:
grep -E '^(#|$)' .mailmap > .mailmap.head
grep -Ev '^(#|$)' .mailmap > .mailmap.body
sort -f .mailmap.body > .mailmap.body.sort
cat .mailmap.head .mailmap.body.sort > .mailmap
rm .mailmap.head .mailmap.body.sort
Signed-off-by: Kees Cook <[email protected]>
|
|
Currently invalid CPU addresses are not being sanity checked resulting in
SATA setup failure on a SynQuacer SC2A11 development machine. The original
check was removed by and earlier commit, so add a sanity check back in
to avoid this regression.
Fixes: 7a8b64d17e35 ("of/address: use range parser for of_dma_get_range")
Signed-off-by: Colin Ian King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
|
|
See the SDM, volume 3, section 4.4.1:
If PAE paging would be in use following an execution of MOV to CR0 or
MOV to CR4 (see Section 4.1.1) and the instruction is modifying any of
CR0.CD, CR0.NW, CR0.PG, CR4.PAE, CR4.PGE, CR4.PSE, or CR4.SMEP; then
the PDPTEs are loaded from the address in CR3.
Fixes: b9baba8614890 ("KVM, pkeys: expose CPUID/CR4 to guest")
Cc: Huaitong Han <[email protected]>
Signed-off-by: Jim Mattson <[email protected]>
Reviewed-by: Peter Shier <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
See the SDM, volume 3, section 4.4.1:
If PAE paging would be in use following an execution of MOV to CR0 or
MOV to CR4 (see Section 4.1.1) and the instruction is modifying any of
CR0.CD, CR0.NW, CR0.PG, CR4.PAE, CR4.PGE, CR4.PSE, or CR4.SMEP; then
the PDPTEs are loaded from the address in CR3.
Fixes: 0be0226f07d14 ("KVM: MMU: fix SMAP virtualization")
Cc: Xiao Guangrong <[email protected]>
Signed-off-by: Jim Mattson <[email protected]>
Reviewed-by: Peter Shier <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The PK bit of the error code is computed dynamically in permission_fault
and therefore need not be passed to gva_to_gpa: only the access bits
(fetch, user, write) need to be passed down.
Not doing so causes a splat in the pku test:
WARNING: CPU: 25 PID: 5465 at arch/x86/kvm/mmu.h:197 paging64_walk_addr_generic+0x594/0x750 [kvm]
Hardware name: Intel Corporation WilsonCity/WilsonCity, BIOS WLYDCRB1.SYS.0014.D62.2001092233 01/09/2020
RIP: 0010:paging64_walk_addr_generic+0x594/0x750 [kvm]
Code: <0f> 0b e9 db fe ff ff 44 8b 43 04 4c 89 6c 24 30 8b 13 41 39 d0 89
RSP: 0018:ff53778fc623fb60 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ff53778fc623fbf0 RCX: 0000000000000007
RDX: 0000000000000001 RSI: 0000000000000002 RDI: ff4501efba818000
RBP: 0000000000000020 R08: 0000000000000005 R09: 00000000004000e7
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000007
R13: ff4501efba818388 R14: 10000000004000e7 R15: 0000000000000000
FS: 00007f2dcf31a700(0000) GS:ff4501f1c8040000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000001dea475005 CR4: 0000000000763ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
paging64_gva_to_gpa+0x3f/0xb0 [kvm]
kvm_fixup_and_inject_pf_error+0x48/0xa0 [kvm]
handle_exception_nmi+0x4fc/0x5b0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x911/0x1c10 [kvm]
kvm_vcpu_ioctl+0x23e/0x5d0 [kvm]
ksys_ioctl+0x92/0xb0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x3e/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace d17eb998aee991da ]---
Reported-by: Sean Christopherson <[email protected]>
Fixes: 897861479c064 ("KVM: x86: Add helper functions for illegal GPA checking and page fault injection")
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
IA-64 is special and treats pgd_offset_k() differently to pgd_offset(),
using different formulae to calculate the indices into the kernel and user
PGDs. The index into the user PGDs takes into account the region number,
but the index into the kernel (init_mm) PGD always assumes a predefined
kernel region number. Commit 974b9b2c68f3 ("mm: consolidate pte_index() and
pte_offset_*() definitions") made IA-64 use a generic pgd_offset_k() which
incorrectly used pgd_index() for kernel page tables. As a result, the
index into the kernel PGD was going out of bounds and the kernel hung
during early boot.
Allow overrides of pgd_offset_k() and override it on IA-64 with the old
implementation that will correctly index the kernel PGD.
Fixes: 974b9b2c68f3 ("mm: consolidate pte_index() and pte_offset_*() definitions")
Reported-by: John Paul Adrian Glaubitz <[email protected]>
Signed-off-by: Jessica Clarke <[email protected]>
Tested-by: John Paul Adrian Glaubitz <[email protected]>
Acked-by: Tony Luck <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
|
|
This reverts commit f8414a8d886b613b90d9fdf7cda6feea313b1069.
eth_type_trans() does the necessary pull on the skb.
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix a kernel-doc warning for the pcs_config() function prototype:
../include/linux/phylink.h:406: warning: Excess function parameter 'permit_pause_to_mac' description in 'pcs_config'
Fixes: 7137e18f6f88 ("net: phylink: add struct phylink_pcs")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Russell King <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
Some devices have broken extension unit where getting current value
doesn't work. Attempt that once when creating mixer control for it. If
it fails, just ignore it, so that it won't cripple the device entirely
(and/or make the error floods).
Signed-off-by: Tom Yan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
If debug_regs.c is built with newer binutils, the resulting binary is "optimized"
by the assembler:
asm volatile("ss_start: "
"xor %%rax,%%rax\n\t"
"cpuid\n\t"
"movl $0x1a0,%%ecx\n\t"
"rdmsr\n\t"
: : : "rax", "ecx");
is translated to :
000000000040194e <ss_start>:
40194e: 31 c0 xor %eax,%eax <----- rax->eax?
401950: 0f a2 cpuid
401952: b9 a0 01 00 00 mov $0x1a0,%ecx
401957: 0f 32 rdmsr
As you can see rax is replaced with eax in target binary code.
This causes a difference is the length of xor instruction (2 Byte vs 3 Byte),
and makes the hard-coded instruction length check fail:
/* Instruction lengths starting at ss_start */
int ss_size[4] = {
3, /* xor */ <-------- 2 or 3?
2, /* cpuid */
5, /* mov */
2, /* rdmsr */
};
Encode the shorter version directly and, while at it, fix the "clobbers"
of the asm.
Cc: [email protected]
Signed-off-by: Yang Weijiang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The vfio_iommu_replay() function does not currently unwind on error,
yet it does pin pages, perform IOMMU mapping, and modify the vfio_dma
structure to indicate IOMMU mapping. The IOMMU mappings are torn down
when the domain is destroyed, but the other actions go on to cause
trouble later. For example, the iommu->domain_list can be empty if we
only have a non-IOMMU backed mdev attached. We don't currently check
if the list is empty before getting the first entry in the list, which
leads to a bogus domain pointer. If a vfio_dma entry is erroneously
marked as iommu_mapped, we'll attempt to use that bogus pointer to
retrieve the existing physical page addresses.
This is the scenario that uncovered this issue, attempting to hot-add
a vfio-pci device to a container with an existing mdev device and DMA
mappings, one of which could not be pinned, causing a failure adding
the new group to the existing container and setting the conditions
for a subsequent attempt to explode.
To resolve this, we can first check if the domain_list is empty so
that we can reject replay of a bogus domain, should we ever encounter
this inconsistent state again in the future. The real fix though is
to add the necessary unwind support, which means cleaning up the
current pinning if an IOMMU mapping fails, then walking back through
the r-b tree of DMA entries, reading from the IOMMU which ranges are
mapped, and unmapping and unpinning those ranges. To be able to do
this, we also defer marking the DMA entry as IOMMU mapped until all
entries are processed, in order to allow the unwind to know the
disposition of each entry.
Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
Reported-by: Zhiyi Guo <[email protected]>
Tested-by: Zhiyi Guo <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
|
|
A down_read on memory_lock is held when performing read/write accesses
to MMIO BAR space, including across the copy_to/from_user() callouts
which may fault. If the user buffer for these copies resides in an
mmap of device MMIO space, the mmap fault handler will acquire a
recursive read-lock on memory_lock. Avoid this by reducing the lock
granularity. Sequential accesses requiring multiple ioread/iowrite
cycles are expected to be rare, therefore typical accesses should not
see additional overhead.
VGA MMIO accesses are expected to be non-fatal regardless of the PCI
memory enable bit to allow legacy probing, this behavior remains with
a comment added. ioeventfds are now included in memory access testing,
with writes dropped while memory space is disabled.
Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory")
Reported-by: Zhiyi Guo <[email protected]>
Tested-by: Zhiyi Guo <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
|
|
This -Wsign-compare compiler warning can be very noisy
and most of the suggested conversions are unnecessary.
Make the warning W=3 so it's described under the
"can most likely be ignored" block.
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Impose a limit on the number of watches that a user can hold so that
they can't use this mechanism to fill up all the available memory.
This is done by putting a counter in user_struct that's incremented when
a watch is allocated and decreased when it is released. If the number
exceeds the RLIMIT_NOFILE limit, the watch is rejected with EAGAIN.
This can be tested by the following means:
(1) Create a watch queue and attach it to fd 5 in the program given - in
this case, bash:
keyctl watch_session /tmp/nlog /tmp/gclog 5 bash
(2) In the shell, set the maximum number of files to, say, 99:
ulimit -n 99
(3) Add 200 keyrings:
for ((i=0; i<200; i++)); do keyctl newring a$i @s || break; done
(4) Try to watch all of the keyrings:
for ((i=0; i<200; i++)); do echo $i; keyctl watch_add 5 %:a$i || break; done
This should fail when the number of watches belonging to the user hits
99.
(5) Remove all the keyrings and all of those watches should go away:
for ((i=0; i<200; i++)); do keyctl unlink %:a$i; done
(6) Kill off the watch queue by exiting the shell spawned by
watch_session.
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit ("03fd42d458fb powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when
page size is 256k") reworked the setup of the early debug area and
mistakenly replaced 128 * 1024 by SZ_128.
Change to SZ_128K to restore the original 128 kbytes size of the area.
Fixes: 03fd42d458fb ("powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/996184974d674ff984643778cf1cdd7fe58cc065.1597644194.git.christophe.leroy@csgroup.eu
|
|
IS_ENABLED() instead of #ifdef still requires variable declaration.
In this specific case, default_uamor is declared in asm/pkeys.h which
is only included if PPC_MEM_KEYS is enabled.
arch/powerpc/mm/book3s64/hash_utils.c: In function ‘hash__early_init_mmu_secondary’:
arch/powerpc/mm/book3s64/hash_utils.c:1119:21: error: ‘default_uamor’ undeclared (first use in this function)
1119 | mtspr(SPRN_UAMOR, default_uamor);
| ^~~~~~~~~~~~~
Fixes: 6553fb799f60 ("powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000)")
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The code modifies rdev, but locks c_rdev instead. Remove the lock
as this is held together by regulator_list_mutex taken in the caller.
Fixes: f9503385b187 ("regulator: core: Mutually resolve regulators coupling")
Signed-off-by: Michał Mirosław <[email protected]>
Reviewed-by: Dmitry Osipenko <[email protected]>
Link: https://lore.kernel.org/r/25eb81cefb37a646f3e44eaaf1d8ae8881cfde52.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <[email protected]>
|
|
Since only regulator_ena_gpio_request() allocates rdev->ena_pin, and it
guarantees that same gpiod gets same pin structure, it is enough to
compare just the pointers. Also we know there can be only one matching
entry on the list. Rework the code take advantage of the facts.
Signed-off-by: Michał Mirosław <[email protected]>
Link: https://lore.kernel.org/r/3ff002c7aa3bd774491af4291a9df23541fcf892.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <[email protected]>
|
|
By calling device_initialize() earlier and noting that kfree(NULL) is
ok, we can save a bit of code in error handling and plug of_node leak.
Fixed commit already did part of the work.
Fixes: 9177514ce349 ("regulator: fix memory leak on error path of regulator_register()")
Signed-off-by: Michał Mirosław <[email protected]>
Reviewed-by: Vladimir Zapolskiy <[email protected]>
Acked-by: Vladimir Zapolskiy <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/f5035b1b4d40745e66bacd571bbbb5e4644d21a1.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <[email protected]>
|
|
Pull regulator_list_mutex into set_consumer_device_supply() and keep
allocations outside of it. Fourth of the fs_reclaim deadlock case.
Fixes: 45389c47526d ("regulator: core: Add early supply resolution for regulators")
Signed-off-by: Michał Mirosław <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/f0380bdb3d60aeefa9693c4e234d2dcda7e56747.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <[email protected]>
|
|
Move all allocations outside of the regulator_lock()ed section.
======================================================
WARNING: possible circular locking dependency detected
5.7.13+ #535 Not tainted
------------------------------------------------------
f2fs_discard-179:7/702 is trying to acquire lock:
c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0
but task is already holding lock:
cb95b080 (&dcc->cmd_lock){+.+.}-{3:3}, at: __issue_discard_cmd+0xec/0x5f8
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
[...]
-> #3 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire.part.11+0x40/0x50
fs_reclaim_acquire+0x24/0x28
__kmalloc_track_caller+0x54/0x218
kstrdup+0x40/0x5c
create_regulator+0xf4/0x368
regulator_resolve_supply+0x1a0/0x200
regulator_register+0x9c8/0x163c
[...]
other info that might help us debug this:
Chain exists of:
regulator_list_mutex --> &sit_i->sentry_lock --> &dcc->cmd_lock
[...]
Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")
Signed-off-by: Michał Mirosław <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/6eebc99b2474f4ffaa0405b15178ece0e7e4f608.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <[email protected]>
|
|
Move another allocation out of regulator_list_mutex-protected region, as
reclaim might want to take the same lock.
WARNING: possible circular locking dependency detected
5.7.13+ #534 Not tainted
------------------------------------------------------
kswapd0/383 is trying to acquire lock:
c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0
but task is already holding lock:
c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire.part.11+0x40/0x50
fs_reclaim_acquire+0x24/0x28
kmem_cache_alloc_trace+0x40/0x1e8
regulator_register+0x384/0x1630
devm_regulator_register+0x50/0x84
reg_fixed_voltage_probe+0x248/0x35c
[...]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(regulator_list_mutex);
lock(fs_reclaim);
lock(regulator_list_mutex);
*** DEADLOCK ***
[...]
2 locks held by kswapd0/383:
#0: c0e38518 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x50
#1: cb70e5e0 (hctx->srcu){....}-{0:0}, at: hctx_lock+0x60/0xb8
[...]
Fixes: 541d052d7215 ("regulator: core: Only support passing enable GPIO descriptors")
[this commit only changes context]
Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")
[this is when the regulator_list_mutex was introduced in reclaim locking path]
Signed-off-by: Michał Mirosław <[email protected]>
Link: https://lore.kernel.org/r/41fe6a9670335721b48e8f5195038c3d67a3bf92.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <[email protected]>
|