Age | Commit message (Collapse) | Author | Files | Lines |
|
One motivation for mapping range registers to decoder objects is
to use those settings for region autodiscovery.
The need to map a region for devices programmed to use range registers
is especially urgent now that the kernel no longer routes "Soft
Reserved" ranges in the memory map to device-dax by default. The CXL
memory range loses all access mechanisms.
Complete the implementation by marking the DPA reservation and setting
the endpoint-decoder state to signal autodiscovery. Note that the
default settings of ways=1 and granularity=4096 set in cxl_decode_init()
do not need to be updated.
Fixes: 09d09e04d2fc ("cxl/dax: Create dax devices for CXL RAM regions")
Tested-by: Dave Jiang <[email protected]>
Tested-by: Gregory Price <[email protected]>
Link: https://lore.kernel.org/r/168012575521.221280.14177293493678527326.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Dave Jiang <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
|
|
Recall that range register emulation seeks to treat the 2 potential
range registers as Linux CXL "decoder" objects. The number of range
registers can be 1 or 2, while HDM decoder ranges can include more than
2.
Be careful not to confuse DVSEC range count with HDM capability decoder
count. Commit to range register earlier in devm_cxl_setup_hdm().
Otherwise, a device with more HDM decoders than range registers can set
@cxlhdm->decoder_count to an invalid value.
Avoid introducing a forward declaration by just moving the definition of
should_emulate_decoders() earlier in the file. should_emulate_decoders()
is unchanged.
Tested-by: Dave Jiang <[email protected]>
Fixes: d7a2153762c7 ("cxl/hdm: Add emulation when HDM decoders are not committed")
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/168012574932.221280.15944705098679646436.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <[email protected]>
|
|
Each time the contents of a given HPA are potentially changed in a cache
incoherent manner the CXL core sets CXL_REGION_F_INCOHERENT to
invalidate CPU caches before the region is used.
Successful invocation of attach_target() indicates that DPA has been
newly assigned to a given HPA in the dynamic region creation flow.
However, attach_target() is also reused in the autodiscovery flow where
the region was activated by platform firmware. In that case there is no
need to invalidate caches because that region is already in active use
and nothing about the autodiscovery flow modifies the HPA-to-DPA
relationship.
In the autodiscovery case cxl_region_attach() exits early after
determining the endpoint decoder is already correctly attached to the
region.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Reviewed-by: Fan Ni <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/168002858817.50647.1217607907088920888.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <[email protected]>
|
|
RCDs (CXL memory devices that link train without VH capability and show
up as root complex integrated endpoints), hide the presence of the link
between the endpoint and the host-bridge. The CXL region setup/teardown
paths assume that a link hop is present and go looking for at least one
'struct cxl_port' instance between the CXL root port-object and an
endpoint port-object leading to crashes of the form:
BUG: kernel NULL pointer dereference, address: 0000000000000008
[..]
RIP: 0010:cxl_region_setup_targets+0x3e9/0xae0 [cxl_core]
[..]
Call Trace:
<TASK>
cxl_region_attach+0x46c/0x7a0 [cxl_core]
cxl_create_region+0x20b/0x270 [cxl_core]
cxl_mock_mem_probe+0x641/0x800 [cxl_mock_mem]
platform_probe+0x5b/0xb0
Detect RCDs explicitly and skip walking the non-existent port hierarchy
between root and endpoint in that case.
While this has been a problem since:
commit 0a19bfc8de93 ("cxl/port: Add RCD endpoint port enumeration")
...it becomes a more reliable crash scenario with the new autodiscovery
implementation.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/168002858268.50647.728091521032131326.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <[email protected]>
|
|
The find_cxl_root() helper is used to lookup root decoders and other CXL
platform topology information for a given endpoint. It turns out that
for RCDs it has never worked. The result of find_cxl_root(&cxlmd->dev)
is always NULL for the RCH topology case because it expects to find a
cxl_port at the host-bridge. RCH topologies only have the root cxl_port
object with the host-bridge as a dport. While there are no reports of
this being a problem to date, by inspection region enumeration should
crash as a result of this problem, and it does in a local unit test for
this scenario.
However, an observation that ever since:
commit f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
...all callers of find_cxl_root() occur after the memdev connection to
the port topology has been established. That means that find_cxl_root()
can be simplified to a walk of the endpoint port topology to the root.
Switch to that arrangement which also fixes the RCD bug.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/168002857715.50647.344876437247313909.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <[email protected]>
|
|
If the driver is allowed to enable memory operation itself then it can
also turn on HDM decoder support at will.
With this the second call to cxl_setup_hdm_decoder_from_dvsec(), when
an HDM decoder is not committed, is not needed.
Fixes: b777e9bec960 ("cxl/hdm: Emulate HDM decoder from DVSEC range registers")
Link: http://lore.kernel.org/r/[email protected]
Reported-by: Jonathan Cameron <[email protected]>
Tested-by: Jonathan Cameron <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Fan Ni <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/167703068474.185722.664126485486344246.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <[email protected]>
|
|
devm_cxl_setup_emulated_hdm() reallocates an instance of @cxlhdm that
was already allocated at the start of devm_cxl_setup_hdm(). Only one is
needed and devm_cxl_setup_emulated_hdm() does not do enough to warrant
being an explicit helper.
Fixes: 4474ce565ee4 ("cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders")
Tested-by: Dave Jiang <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Link: https://lore.kernel.org/r/167703067936.185722.7908921750127154779.stgit@dwillia2-xfh.jf.intel.com
Link: https://lore.kernel.org/r/168012574357.221280.5001364964799725366.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <[email protected]>
|
|
Polling needs a bio with a valid bi_bdev, but neither of those are
guaranteed for polled driver requests. Make request based polling
directly use blk-mq's polling function instead.
When executing a request from a polled hctx, we know the request's
cookie, and that it's from a live blk-mq queue that supports polling, so
we can safely skip everything that bio_poll provides.
Cc: [email protected]
Reported-by: Martin Belanger <[email protected]>
Reported-by: Daniel Wagner <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Tested-by: Daniel Wagner <[email protected]>
Revieded-by: Daniel Wagner <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Tested-by: Shin'ichiro Kawasaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Pull kvm fixes from Paolo Bonzini:
"PPC:
- Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled
s390:
- Fix handling of external interrupts in protected guests
x86:
- Resample the pending state of IOAPIC interrupts when unmasking them
- Fix usage of Hyper-V "enlightened TLB" on AMD
- Small fixes to real mode exceptions
- Suppress pending MMIO write exits if emulator detects exception
Documentation:
- Fix rST syntax"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
docs: kvm: x86: Fix broken field list
KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE platform dependent
KVM: s390: pv: fix external interruption loop not always detected
KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
KVM: x86: Suppress pending MMIO write exits if emulator detects exception
KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking
KVM: irqfd: Make resampler_list an RCU list
KVM: SVM: Flush Hyper-V TLB when required
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix a crash and a resource leak in NFSv4 COMPOUND processing
- Fix issues with AUTH_SYS credential handling
- Try again to address an NFS/NFSD/SUNRPC build dependency regression
* tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: callback request does not use correct credential for AUTH_SYS
NFS: Remove "select RPCSEC_GSS_KRB5
sunrpc: only free unix grouplist after RCU settles
nfsd: call op_release, even when op_func returns an error
NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
|
|
Add a missing ":" to fix a broken field list.
Signed-off-by: Takahiro Itazuri <[email protected]>
Fixes: ba7bb663f554 ("KVM: x86: Provide per VM capability for disabling PMU virtualization")
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Code that passes a 32-bit constant into cmpxchg() produces a harmless
sparse warning because of the truncation in the branch that is not taken:
fs/erofs/zdata.c: note: in included file (through /home/arnd/arm-soc/arch/arm/include/asm/cmpxchg.h, /home/arnd/arm-soc/arch/arm/include/asm/atomic.h, /home/arnd/arm-soc/include/linux/atomic.h, ...):
include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe)
include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe)
include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe)
include/asm-generic/cmpxchg-local.h:30:42: warning: cast truncates bits from constant value (5f0edead becomes ad)
include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe)
include/asm-generic/cmpxchg-local.h:34:44: warning: cast truncates bits from constant value (5f0edead becomes dead)
This was reported as a regression to Matt's recent __generic_cmpxchg_local
patch, though this patch only added more warnings on top of the ones
that were already there.
Rewording the truncation to use an explicit bitmask instead of a cast
to a smaller type avoids the warning but otherwise leaves the code
unchanged.
I had another look at why the cast is even needed for atomic_cmpxchg(),
and as Matt describes the problem here is that atomic_t contains a
signed 'int', but cmpxchg() takes an 'unsigned long' argument, and
converting between the two leads to a 64-bit sign-extension of
negative 32-bit atomics.
I checked the other implementations of arch_cmpxchg() and did not find
any others that run into the same problem as __generic_cmpxchg_local(),
but it's easy to be on the safe side here and always convert the
signed int into an unsigned int when calling arch_cmpxchg(), as this
will work even when any of the arch_cmpxchg() implementations run
into the same problem.
Fixes: 624654152284 ("locking/atomic: cmpxchg: Make __generic_cmpxchg_local compare against zero-extended 'old' value")
Reviewed-by: Matt Evans <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Copy the forced type casts from the normal MMIO accessors to suppress
the sparse warnings that point out __raw_readl() returns a native endian
word (just like readl()).
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Commit c1d55d50139b ("asm-generic/io.h: Fix sparse warnings on
big-endian architectures") missed fixing the 64-bit accessors.
Arnd explains in the attached link why the casts are necessary, even if
__raw_readq() and __raw_writeq() do not take endian-specific types.
Link: https://lore.kernel.org/lkml/[email protected]/
Suggested-by: Arnd Bergmann <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Reset the FDIR counters when FDIR inits. Without this patch,
when VF initializes or resets, all the FDIR counters are not
cleaned, which may cause unexpected behaviors for future FDIR
rule create (e.g., rule conflict).
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Junfeng Guo <[email protected]>
Signed-off-by: Lingyu Liu <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When adding a FDIR filter, if ice_vc_fdir_set_irq_ctx returns failure,
the inserted fdir entry will not be removed and if ice_vc_fdir_write_fltr
returns failure, the fdir context info for irq handler will not be cleared
which may lead to inconsistent or memory leak issue. This patch refines
failure cases to resolve this issue.
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Simei Su <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Currently callback request does not use the credential specified in
CREATE_SESSION if the security flavor for the back channel is AUTH_SYS.
Problem was discovered by pynfs 4.1 DELEG5 and DELEG7 test with error:
DELEG5 st_delegation.testCBSecParms : FAILURE
expected callback with uid, gid == 17, 19, got 0, 0
Signed-off-by: Dai Ngo <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Fixes: 8276c902bbe9 ("SUNRPC: remove uid and gid from struct auth_cred")
Signed-off-by: Chuck Lever <[email protected]>
|
|
If CONFIG_CRYPTO=n (e.g. arm/shmobile_defconfig):
WARNING: unmet direct dependencies detected for RPCSEC_GSS_KRB5
Depends on [n]: NETWORK_FILESYSTEMS [=y] && SUNRPC [=y] && CRYPTO [=n]
Selected by [y]:
- NFS_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFS_FS [=y]
As NFSv4 can work without crypto enabled, remove the RPCSEC_GSS_KRB5
dependency altogether.
Trond says:
> It is possible to use the NFSv4.1 client with just AUTH_SYS, and
> in fact there are plenty of people out there using only that. The
> fact that RFC5661 gets its knickers in a twist about RPCSEC_GSS
> support is largely irrelevant to those people.
>
> The other issue is that ’select’ enforces the strict dependency
> that if the NFS client is compiled into the kernel, then the
> RPCSEC_GSS and kerberos code needs to be compiled in as well: they
> cannot exist as modules.
Fixes: e57d06527738 ("NFS & NFSD: Update GSS dependencies")
Reported-by: kernel test robot <[email protected]>
Reported-by: Niklas Söderlund <[email protected]>
Suggested-by: Trond Myklebust <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
|
|
While the unix_gid object is rcu-freed, the group_info list that it
contains is not. Ensure that we only put the group list reference once
we are really freeing the unix_gid object.
Reported-by: Zhi Li <[email protected]>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2183056
Signed-off-by: Jeff Layton <[email protected]>
Fixes: fd5d2f78261b ("SUNRPC: Make server side AUTH_UNIX use lockless lookups")
Signed-off-by: Chuck Lever <[email protected]>
|
|
stmmac_reinit_queues() fails to fix up the RX hash. Even if the number
of channels gets restricted, the output of `ethtool -x' indicates that
all RX queues are used:
$ ethtool -l enp0s29f2
Channel parameters for enp0s29f2:
Pre-set maximums:
RX: 8
TX: 8
Other: n/a
Combined: n/a
Current hardware settings:
RX: 8
TX: 8
Other: n/a
Combined: n/a
$ ethtool -x enp0s29f2
RX flow hash indirection table for enp0s29f2 with 8 RX ring(s):
0: 0 1 2 3 4 5 6 7
8: 0 1 2 3 4 5 6 7
[...]
$ ethtool -L enp0s29f2 rx 3
$ ethtool -x enp0s29f2
RX flow hash indirection table for enp0s29f2 with 3 RX ring(s):
0: 0 1 2 3 4 5 6 7
8: 0 1 2 3 4 5 6 7
[...]
Fix this by setting the indirection table according to the number
of specified queues. The result is now as expected:
$ ethtool -L enp0s29f2 rx 3
$ ethtool -x enp0s29f2
RX flow hash indirection table for enp0s29f2 with 3 RX ring(s):
0: 0 1 2 0 1 2 0 1
8: 2 0 1 2 0 1 2 0
[...]
Tested on Intel Elkhart Lake.
Fixes: 0366f7e06a6b ("net: stmmac: add ethtool support for get/set channels")
Signed-off-by: Corinna Vinschen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
If batch->end is 0 then setting npfns[0] before computing the new value of
pfns will fail to adjust the pfn and result in various page accounting
corruptions. It should be ordered after.
This seems to result in various kinds of page meta-data corruption related
failures:
WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740
Modules linked in:
CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_folio+0x503/0x740
Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8 2f bf dd ff 48 8b 5d c8 31 ff 48 89
RSP: 0018:ffffc90000f37908 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26
RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002
RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008
R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540
R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540
FS: 00007fecbf659740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
internal_get_user_pages_fast+0xd32/0x2200
pin_user_pages_fast+0x65/0x90
pfn_reader_user_pin+0x376/0x390
pfn_reader_next+0x14a/0x7b0
pfn_reader_first+0x140/0x1b0
iopt_area_fill_domain+0x74/0x210
iopt_table_add_domain+0x30e/0x6e0
iommufd_device_selftest_attach+0x7f/0x140
iommufd_test+0x10ff/0x16f0
iommufd_fops_ioctl+0x206/0x330
__x64_sys_ioctl+0x10e/0x160
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Cc: <[email protected]>
Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Kevin Tian <[email protected]>
Reported-by: Pengfei Xu <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
syzkaller found that the calculation of batch_last_index should use
'start_index' since at input to this function the batch is either empty or
it has already been adjusted to cross any accesses so it will start at the
point we are unmapping from.
Getting this wrong causes the unmap to run over the end of the pages
which corrupts pages that were never mapped. In most cases this triggers
the num pinned debugging:
WARNING: CPU: 0 PID: 557 at drivers/iommu/iommufd/pages.c:294 __iopt_area_unfill_domain+0x152/0x560
Modules linked in:
CPU: 0 PID: 557 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__iopt_area_unfill_domain+0x152/0x560
Code: d2 0f ff 44 8b 64 24 54 48 8b 44 24 48 31 ff 44 89 e6 48 89 44 24 38 e8 fc d3 0f ff 45 85 e4 0f 85 eb 01 00 00 e8 0e d2 0f ff <0f> 0b e8 07 d2 0f ff 48 8b 44 24 38 89 5c 24 58 89 18 8b 44 24 54
RSP: 0018:ffffc9000108baf0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000ffffffff RCX: ffffffff821e3f85
RDX: 0000000000000000 RSI: ffff88800faf0000 RDI: 0000000000000002
RBP: ffffc9000108bd18 R08: 000000000003ca25 R09: 0000000000000014
R10: 000000000003ca00 R11: 0000000000000024 R12: 0000000000000004
R13: 0000000000000801 R14: 00000000000007ff R15: 0000000000000800
FS: 00007f3499ce1740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000243 CR3: 00000000179c2001 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
iopt_area_unfill_domain+0x32/0x40
iopt_table_remove_domain+0x23f/0x4c0
iommufd_device_selftest_detach+0x3a/0x90
iommufd_selftest_destroy+0x55/0x70
iommufd_object_destroy_user+0xce/0x130
iommufd_destroy+0xa2/0xc0
iommufd_fops_ioctl+0x206/0x330
__x64_sys_ioctl+0x10e/0x160
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Also add some useful WARN_ON sanity checks.
Cc: <[email protected]>
Fixes: 8d160cd4d506 ("iommufd: Algorithms for PFN storage")
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Kevin Tian <[email protected]>
Reported-by: Pengfei Xu <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
syzkaller found that setting up a map with a user VA that wraps past zero
can trigger WARN_ONs, particularly from pin_user_pages weirdly returning 0
due to invalid arguments.
Prevent creating a pages with a uptr and size that would math overflow.
WARNING: CPU: 0 PID: 518 at drivers/iommu/iommufd/pages.c:793 pfn_reader_user_pin+0x2e6/0x390
Modules linked in:
CPU: 0 PID: 518 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:pfn_reader_user_pin+0x2e6/0x390
Code: b1 11 e9 25 fe ff ff e8 28 e4 0f ff 31 ff 48 89 de e8 2e e6 0f ff 48 85 db 74 0a e8 14 e4 0f ff e9 4d ff ff ff e8 0a e4 0f ff <0f> 0b bb f2 ff ff ff e9 3c ff ff ff e8 f9 e3 0f ff ba 01 00 00 00
RSP: 0018:ffffc90000f9fa30 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff821e2b72
RDX: 0000000000000000 RSI: ffff888014184680 RDI: 0000000000000002
RBP: ffffc90000f9fa78 R08: 00000000000000ff R09: 0000000079de6f4e
R10: ffffc90000f9f790 R11: ffff888014185418 R12: ffffc90000f9fc60
R13: 0000000000000002 R14: ffff888007879800 R15: 0000000000000000
FS: 00007f4227555740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000043 CR3: 000000000e748005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
pfn_reader_next+0x14a/0x7b0
? interval_tree_double_span_iter_update+0x11a/0x140
pfn_reader_first+0x140/0x1b0
iopt_pages_rw_slow+0x71/0x280
? __this_cpu_preempt_check+0x20/0x30
iopt_pages_rw_access+0x2b2/0x5b0
iommufd_access_rw+0x19f/0x2f0
iommufd_test+0xd11/0x16f0
? write_comp_data+0x2f/0x90
iommufd_fops_ioctl+0x206/0x330
__x64_sys_ioctl+0x10e/0x160
? __pfx_iommufd_fops_ioctl+0x10/0x10
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Cc: <[email protected]>
Fixes: 8d160cd4d506 ("iommufd: Algorithms for PFN storage")
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Kevin Tian <[email protected]>
Reported-by: Pengfei Xu <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In the am65_cpsw_nuss_probe() function's cleanup path, the call to
of_platform_device_destroy() for the common->mdio_dev device is invoked
unconditionally. It is possible that either the MDIO node is not present
in the device-tree, or the MDIO node is disabled in the device-tree. In
both these cases, the MDIO device is not created, resulting in a NULL
pointer dereference when the of_platform_device_destroy() function is
invoked on the common->mdio_dev device on the cleanup path.
Fix this by ensuring that the common->mdio_dev device exists, before
attempting to invoke of_platform_device_destroy().
Fixes: a45cfcc69a25 ("net: ethernet: ti: am65-cpsw-nuss: use of_platform_device_create() for mdio")
Signed-off-by: Siddharth Vadapalli <[email protected]>
Reviewed-by: Roger Quadros <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
After a pci_doe_task completes, its work_struct needs to be destroyed
to avoid a memory leak with CONFIG_DEBUG_OBJECTS=y.
Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
Tested-by: Ira Weiny <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Cc: [email protected] # v6.0+
Acked-by: Bjorn Helgaas <[email protected]>
Link: https://lore.kernel.org/r/775768b4912531c3b887d405fc51a50e465e1bf9.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <[email protected]>
|
|
Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
probing because pci_doe_submit_task() invokes INIT_WORK() instead of
INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
All callers of pci_doe_submit_task() allocate the work_struct on the
stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
short-term fix.
The long-term fix implemented by a subsequent commit is to move to a
synchronous API which allocates the work_struct internally in the DOE
library.
Stacktrace for posterity:
WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
pci_doe_submit_task+0x5d/0xd0
pci_doe_discovery+0xb4/0x100
pcim_doe_create_mb+0x219/0x290
cxl_pci_probe+0x192/0x430
local_pci_probe+0x41/0x80
pci_device_probe+0xb3/0x220
really_probe+0xde/0x380
__driver_probe_device+0x78/0x170
driver_probe_device+0x1f/0x90
__driver_attach_async_helper+0x5c/0xe0
async_run_entry_fn+0x30/0x130
process_one_work+0x294/0x5b0
Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
Link: https://lore.kernel.org/linux-cxl/[email protected]/
Reported-by: Gregory Price <[email protected]>
Tested-by: Ira Weiny <[email protected]>
Tested-by: Gregory Price <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Gregory Price <[email protected]>
Cc: [email protected] # v6.0+
Reviewed-by: Jonathan Cameron <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Link: https://lore.kernel.org/r/67a9117f463ecdb38a2dbca6a20391ce2f1e7a06.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <[email protected]>
|
|
If the length in the CDAT header is larger than the concatenation of the
header and all table entries, then the CDAT exposed to user space
contains trailing null bytes.
Not every consumer may be able to handle that. Per Postel's robustness
principle, "be liberal in what you accept" and silently reduce the
cached length to avoid exposing those null bytes.
Fixes: c97006046c79 ("cxl/port: Read CDAT table")
Tested-by: Ira Weiny <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Cc: [email protected] # v6.0+
Link: https://lore.kernel.org/r/6d98b3c7da5343172bd3ccabfabbc1f31c079d74.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <[email protected]>
|
|
If truncated CDAT entries are received from a device, the concatenation
of those entries constitutes a corrupt CDAT, yet is happily exposed to
user space.
Avoid by verifying response lengths and erroring out if truncation is
detected.
The last CDAT entry may still be truncated despite the checks introduced
herein if the length in the CDAT header is too small. However, that is
easily detectable by user space because it reaches EOF prematurely.
A subsequent commit which rightsizes the CDAT response allocation closes
that remaining loophole.
The two lines introduced here which exceed 80 chars are shortened to
less than 80 chars by a subsequent commit which migrates to a
synchronous DOE API and replaces "t.task.rv" by "rc".
The existing acpi_cdat_header and acpi_table_cdat struct definitions
provided by ACPICA cannot be used because they do not employ __le16 or
__le32 types. I believe that cannot be changed because those types are
Linux-specific and ACPI is specified for little endian platforms only,
hence doesn't care about endianness. So duplicate the structs.
Fixes: c97006046c79 ("cxl/port: Read CDAT table")
Tested-by: Ira Weiny <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Cc: [email protected] # v6.0+
Link: https://lore.kernel.org/r/bce3aebc0e8e18a1173425a7a865b232c3912963.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <[email protected]>
|
|
cxl_cdat_get_length() only checks whether the DOE response size is
sufficient for the Table Access response header (1 dword), but not the
succeeding CDAT header (1 dword length plus other fields).
It thus returns whatever uninitialized memory happens to be on the stack
if a truncated DOE response with only 1 dword was received. Fix it.
Fixes: c97006046c79 ("cxl/port: Read CDAT table")
Reported-by: Ming Li <[email protected]>
Tested-by: Ira Weiny <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Ming Li <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Cc: [email protected] # v6.0+
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Link: https://lore.kernel.org/r/000e69cd163461c8b1bc2cf4155b6e25402c29c7.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs fix from Christian Brauner:
"When a mount or mount tree is made shared the vfs allocates new peer
group ids for all mounts that have no peer group id set. Only mounts
that aren't marked with MNT_SHARED are relevant here as MNT_SHARED
indicates that the mount has fully transitioned to a shared mount. The
peer group id handling is done with namespace lock held.
On failure, the peer group id settings of mounts for which a new peer
group id was allocated need to be reverted and the allocated peer
group id freed. The cleanup_group_ids() helper can identify the mounts
to cleanup by checking whether a given mount has a peer group id set
but isn't marked MNT_SHARED. The deallocation always needs to happen
with namespace lock held to protect against concurrent modifications
of the propagation settings.
This fixes the one place where the namespace lock was dropped before
calling cleanup_group_ids()"
* tag 'vfs.misc.fixes.v6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fs: drop peer group ids under namespace lock
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Fix a bug in channel allocation for VMbus (Mohammed Gamal)
- Do not allow root partition functionality in CVM (Michael Kelley)
* tag 'hyperv-fixes-signed-20230402' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Block root partition functionality in a Confidential VM
Drivers: vmbus: Check for channel allocation before looking up relids
|
|
A __field() in the TRACE_EVENT() macro is used to set up the fields of the
trace event data. It is for single storage units (word, char, int,
pointer, etc) and not for complex structures or arrays. Unfortunately,
there's nothing preventing the build from accepting:
__field(int, arr[5]);
from building. It will turn into a array value. This use to work fine, as
the offset and size use to be determined by the macro using the field name,
but things have changed and the offset and size are now determined by the
type. So the above would only be size 4, and the next field will be
located 4 bytes from it (instead of 20).
The proper way to declare static arrays is to use the __array() macro.
Instead of __field(int, arr[5]) it should be __array(int, arr, 5).
Add some macro tricks to the building of a trace event from the
TRACE_EVENT() macro such that __field(int, arr[5]) will fail to build. A
comment by the failure will explain why the build failed.
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Reported-by: Douglas RAILLARD <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
|
|
osnoise/timerlat tracers are reporting new max latency on instances
where the tracing is off, creating inconsistencies between the max
reported values in the trace and in the tracing_max_latency. Thus
only report new tracing_max_latency on active tracing instances.
Link: https://lkml.kernel.org/r/ecd109fde4a0c24ab0f00ba1e9a144ac19a91322.1680104184.git.bristot@kernel.org
Cc: [email protected]
Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr")
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
timerlat is not reporting a new tracing_max_latency for the thread
latency. The reason is that it is not calling notify_new_max_latency()
function after the new thread latency is sampled.
Call notify_new_max_latency() after computing the thread latency.
Link: https://lkml.kernel.org/r/16e18d61d69073d0192ace07bf61e405cca96e9c.1680104184.git.bristot@kernel.org
Cc: [email protected]
Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr")
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
If the compiler decides not to inline this function then preemption
tracing will always show an IP inside the preemption disabling path and
never the function actually calling preempt_{enable,disable}.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Fixes: f904f58263e1d ("sched/debug: Fix preempt_disable_ip recording for preempt_disable()")
Signed-off-by: John Keeping <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When user reads file 'trace_pipe', kernel keeps printing following logs
that warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in
rb_get_reader_page(). It just looks like there's an infinite loop in
tracing_read_pipe(). This problem occurs several times on arm64 platform
when testing v5.10 and below.
Call trace:
rb_get_reader_page+0x248/0x1300
rb_buffer_peek+0x34/0x160
ring_buffer_peek+0xbc/0x224
peek_next_entry+0x98/0xbc
__find_next_entry+0xc4/0x1c0
trace_find_next_entry_inc+0x30/0x94
tracing_read_pipe+0x198/0x304
vfs_read+0xb4/0x1e0
ksys_read+0x74/0x100
__arm64_sys_read+0x24/0x30
el0_svc_common.constprop.0+0x7c/0x1bc
do_el0_svc+0x2c/0x94
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb4
el0_sync+0x160/0x180
Then I dump the vmcore and look into the problematic per_cpu ring_buffer,
I found that tail_page/commit_page/reader_page are on the same page while
reader_page->read is obviously abnormal:
tail_page == commit_page == reader_page == {
.write = 0x100d20,
.read = 0x8f9f4805, // Far greater than 0xd20, obviously abnormal!!!
.entries = 0x10004c,
.real_end = 0x0,
.page = {
.time_stamp = 0x857257416af0,
.commit = 0xd20, // This page hasn't been full filled.
// .data[0...0xd20] seems normal.
}
}
The root cause is most likely the race that reader and writer are on the
same page while reader saw an event that not fully committed by writer.
To fix this, add memory barriers to make sure the reader can see the
content of what is committed. Since commit a0fcaaed0c46 ("ring-buffer: Fix
race between reset page and reading page") has added the read barrier in
rb_get_reader_page(), here we just need to add the write barrier.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Fixes: 77ae365eca89 ("ring-buffer: make lockless")
Suggested-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Zheng Yejian <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Currently, the "last_cmd" variable can be accessed by multiple processes
asynchronously when multiple users manipulate synthetic_events node
at the same time, it could lead to use-after-free or double-free.
This patch add "lastcmd_mutex" to prevent "last_cmd" from being accessed
asynchronously.
================================================================
It's easy to reproduce in the KASAN environment by running the two
scripts below in different shells.
script 1:
while :
do
echo -n -e '\x88' > /sys/kernel/tracing/synthetic_events
done
script 2:
while :
do
echo -n -e '\xb0' > /sys/kernel/tracing/synthetic_events
done
================================================================
double-free scenario:
process A process B
------------------- ---------------
1.kstrdup last_cmd
2.free last_cmd
3.free last_cmd(double-free)
================================================================
use-after-free scenario:
process A process B
------------------- ---------------
1.kstrdup last_cmd
2.free last_cmd
3.tracing_log_err(use-after-free)
================================================================
Appendix 1. KASAN report double-free:
BUG: KASAN: double-free in kfree+0xdc/0x1d4
Free of addr ***** by task sh/4879
Call trace:
...
kfree+0xdc/0x1d4
create_or_delete_synth_event+0x60/0x1e8
trace_parse_run_command+0x2bc/0x4b8
synth_events_write+0x20/0x30
vfs_write+0x200/0x830
...
Allocated by task 4879:
...
kstrdup+0x5c/0x98
create_or_delete_synth_event+0x6c/0x1e8
trace_parse_run_command+0x2bc/0x4b8
synth_events_write+0x20/0x30
vfs_write+0x200/0x830
...
Freed by task 5464:
...
kfree+0xdc/0x1d4
create_or_delete_synth_event+0x60/0x1e8
trace_parse_run_command+0x2bc/0x4b8
synth_events_write+0x20/0x30
vfs_write+0x200/0x830
...
================================================================
Appendix 2. KASAN report use-after-free:
BUG: KASAN: use-after-free in strlen+0x5c/0x7c
Read of size 1 at addr ***** by task sh/5483
sh: CPU: 7 PID: 5483 Comm: sh
...
__asan_report_load1_noabort+0x34/0x44
strlen+0x5c/0x7c
tracing_log_err+0x60/0x444
create_or_delete_synth_event+0xc4/0x204
trace_parse_run_command+0x2bc/0x4b8
synth_events_write+0x20/0x30
vfs_write+0x200/0x830
...
Allocated by task 5483:
...
kstrdup+0x5c/0x98
create_or_delete_synth_event+0x80/0x204
trace_parse_run_command+0x2bc/0x4b8
synth_events_write+0x20/0x30
vfs_write+0x200/0x830
...
Freed by task 5480:
...
kfree+0xdc/0x1d4
create_or_delete_synth_event+0x74/0x204
trace_parse_run_command+0x2bc/0x4b8
synth_events_write+0x20/0x30
vfs_write+0x200/0x830
...
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Fixes: 27c888da9867 ("tracing: Remove size restriction on synthetic event cmd error logging")
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: AngeloGioacchino Del Regno <[email protected]>
Cc: "Tom Zanussi" <[email protected]>
Signed-off-by: Tze-nan Wu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add the IRQCHIP_SKIP_SET_WAKE flag since there are no special IRQ Wake
bits that can be set to enable wakeup IRQ.
Fixes: 3d9edf09d452 ("[ARM] 4457/2: davinci: GPIO support")
Signed-off-by: Dhruva Gole <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
The interrupt enable bits might be set if we want to use the GPIO as
wakeup source. Clearing this will mean disabling of interrupts in the GPIO
banks that we may want to wakeup from.
Thus remove the line that was clearing this bit from the driver's save
context function.
Cc: Devarsh Thakkar <[email protected]>
Fixes: 0651a730924b ("gpio: davinci: Add support for system suspend/resume PM")
Signed-off-by: Dhruva Gole <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Acked-by: Keerthy <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
This helps avoid cleartext leakage of already queued or powersave buffered
packets, when a reassoc triggers the key deletion.
Cc: [email protected]
Signed-off-by: Felix Fietkau <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Currently ath11k breaks after hibernation, the reason being that ath11k expects
that the wireless device will have power during suspend and the firmware will
continue running. But of course during hibernation the power from the device is
cut off and firmware is not running when resuming, so ath11k will fail.
(The reason why ath11k needs the firmware running is the interaction between
mac80211 and MHI stack, it's a long story and more info in the bugzilla report.)
In SUSE kernels the watchdog timeout is reduced from the default 120 to 60 seconds:
CONFIG_DPM_WATCHDOG_TIMEOUT=60
But as the ath11k MHI timeout is 90 seconds the kernel will crash before will
ath11k will recover in resume callback. To avoid the crash reduce the MHI
timeout to just 20 seconds.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214649
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
From the commit message adding the first s2idle quirks:
> Lenovo laptops that contain NVME SSDs across a variety of generations have
> trouble resuming from suspend to idle when the IOMMU translation layer is
> active for the NVME storage device.
>
> This generally manifests as a large resume delay or page faults. These
> delays and page faults occur as a result of a Lenovo BIOS specific SMI
> that runs during the D3->D0 transition on NVME devices.
Add the DMI ids for another variant of the T14s Gen1, which also needs
the s2idle quirk.
Link: https://lore.kernel.org/all/[email protected]/
Link: https://bbs.archlinux.org/viewtopic.php?pid=2084655#p2084655
Signed-off-by: Benjamin Asbach <[email protected]>
Tested-by: Benjamin Asbach <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
On ThinkStations on retrieving the attribute value the BIOS appends the
possible values to the string.
Clean up the display in the current_value_show function so the options
part is not displayed.
Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms")
Reported by Mario Limoncello <[email protected]>
Link: https://github.com/fwupd/fwupd/issues/5077#issuecomment-1488730526
Signed-off-by: Mark Pearson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Mario Limonciello <[email protected]>
Tested-by: Mirsad Goran Todorovac <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
My previous commit introduced a memory leak where the item allocated
from tlmi_setting was not freed.
This commit also renames it to avoid confusion with the similarly name
variable in the same function.
Fixes: 8a02d70679fc ("platform/x86: think-lmi: Add possible_values for ThinkStation")
Reported-by: Mirsad Todorovac <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/T/
Signed-off-by: Mark Pearson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Mario Limonciello <[email protected]>
Tested-by: Mirsad Goran Todorovac <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
When retriving a item string with tlmi_setting(), the result has to be
freed using kfree(). In current_value_show() however, malformed
item strings are not freed, causing a memory leak.
Fix this by eliminating the early return responsible for this.
Reported-by: Mirsad Goran Todorovac <[email protected]>
Link: https://lore.kernel.org/platform-driver-x86/[email protected]/T/#t
Tested-by: Mirsad Goran Todorovac <[email protected]>
Fixes: 0fdf10e5fc96 ("platform/x86: think-lmi: Split current_value to reflect only the value")
Signed-off-by: Armin Wolf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Mario Limonciello <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
Syzbot reported a bug as following:
=====================================================
BUG: KMSAN: uninit-value in arch_atomic64_inc arch/x86/include/asm/atomic64_64.h:88 [inline]
BUG: KMSAN: uninit-value in arch_atomic_long_inc include/linux/atomic/atomic-long.h:161 [inline]
BUG: KMSAN: uninit-value in atomic_long_inc include/linux/atomic/atomic-instrumented.h:1429 [inline]
BUG: KMSAN: uninit-value in __ip6_make_skb+0x2f37/0x30f0 net/ipv6/ip6_output.c:1956
arch_atomic64_inc arch/x86/include/asm/atomic64_64.h:88 [inline]
arch_atomic_long_inc include/linux/atomic/atomic-long.h:161 [inline]
atomic_long_inc include/linux/atomic/atomic-instrumented.h:1429 [inline]
__ip6_make_skb+0x2f37/0x30f0 net/ipv6/ip6_output.c:1956
ip6_finish_skb include/net/ipv6.h:1122 [inline]
ip6_push_pending_frames+0x10e/0x550 net/ipv6/ip6_output.c:1987
rawv6_push_pending_frames+0xb12/0xb90 net/ipv6/raw.c:579
rawv6_sendmsg+0x297e/0x2e60 net/ipv6/raw.c:922
inet_sendmsg+0x101/0x180 net/ipv4/af_inet.c:827
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
____sys_sendmsg+0xa8e/0xe70 net/socket.c:2476
___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2530
__sys_sendmsg net/socket.c:2559 [inline]
__do_sys_sendmsg net/socket.c:2568 [inline]
__se_sys_sendmsg net/socket.c:2566 [inline]
__x64_sys_sendmsg+0x367/0x540 net/socket.c:2566
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was created at:
slab_post_alloc_hook mm/slab.h:766 [inline]
slab_alloc_node mm/slub.c:3452 [inline]
__kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491
__do_kmalloc_node mm/slab_common.c:967 [inline]
__kmalloc_node_track_caller+0x114/0x3b0 mm/slab_common.c:988
kmalloc_reserve net/core/skbuff.c:492 [inline]
__alloc_skb+0x3af/0x8f0 net/core/skbuff.c:565
alloc_skb include/linux/skbuff.h:1270 [inline]
__ip6_append_data+0x51c1/0x6bb0 net/ipv6/ip6_output.c:1684
ip6_append_data+0x411/0x580 net/ipv6/ip6_output.c:1854
rawv6_sendmsg+0x2882/0x2e60 net/ipv6/raw.c:915
inet_sendmsg+0x101/0x180 net/ipv4/af_inet.c:827
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
____sys_sendmsg+0xa8e/0xe70 net/socket.c:2476
___sys_sendmsg+0x2a1/0x3f0 net/socket.c:2530
__sys_sendmsg net/socket.c:2559 [inline]
__do_sys_sendmsg net/socket.c:2568 [inline]
__se_sys_sendmsg net/socket.c:2566 [inline]
__x64_sys_sendmsg+0x367/0x540 net/socket.c:2566
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
It is because icmp6hdr does not in skb linear region under the scenario
of SOCK_RAW socket. Access icmp6_hdr(skb)->icmp6_type directly will
trigger the uninit variable access bug.
Use a local variable icmp6_type to carry the correct value in different
scenarios.
Fixes: 14878f75abd5 ("[IPV6]: Add ICMPMsgStats MIB (RFC 4293) [rev 2]")
Reported-by: [email protected]
Link: https://syzkaller.appspot.com/bug?id=3d605ec1d0a7f2a269a1a6936ac7f2b85975ee9c
Signed-off-by: Ziyang Xuan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
On the remote side, when QRTR socket is removed, af_qrtr will call
qrtr_port_remove() which broadcasts the DEL_CLIENT packet to all neighbours
including local NS. NS upon receiving the DEL_CLIENT packet, will remove
the lookups associated with the node:port and broadcasts the DEL_SERVER
packet.
But on the host side, due to the arrival of the DEL_CLIENT packet, the NS
would've already deleted the server belonging to that port. So when the
remote's NS again broadcasts the DEL_SERVER for that port, it throws below
error message on the host:
"failed while handling packet from 2:-2"
So fix this error by not broadcasting the DEL_SERVER packet when the
DEL_CLIENT packet gets processed."
Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace")
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Ram Kumar Dharuman <[email protected]>
Signed-off-by: Sricharan Ramabadhran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The HG MXPD-483II 1310nm SFP module is meant to operate with 2500Base-X,
however, in their EEPROM they incorrectly specify:
Transceiver type : Ethernet: 1000BASE-LX
...
BR, Nominal : 2600MBd
Use sfp_quirk_2500basex for this module to allow 2500Base-X mode anyway.
https://forum.banana-pi.org/t/bpi-r3-sfp-module-compatibility/14573/60
Reported-by: chowtom <[email protected]>
Tested-by: chowtom <[email protected]>
Signed-off-by: Daniel Golle <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When considering whether to mark one context as stopped and another as
started we need to look at whether the previous and new _contexts_ are
different and not just requests. Otherwise the software tracked context
start time was incorrectly updated to the most recent lite-restore time-
stamp, which was in some cases resulting in active time going backward,
until the context switch (typically the heartbeat pulse) would synchronise
with the hardware tracked context runtime. Easiest use case to observe
this behaviour was with a full screen clients with close to 100% engine
load.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Fixes: bb6287cb1886 ("drm/i915: Track context current active time")
Cc: <[email protected]> # v5.19+
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
[tursulin: Fix spelling in commit msg.]
(cherry picked from commit b3e70051879c665acdd3a1ab50d0ed58d6a8001f)
Signed-off-by: Jani Nikula <[email protected]>
|
|
Userspace can guess the id value and try to race oa_config object creation
with config remove, resulting in a use-after-free if we dereference the
object after unlocking the metrics_lock. For that reason, unlocking the
metrics_lock must be done after we are done dereferencing the object.
Signed-off-by: Min Li <[email protected]>
Fixes: f89823c21224 ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface")
Cc: <[email protected]> # v4.14+
Reviewed-by: Andi Shyti <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
[tursulin: Manually added stable tag.]
(cherry picked from commit 49f6f6483b652108bcb73accd0204a464b922395)
Signed-off-by: Jani Nikula <[email protected]>
|