Age | Commit message (Collapse) | Author | Files | Lines |
|
Merge a runtime PM framework cleanup and fix related to device links.
* pm-core:
PM: runtime: Fix supplier device management during consumer probe
PM: runtime: Redefine pm_runtime_release_supplier()
|
|
Pull block fixes from Jens Axboe:
"NVMe pull request with another id quirk addition, and a tracing fix"
* tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
nvme: use struct group for generic command dwords
nvme-pci: phison e16 has bogus namespace ids
|
|
Pull io_uring tweak from Jens Axboe:
"Just a minor tweak to an addition made in this release cycle: padding
a 32-bit value that's in a 64-bit union to avoid any potential
funkiness from that"
* tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
io_uring: explicit sqe padding for ioctl commands
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes from Helge Deller:
- fbcon now prevents switching to screen resolutions which are smaller
than the font size, and prevents enabling a font which is bigger than
the current screen resolution. This fixes vmalloc-out-of-bounds
accesses found by KASAN.
- Guiling Deng fixed a bug where the centered fbdev logo wasn't
displayed correctly if the screen size matched the logo size.
- Hsin-Yi Wang provided a patch to include errno.h to fix build when
CONFIG_OF isn't enabled.
* tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
fbmem: Check virtual screen sizes in fb_set_var()
fbcon: Prevent that screen size is smaller than font size
fbcon: Disallow setting font bigger than screen size
video: of_display_timing.h: include errno.h
fbdev: fbmem: Fix logo center image dx issue
|
|
We have an optimization in do_zone_finish() to send REQ_OP_ZONE_FINISH only
when necessary, i.e. we don't send REQ_OP_ZONE_FINISH when we assume we
wrote fully into the zone.
The assumption is determined by "alloc_offset == capacity". This condition
won't work if the last ordered extent is canceled due to some errors. In
that case, we consider the zone is deactivated without sending the finish
command while it's still active.
This inconstancy results in activating another block group while we cannot
really activate the underlying zone, which causes the active zone exceeds
errors like below.
BTRFS error (device nvme3n2): allocation failed flags 1, wanted 520192 tree-log 0, relocation: 0
nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
Fix the issue by removing the optimization for now.
Fixes: 8376d9e1ed8f ("btrfs: zoned: finish superblock zone once no space left for new SB")
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Naohiro Aota <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
The bioc would leak on the normal completion path and also on the RAID56
check (but that one won't happen in practice due to the invalid
combination with zoned mode).
Fixes: 7db1c5d14dcd ("btrfs: zoned: support dev-replace in zoned filesystems")
CC: [email protected] # 5.16+
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
[ update changelog ]
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
extents
When doing a direct IO read or write, we always return -ENOTBLK when we
find a compressed extent (or an inline extent) so that we fallback to
buffered IO. This however is not ideal in case we are in a NOWAIT context
(io_uring for example), because buffered IO can block and we currently
have no support for NOWAIT semantics for buffered IO, so if we need to
fallback to buffered IO we should first signal the caller that we may
need to block by returning -EAGAIN instead.
This behaviour can also result in short reads being returned to user
space, which although it's not incorrect and user space should be able
to deal with partial reads, it's somewhat surprising and even some popular
applications like QEMU (Link tag #1) and MariaDB (Link tag #2) don't
deal with short reads properly (or at all).
The short read case happens when we try to read from a range that has a
non-compressed and non-inline extent followed by a compressed extent.
After having read the first extent, when we find the compressed extent we
return -ENOTBLK from btrfs_dio_iomap_begin(), which results in iomap to
treat the request as a short read, returning 0 (success) and waiting for
previously submitted bios to complete (this happens at
fs/iomap/direct-io.c:__iomap_dio_rw()). After that, and while at
btrfs_file_read_iter(), we call filemap_read() to use buffered IO to
read the remaining data, and pass it the number of bytes we were able to
read with direct IO. Than at filemap_read() if we get a page fault error
when accessing the read buffer, we return a partial read instead of an
-EFAULT error, because the number of bytes previously read is greater
than zero.
So fix this by returning -EAGAIN for NOWAIT direct IO when we find a
compressed or an inline extent.
Reported-by: Dominique MARTINET <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
Link: https://jira.mariadb.org/browse/MDEV-27900?focusedCommentId=216582&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-216582
Tested-by: Dominique MARTINET <[email protected]>
CC: [email protected] # 5.10+
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This cycle we added support for mounting overlayfs on top of idmapped
mounts. Recently I've started looking into potential corner cases when
trying to add additional tests and I noticed that reporting for POSIX ACLs
is currently wrong when using idmapped layers with overlayfs mounted on top
of it.
I have sent out an patch that fixes this and makes POSIX ACLs work
correctly but the patch is a bit bigger and we're already at -rc5 so I
recommend we simply don't raise SB_POSIXACL when idmapped layers are
used. Then we can fix the VFS part described below for the next merge
window so we can have good exposure in -next.
I'm going to give a rather detailed explanation to both the origin of the
problem and mention the solution so people know what's going on.
Let's assume the user creates the following directory layout and they have
a rootfs /var/lib/lxc/c1/rootfs. The files in this rootfs are owned as you
would expect files on your host system to be owned. For example, ~/.bashrc
for your regular user would be owned by 1000:1000 and /root/.bashrc would
be owned by 0:0. IOW, this is just regular boring filesystem tree on an
ext4 or xfs filesystem.
The user chooses to set POSIX ACLs using the setfacl binary granting the
user with uid 4 read, write, and execute permissions for their .bashrc
file:
setfacl -m u:4:rwx /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc
Now they to expose the whole rootfs to a container using an idmapped
mount. So they first create:
mkdir -pv /vol/contpool/{ctrover,merge,lowermap,overmap}
mkdir -pv /vol/contpool/ctrover/{over,work}
chown 10000000:10000000 /vol/contpool/ctrover/{over,work}
The user now creates an idmapped mount for the rootfs:
mount-idmapped/mount-idmapped --map-mount=b:0:10000000:65536 \
/var/lib/lxc/c2/rootfs \
/vol/contpool/lowermap
This for example makes it so that
/var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc which is owned by uid and gid
1000 as being owned by uid and gid 10001000 at
/vol/contpool/lowermap/home/ubuntu/.bashrc.
Assume the user wants to expose these idmapped mounts through an overlayfs
mount to a container.
mount -t overlay overlay \
-o lowerdir=/vol/contpool/lowermap, \
upperdir=/vol/contpool/overmap/over, \
workdir=/vol/contpool/overmap/work \
/vol/contpool/merge
The user can do this in two ways:
(1) Mount overlayfs in the initial user namespace and expose it to the
container.
(2) Mount overlayfs on top of the idmapped mounts inside of the container's
user namespace.
Let's assume the user chooses the (1) option and mounts overlayfs on the
host and then changes into a container which uses the idmapping
0:10000000:65536 which is the same used for the two idmapped mounts.
Now the user tries to retrieve the POSIX ACLs using the getfacl command
getfacl -n /vol/contpool/lowermap/home/ubuntu/.bashrc
and to their surprise they see:
# file: vol/contpool/merge/home/ubuntu/.bashrc
# owner: 1000
# group: 1000
user::rw-
user:4294967295:rwx
group::r--
mask::rwx
other::r--
indicating the uid wasn't correctly translated according to the idmapped
mount. The problem is how we currently translate POSIX ACLs. Let's inspect
the callchain in this example:
idmapped mount /vol/contpool/merge: 0:10000000:65536
caller's idmapping: 0:10000000:65536
overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */
sys_getxattr()
-> path_getxattr()
-> getxattr()
-> do_getxattr()
|> vfs_getxattr()
| -> __vfs_getxattr()
| -> handler->get == ovl_posix_acl_xattr_get()
| -> ovl_xattr_get()
| -> vfs_getxattr()
| -> __vfs_getxattr()
| -> handler->get() /* lower filesystem callback */
|> posix_acl_fix_xattr_to_user()
{
4 = make_kuid(&init_user_ns, 4);
4 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 4);
/* FAILURE */
-1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4);
}
If the user chooses to use option (2) and mounts overlayfs on top of
idmapped mounts inside the container things don't look that much better:
idmapped mount /vol/contpool/merge: 0:10000000:65536
caller's idmapping: 0:10000000:65536
overlayfs idmapping (ofs->creator_cred): 0:10000000:65536
sys_getxattr()
-> path_getxattr()
-> getxattr()
-> do_getxattr()
|> vfs_getxattr()
| -> __vfs_getxattr()
| -> handler->get == ovl_posix_acl_xattr_get()
| -> ovl_xattr_get()
| -> vfs_getxattr()
| -> __vfs_getxattr()
| -> handler->get() /* lower filesystem callback */
|> posix_acl_fix_xattr_to_user()
{
4 = make_kuid(&init_user_ns, 4);
4 = mapped_kuid_fs(&init_user_ns, 4);
/* FAILURE */
-1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4);
}
As is easily seen the problem arises because the idmapping of the lower
mount isn't taken into account as all of this happens in do_gexattr(). But
do_getxattr() is always called on an overlayfs mount and inode and thus
cannot possible take the idmapping of the lower layers into account.
This problem is similar for fscaps but there the translation happens as
part of vfs_getxattr() already. Let's walk through an fscaps overlayfs
callchain:
setcap 'cap_net_raw+ep' /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc
The expected outcome here is that we'll receive the cap_net_raw capability
as we are able to map the uid associated with the fscap to 0 within our
container. IOW, we want to see 0 as the result of the idmapping
translations.
If the user chooses option (1) we get the following callchain for fscaps:
idmapped mount /vol/contpool/merge: 0:10000000:65536
caller's idmapping: 0:10000000:65536
overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */
sys_getxattr()
-> path_getxattr()
-> getxattr()
-> do_getxattr()
-> vfs_getxattr()
-> xattr_getsecurity()
-> security_inode_getsecurity() ________________________________
-> cap_inode_getsecurity() | |
{ V |
10000000 = make_kuid(0:0:4k /* overlayfs idmapping */, 10000000); |
10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000); |
/* Expected result is 0 and thus that we own the fscap. */ |
0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000); |
} |
-> vfs_getxattr_alloc() |
-> handler->get == ovl_other_xattr_get() |
-> vfs_getxattr() |
-> xattr_getsecurity() |
-> security_inode_getsecurity() |
-> cap_inode_getsecurity() |
{ |
0 = make_kuid(0:0:4k /* lower s_user_ns */, 0); |
10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0); |
10000000 = from_kuid(0:0:4k /* overlayfs idmapping */, 10000000); |
|____________________________________________________________________|
}
-> vfs_getxattr_alloc()
-> handler->get == /* lower filesystem callback */
And if the user chooses option (2) we get:
idmapped mount /vol/contpool/merge: 0:10000000:65536
caller's idmapping: 0:10000000:65536
overlayfs idmapping (ofs->creator_cred): 0:10000000:65536
sys_getxattr()
-> path_getxattr()
-> getxattr()
-> do_getxattr()
-> vfs_getxattr()
-> xattr_getsecurity()
-> security_inode_getsecurity() _______________________________
-> cap_inode_getsecurity() | |
{ V |
10000000 = make_kuid(0:10000000:65536 /* overlayfs idmapping */, 0); |
10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000); |
/* Expected result is 0 and thus that we own the fscap. */ |
0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000); |
} |
-> vfs_getxattr_alloc() |
-> handler->get == ovl_other_xattr_get() |
|-> vfs_getxattr() |
-> xattr_getsecurity() |
-> security_inode_getsecurity() |
-> cap_inode_getsecurity() |
{ |
0 = make_kuid(0:0:4k /* lower s_user_ns */, 0); |
10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0); |
0 = from_kuid(0:10000000:65536 /* overlayfs idmapping */, 10000000); |
|____________________________________________________________________|
}
-> vfs_getxattr_alloc()
-> handler->get == /* lower filesystem callback */
We can see how the translation happens correctly in those cases as the
conversion happens within the vfs_getxattr() helper.
For POSIX ACLs we need to do something similar. However, in contrast to
fscaps we cannot apply the fix directly to the kernel internal posix acl
data structure as this would alter the cached values and would also require
a rework of how we currently deal with POSIX ACLs in general which almost
never take the filesystem idmapping into account (the noteable exception
being FUSE but even there the implementation is special) and instead
retrieve the raw values based on the initial idmapping.
The correct values are then generated right before returning to
userspace. The fix for this is to move taking the mount's idmapping into
account directly in vfs_getxattr() instead of having it be part of
posix_acl_fix_xattr_to_user().
To this end we simply move the idmapped mount translation into a separate
step performed in vfs_{g,s}etxattr() instead of in
posix_acl_fix_xattr_{from,to}_user().
To see how this fixes things let's go back to the original example. Assume
the user chose option (1) and mounted overlayfs on top of idmapped mounts
on the host:
idmapped mount /vol/contpool/merge: 0:10000000:65536
caller's idmapping: 0:10000000:65536
overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */
sys_getxattr()
-> path_getxattr()
-> getxattr()
-> do_getxattr()
|> vfs_getxattr()
| |> __vfs_getxattr()
| | -> handler->get == ovl_posix_acl_xattr_get()
| | -> ovl_xattr_get()
| | -> vfs_getxattr()
| | |> __vfs_getxattr()
| | | -> handler->get() /* lower filesystem callback */
| | |> posix_acl_getxattr_idmapped_mnt()
| | {
| | 4 = make_kuid(&init_user_ns, 4);
| | 10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4);
| | 10000004 = from_kuid(&init_user_ns, 10000004);
| | |_______________________
| | } |
| | |
| |> posix_acl_getxattr_idmapped_mnt() |
| { |
| V
| 10000004 = make_kuid(&init_user_ns, 10000004);
| 10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004);
| 10000004 = from_kuid(&init_user_ns, 10000004);
| } |_________________________________________________
| |
| |
|> posix_acl_fix_xattr_to_user() |
{ V
10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004);
/* SUCCESS */
4 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000004);
}
And similarly if the user chooses option (1) and mounted overayfs on top of
idmapped mounts inside the container:
idmapped mount /vol/contpool/merge: 0:10000000:65536
caller's idmapping: 0:10000000:65536
overlayfs idmapping (ofs->creator_cred): 0:10000000:65536
sys_getxattr()
-> path_getxattr()
-> getxattr()
-> do_getxattr()
|> vfs_getxattr()
| |> __vfs_getxattr()
| | -> handler->get == ovl_posix_acl_xattr_get()
| | -> ovl_xattr_get()
| | -> vfs_getxattr()
| | |> __vfs_getxattr()
| | | -> handler->get() /* lower filesystem callback */
| | |> posix_acl_getxattr_idmapped_mnt()
| | {
| | 4 = make_kuid(&init_user_ns, 4);
| | 10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4);
| | 10000004 = from_kuid(&init_user_ns, 10000004);
| | |_______________________
| | } |
| | |
| |> posix_acl_getxattr_idmapped_mnt() |
| { V
| 10000004 = make_kuid(&init_user_ns, 10000004);
| 10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004);
| 10000004 = from_kuid(0(&init_user_ns, 10000004);
| |_________________________________________________
| } |
| |
|> posix_acl_fix_xattr_to_user() |
{ V
10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004);
/* SUCCESS */
4 = from_kuid(0:10000000:65536 /* caller's idmappings */, 10000004);
}
The last remaining problem we need to fix here is ovl_get_acl(). During
ovl_permission() overlayfs will call:
ovl_permission()
-> generic_permission()
-> acl_permission_check()
-> check_acl()
-> get_acl()
-> inode->i_op->get_acl() == ovl_get_acl()
> get_acl() /* on the underlying filesystem)
->inode->i_op->get_acl() == /*lower filesystem callback */
-> posix_acl_permission()
passing through the get_acl request to the underlying filesystem. This will
retrieve the acls stored in the lower filesystem without taking the
idmapping of the underlying mount into account as this would mean altering
the cached values for the lower filesystem. The simple solution is to have
ovl_get_acl() simply duplicate the ACLs, update the values according to the
idmapped mount and return it to acl_permission_check() so it can be used in
posix_acl_permission(). Since overlayfs doesn't cache ACLs they'll be
released right after.
Link: https://github.com/brauner/mount-idmapped/issues/9
Cc: Seth Forshee <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Aleksa Sarai <[email protected]>
Cc: [email protected]
Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
Fixes: bc70682a497c ("ovl: support idmapped layers")
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
There are some VM configurations which have Skylake model but do not
support IBPB. In those cases, when using retbleed=ibpb, userspace is going
to be killed and kernel is going to panic.
If the CPU does not support IBPB, warn and proceed with the auto option. Also,
do not fallback to IBPB on AMD/Hygon systems if it is not supported.
Fixes: 3ebc17006888 ("x86/bugs: Add retbleed=ibpb")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
|
|
The IOMMU mailing list has moved to [email protected]
and the old list should bounce by now. Remove it from the
MAINTAINERS file.
Cc: [email protected]
Signed-off-by: Joerg Roedel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.19
- another bogus identifier quirk (Keith Busch)
- use struct group in the tracer to avoid a gcc warning (Keith Busch)"
* tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme:
nvme: use struct group for generic command dwords
nvme-pci: phison e16 has bogus namespace ids
|
|
32 bit sqe->cmd_op is an union with 64 bit values. It's always a good
idea to do padding explicitly. Also zero check it in prep, so it can be
used in the future if needed without compatibility concerns.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/e6b95a05e970af79000435166185e85b196b2ba2.1657202417.git.asml.silence@gmail.com
[axboe: turn bitwise OR into logical variant]
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch ensures that the clock notifier is unregistered
when driver probe is returning error.
Fixes: df8eb5691c48 ("i2c: Add driver for Cadence I2C controller")
Signed-off-by: Satish Nagireddy <[email protected]>
Tested-by: Lars-Peter Clausen <[email protected]>
Reviewed-by: Michal Simek <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull a devfreq fix for 5.19-rc6 from Chanwoo Choi:
"- Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver."
* tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: exynos-bus: Fix NULL pointer dereference
|
|
Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver.
Reported-by: Marek Szyprowski <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Fixes: b5d281f6c16d ("PM / devfreq: Rework freq_table to be local to devfreq struct")
Cc: [email protected]
Signed-off-by: Christian Marangi <[email protected]>
Acked-by: Chanwoo Choi <[email protected]>
Signed-off-by: Chanwoo Choi <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"A fix for tinyconfig build error, a fix for section mismatch warning,
and two cleanups of obsolete code"
* tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Fix section mismatch warning
LoongArch: Fix build errors for tinyconfig
LoongArch: Remove obsolete mentions of vcsr
LoongArch: Drop these obsolete selects in Kconfig
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, netfilter, can, and bluetooth.
Current release - regressions:
- bluetooth: fix deadlock on hci_power_on_sync
Previous releases - regressions:
- sched: act_police: allow 'continue' action offload
- eth: usbnet: fix memory leak in error case
- eth: ibmvnic: properly dispose of all skbs during a failover
Previous releases - always broken:
- bpf:
- fix insufficient bounds propagation from
adjust_scalar_min_max_vals
- clear page contiguity bit when unmapping pool
- netfilter: nft_set_pipapo: release elements in clone from
abort path
- mptcp: netlink: issue MP_PRIO signals from userspace PMs
- can:
- rcar_canfd: fix data transmission failed on R-Car V3U
- gs_usb: gs_usb_open/close(): fix memory leak
Misc:
- add Wenjia as SMC maintainer"
* tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
wireguard: Kconfig: select CRYPTO_CHACHA_S390
crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
wireguard: selftests: use microvm on x86
wireguard: selftests: always call kernel makefile
wireguard: selftests: use virt machine on m68k
wireguard: selftests: set fake real time in init
r8169: fix accessing unset transport header
net: rose: fix UAF bug caused by rose_t0timer_expiry
usbnet: fix memory leak in error case
Revert "tls: rx: move counting TlsDecryptErrors for sync"
mptcp: update MIB_RMSUBFLOW in cmd_sf_destroy
mptcp: fix local endpoint accounting
selftests: mptcp: userspace PM support for MP_PRIO signals
mptcp: netlink: issue MP_PRIO signals from userspace PMs
mptcp: Acquire the subflow socket lock before modifying MP_PRIO flags
mptcp: Avoid acquiring PM lock for subflow priority changes
mptcp: fix locking in mptcp_nl_cmd_sf_destroy()
net/mlx5e: Fix matchall police parameters validation
net/sched: act_police: allow 'continue' action offload
net: lan966x: hardcode the number of external ports
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Tag Intel pin control as supported in MAINTAINERS
- Fix a NULL pointer exception in the Aspeed driver
- Correct some NAND functions in the Sunxi A83T driver
- Use the right offset for some Sunxi pins
- Fix a zero base offset in the Freescale (NXP) i.MX93
- Fix the IRQ support in the STM32 driver
* tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: stm32: fix optional IRQ support to gpios
pinctrl: imx: Add the zero base flag for imx93
pinctrl: sunxi: sunxi_pconf_set: use correct offset
pinctrl: sunxi: a83t: Fix NAND function name for some pins
pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
MAINTAINERS: Update Intel pin control to Supported
|
|
These are indeed "should not happen" situations, but it turns out recent
changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
exists, is pending more testing), and the BUG_ON() makes it
unnecessarily hard to actually debug for no good reason.
It's been that way for a long time, but let's make it clear: BUG_ON() is
not good for debugging, and should never be used in situations where you
could just say "this shouldn't happen, but we can continue".
Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
continue running. Instead of making the system basically unusuable
because you crashed the machine while potentially holding some very core
locks (eg this function is commonly called while holding 'tasklist_lock'
for writing).
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit
ee774dac0da1 ("x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()")
moved PUSH_AND_CLEAR_REGS out of error_entry, into its own function, in
part to avoid calling error_entry() for XenPV.
However, commit
7c81c0c9210c ("x86/entry: Avoid very early RET")
had to change that because the 'ret' was too early and moved it into
idtentry, bloating the text size, since idtentry is expanded for every
exception vector.
However, with the advent of xen_error_entry() in commit
d147553b64bad ("x86/xen: Add UNTRAIN_RET")
it became possible to remove PUSH_AND_CLEAR_REGS from idtentry, back
into *error_entry().
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
|
|
On Tue, Jun 28, 2022 at 04:28:58PM +0800, Pengfei Xu wrote:
> # ./ftracetest
> === Ftrace unit tests ===
> [1] Basic trace file check [PASS]
> [2] Basic test for tracers [PASS]
> [3] Basic trace clock test [PASS]
> [4] Basic event tracing check [PASS]
> [5] Change the ringbuffer size [PASS]
> [6] Snapshot and tracing setting [PASS]
> [7] trace_pipe and trace_marker [PASS]
> [8] Test ftrace direct functions against tracers [UNRESOLVED]
> [9] Test ftrace direct functions against kprobes [UNRESOLVED]
> [10] Generic dynamic event - add/remove eprobe events [FAIL]
> [11] Generic dynamic event - add/remove kprobe events
>
> It 100% reproduced in step 11 and then missing ENDBR BUG generated:
> "
> [ 9332.752836] mmiotrace: enabled CPU7.
> [ 9332.788612] mmiotrace: disabled.
> [ 9337.103426] traps: Missing ENDBR: syscall_regfunc+0x0/0xb0
It turns out that while syscall_regfunc() does have an ENDBR when
generated, it gets sealed by objtool's .ibt_endbr_seal list.
Since the only text references to this function:
$ git grep syscall_regfunc
include/linux/tracepoint.h:extern int syscall_regfunc(void);
include/trace/events/syscalls.h: syscall_regfunc, syscall_unregfunc
include/trace/events/syscalls.h: syscall_regfunc, syscall_unregfunc
kernel/tracepoint.c:int syscall_regfunc(void)
appear in the __tracepoint section which is excluded by objtool.
Fixes: 3c6f9f77e618 ("objtool: Rework ibt and extricate from stack validation")
Reported-by: Pengfei Xu <[email protected]
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Cannon lake is also affected by RETBleed, add it to the list.
Fixes: 6ad0ad2bf8a6 ("x86/bugs: Report Intel retbleed vulnerability")
Signed-off-by: Pawan Gupta <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
|
|
Fix a kernel NULL pointer dereference reported by gpio kselftests.
linereq_free() can be called as part of the cleanup of a failed request,
at which time the desc for a line may not have been determined, so it
is unsafe to dereference without a check.
Add a check prior to dereferencing the line desc.
Fixes: 2068339a6c35 ("gpiolib: cdev: Add hardware timestamp clock type")
Signed-off-by: Kent Gibson <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
init_numa_memory() is annotated __init and not used by any module,
thus don't export it.
Remove not needed EXPORT_SYMBOL for init_numa_memory() to fix the
following section mismatch warning:
MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab+init_numa_memory+0x0): Section mismatch in reference
from the variable __ksymtab_init_numa_memory to the function .init.text:init_numa_memory()
The symbol init_numa_memory is exported and annotated __init
Fix this by removing the __init annotation of init_numa_memory or drop the export.
This is build on Linux 5.19-rc4.
Fixes: d4b6f1562a3c ("LoongArch: Add Non-Uniform Memory Access (NUMA) support")
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
|
|
Building loongarch:tinyconfig fails with the following error.
./arch/loongarch/include/asm/page.h: In function 'pfn_valid':
./arch/loongarch/include/asm/page.h:42:32: error: 'PHYS_OFFSET' undeclared
Add the missing include file and fix succeeding vdso errors.
Fixes: 09cfefb7fa70 ("LoongArch: Add memory management")
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
|
|
The `vcsr` only exists in the old hardware design, it isn't used in any
shipped hardware from Loongson-3A5000 on. Both scalar FP and LSX/LASX
instructions use the `fcsr` as their control and status registers now.
For example, the RM control bit in fcsr0 is shared by FP, LSX and LASX
instructions.
Particularly, fcsr16 to fcsr31 are reserved for LSX/LASX now, access to
these registers has no visible effect if LSX/LASX is enabled, and will
cause SXD/ASXD exceptions if LSX/LASX is not enabled.
So, mentions of vcsr are obsolete in the first place (it was just used
for debugging), let's remove them.
Reviewed-by: WANG Xuerui <[email protected]>
Signed-off-by: Qi Hu <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
|
|
Commit fa96b57c1490 ("LoongArch: Add build infrastructure") adds the new
file arch/loongarch/Kconfig.
As the work on LoongArch was probably quite some time under development,
various config symbols have changed and disappeared from the time of
initial writing of the Kconfig file and its inclusion in the repository.
The following four commits:
commit c126a53c2760 ("arch: remove GENERIC_FIND_FIRST_BIT entirely")
commit 140c8180eb7c ("arch: remove HAVE_COPY_THREAD_TLS")
commit aca52c398389 ("mm: remove CONFIG_HAVE_MEMBLOCK")
commit 3f08a302f533 ("mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option")
remove the mentioned config symbol, and enable the intended setup by
default without configuration.
Drop these obsolete selects in loongarch's Kconfig.
Reviewed-by: WANG Xuerui <[email protected]>
Signed-off-by: Lukas Bulwahn <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
|
|
Use the fbcon_info_from_console() wrapper which was added to kernel
v5.19 with commit 409d6c95f9c6 ("fbcon: Introduce wrapper for console->fb_info lookup").
Signed-off-by: Helge Deller <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
|
|
Verify that the fbdev or drm driver correctly adjusted the virtual
screen sizes. On failure report the failing driver and reject the screen
size change.
Signed-off-by: Helge Deller <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Cc: [email protected] # v5.4+
|
|
Fix small typo which causes the mask for the 'precharge1' setting
to be used with the 'precharge2' value.
Signed-off-by: Ezequiel Garcia <[email protected]>
Acked-by: Javier Martinez Canillas <[email protected]>
Signed-off-by: Javier Martinez Canillas <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We need to prevent that users configure a screen size which is smaller than the
currently selected font size. Otherwise rendering chars on the screen will
access memory outside the graphics memory region.
This patch adds a new function fbcon_modechange_possible() which
implements this check and which later may be extended with other checks
if necessary. The new function is called from the FBIOPUT_VSCREENINFO
ioctl handler in fbmem.c, which will return -EINVAL if userspace asked
for a too small screen size.
Signed-off-by: Helge Deller <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Cc: [email protected] # v5.4+
|
|
Prevent that users set a font size which is bigger than the physical screen.
It's unlikely this may happen (because screens are usually much larger than the
fonts and each font char is limited to 32x32 pixels), but it may happen on
smaller screens/LCD displays.
Signed-off-by: Helge Deller <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Cc: [email protected] # v4.14+
|
|
Need get the new fence when we replace the old one.
Fixes: 047a1b877ed48 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Christian König <[email protected]>
|
|
In vma destruction, the following race may occur:
Thread 1: Thread 2:
i915_vma_destroy();
...
list_del_init(vma->vm_link);
...
mutex_unlock(vma->vm->mutex);
__i915_vm_release();
release_references();
And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.
However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().
v2: Fix a typo in the commit message (Andi Shyti)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count")
Cc: Niranjana Vishwanathapura <[email protected]>
Cc: Matthew Auld <[email protected]>
Signed-off-by: Thomas Hellström <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 1926a6b75954fc1a8b44d10bd0c67db957b78cf7)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The only difference between the ADL S and P GuC FWs is the HWConfig
support. ADL-N does not support HWConfig, so we should use the same
binary as ADL-S, otherwise the GuC might attempt to fetch a config
table that does not exist. ADL-N is internally identified as an ADL-P,
so we need to special-case it in the FW selection code.
Fixes: 7e28d0b26759 ("drm/i915/adl-n: Enable ADL-N platform")
Cc: John Harrison <[email protected]>
Cc: Tejas Upadhyay <[email protected]>
Cc: Anusha Srivatsa <[email protected]>
Cc: Jani Nikula <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 971e4a9781742aaad1587e25fd5582b2dd595ef8)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
If drm_connector_init fails, intel_connector_free will be called to take
care of proper free. So it is necessary to drop the refcount of port
before intel_connector_free.
Fixes: 091a4f91942a ("drm/i915: Handle drm-layer errors in intel_dp_add_mst_connector")
Signed-off-by: Hangyu Hua <[email protected]>
Reviewed-by: José Roberto de Souza <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: José Roberto de Souza <[email protected]>
(cherry picked from commit cea9ed611e85d36a05db52b6457bf584b7d969e2)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Jason A. Donenfeld says:
====================
wireguard patches for 5.19-rc6
1) A few small fixups to the selftests, per usual. Of particular note is
a fix for a test flake that occurred on especially fast systems that
boot in less than a second.
2) An addition during this cycle of some s390 crypto interacted with the
way wireguard selects dependencies, resulting in linker errors
reported by the kernel test robot. So Vladis sent in a patch for
that, which also required a small preparatory fix moving some Kconfig
symbols around.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Select the new implementation of CHACHA20 for S390 when available.
It is faster than the generic software implementation, but also prevents
some linker errors in certain situations.
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/linux-kernel/[email protected]/
Signed-off-by: Vladis Dronov <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Various accelerated software implementation Kconfig values for S390 were
mistakenly placed into drivers/crypto/Kconfig, even though they're
mainly just SIMD code and live in arch/s390/crypto/ like usual. This
gives them the very unusual dependency on CRYPTO_HW, which leads to
problems elsewhere.
This patch fixes the issue by moving the Kconfig values for non-hardware
drivers into the usual place in crypto/Kconfig.
Acked-by: Herbert Xu <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This makes for faster tests, faster compile time, and allows us to ditch
ACPI finally.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
These selftests are used for much more extensive changes than just the
wireguard source files. So always call the kernel's build file, which
will do something or nothing after checking the whole tree, per usual.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This should be a bit more stable hopefully.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Not all platforms have an RTC, and rather than trying to force one into
each, it's much easier to just set a fixed time. This is necessary
because WireGuard's latest handshakes parameter is returned in wallclock
time, and if the system time isn't set, and the system is really fast,
then this returns 0, which trips the test.
Turning this on requires setting CONFIG_COMPAT_32BIT_TIME=y, as musl
doesn't support settimeofday without it.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
66e4c8d95008 ("net: warn if transport header was not set") added
a check that triggers a warning in r8169, see [0].
The commit referenced in the Fixes tag refers to the change from
which the patch applies cleanly, there's nothing wrong with this
commit. It seems the actual issue (not bug, because the warning
is harmless here) was introduced with bdfa4ed68187
("r8169: use Giant Send").
[0] https://bugzilla.kernel.org/show_bug.cgi?id=216157
Fixes: 8d520b4de3ed ("r8169: work around RTL8125 UDP hw bug")
Reported-by: Erhard F. <[email protected]>
Tested-by: Erhard F. <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There are UAF bugs caused by rose_t0timer_expiry(). The
root cause is that del_timer() could not stop the timer
handler that is running and there is no synchronization.
One of the race conditions is shown below:
(thread 1) | (thread 2)
| rose_device_event
| rose_rt_device_down
| rose_remove_neigh
rose_t0timer_expiry | rose_stop_t0timer(rose_neigh)
... | del_timer(&neigh->t0timer)
| kfree(rose_neigh) //[1]FREE
neigh->dce_mode //[2]USE |
The rose_neigh is deallocated in position [1] and use in
position [2].
The crash trace triggered by POC is like below:
BUG: KASAN: use-after-free in expire_timers+0x144/0x320
Write of size 8 at addr ffff888009b19658 by task swapper/0/0
...
Call Trace:
<IRQ>
dump_stack_lvl+0xbf/0xee
print_address_description+0x7b/0x440
print_report+0x101/0x230
? expire_timers+0x144/0x320
kasan_report+0xed/0x120
? expire_timers+0x144/0x320
expire_timers+0x144/0x320
__run_timers+0x3ff/0x4d0
run_timer_softirq+0x41/0x80
__do_softirq+0x233/0x544
...
This patch changes rose_stop_ftimer() and rose_stop_t0timer()
in rose_remove_neigh() to del_timer_sync() in order that the
timer handler could be finished before the resources such as
rose_neigh and so on are deallocated. As a result, the UAF
bugs could be mitigated.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Seems to break hibernation. Disable for now until we can root
cause it.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216119
Acked-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Was dropped when we converted to the generic helpers.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Acked-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
usbnet_write_cmd_async() mixed up which buffers
need to be freed in which error case.
v2: add Fixes tag
v3: fix uninitialized buf pointer
Fixes: 877bd862f32b8 ("usbnet: introduce usbnet 3 command helpers")
Signed-off-by: Oliver Neukum <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This will allow the trace event to know the full size of the data
intended to be copied and silence read overflow checks.
Reported-by: John Garry <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Pull OpenRISC fixes from Stafford Horne:
"Fixups for OpenRISC found during recent testing:
- An OpenRISC irqchip fix to stop acking level interrupts which was
causing issues on SMP platforms
- A comment typo fix in our unwinder code"
* tag 'for-linus' of https://github.com/openrisc/linux:
openrisc: unwinder: Fix grammar issue in comment
irqchip: or1k-pic: Undefine mask_ack for level triggered hardware
|