Age | Commit message (Collapse) | Author | Files | Lines |
|
Fix virtual vs physical address confusion. This does not fix a bug
since virtual and physical address spaces are currently the same.
Reviewed-by: Eric Farman <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Fix virtual vs physical address confusion. This does not fix a bug
since virtual and physical address spaces are currently the same.
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Fix virtual vs physical address confusion. This does not fix a bug
since virtual and physical address spaces are currently the same.
Reviewed-by: Stefan Haberland <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Fix virtual vs physical address confusion. This does not fix a bug
since virtual and physical address spaces are currently the same.
dax_direct_access() should receive a virtual kernel address in kaddr.
Reviewed-by: Heiko Carstens <[email protected]>
Acked-by: Alexander Gordeev <[email protected]>
Signed-off-by: Gerald Schaefer <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Tests with hot-plugging crytpo cards on KVM guests with debug
kernel build revealed an use after free for the load field of
the struct zcrypt_card. The reason was an incorrect reference
handling of the zcrypt card object which could lead to a free
of the zcrypt card object while it was still in use.
This is an example of the slab message:
kernel: 0x00000000885a7512-0x00000000885a7513 @offset=1298. First byte 0x68 instead of 0x6b
kernel: Allocated in zcrypt_card_alloc+0x36/0x70 [zcrypt] age=18046 cpu=3 pid=43
kernel: kmalloc_trace+0x3f2/0x470
kernel: zcrypt_card_alloc+0x36/0x70 [zcrypt]
kernel: zcrypt_cex4_card_probe+0x26/0x380 [zcrypt_cex4]
kernel: ap_device_probe+0x15c/0x290
kernel: really_probe+0xd2/0x468
kernel: driver_probe_device+0x40/0xf0
kernel: __device_attach_driver+0xc0/0x140
kernel: bus_for_each_drv+0x8c/0xd0
kernel: __device_attach+0x114/0x198
kernel: bus_probe_device+0xb4/0xc8
kernel: device_add+0x4d2/0x6e0
kernel: ap_scan_adapter+0x3d0/0x7c0
kernel: ap_scan_bus+0x5a/0x3b0
kernel: ap_scan_bus_wq_callback+0x40/0x60
kernel: process_one_work+0x26e/0x620
kernel: worker_thread+0x21c/0x440
kernel: Freed in zcrypt_card_put+0x54/0x80 [zcrypt] age=9024 cpu=3 pid=43
kernel: kfree+0x37e/0x418
kernel: zcrypt_card_put+0x54/0x80 [zcrypt]
kernel: ap_device_remove+0x4c/0xe0
kernel: device_release_driver_internal+0x1c4/0x270
kernel: bus_remove_device+0x100/0x188
kernel: device_del+0x164/0x3c0
kernel: device_unregister+0x30/0x90
kernel: ap_scan_adapter+0xc8/0x7c0
kernel: ap_scan_bus+0x5a/0x3b0
kernel: ap_scan_bus_wq_callback+0x40/0x60
kernel: process_one_work+0x26e/0x620
kernel: worker_thread+0x21c/0x440
kernel: kthread+0x150/0x168
kernel: __ret_from_fork+0x3c/0x58
kernel: ret_from_fork+0xa/0x30
kernel: Slab 0x00000372022169c0 objects=20 used=18 fp=0x00000000885a7c88 flags=0x3ffff00000000a00(workingset|slab|node=0|zone=1|lastcpupid=0x1ffff)
kernel: Object 0x00000000885a74b8 @offset=1208 fp=0x00000000885a7c88
kernel: Redzone 00000000885a74b0: bb bb bb bb bb bb bb bb ........
kernel: Object 00000000885a74b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
kernel: Object 00000000885a74c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
kernel: Object 00000000885a74d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
kernel: Object 00000000885a74e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
kernel: Object 00000000885a74f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
kernel: Object 00000000885a7508: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 68 4b 6b 6b 6b a5 kkkkkkkkkkhKkkk.
kernel: Redzone 00000000885a7518: bb bb bb bb bb bb bb bb ........
kernel: Padding 00000000885a756c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ
kernel: CPU: 0 PID: 387 Comm: systemd-udevd Not tainted 6.8.0-HF #2
kernel: Hardware name: IBM 3931 A01 704 (KVM/Linux)
kernel: Call Trace:
kernel: [<00000000ca5ab5b8>] dump_stack_lvl+0x90/0x120
kernel: [<00000000c99d78bc>] check_bytes_and_report+0x114/0x140
kernel: [<00000000c99d53cc>] check_object+0x334/0x3f8
kernel: [<00000000c99d820c>] alloc_debug_processing+0xc4/0x1f8
kernel: [<00000000c99d852e>] get_partial_node.part.0+0x1ee/0x3e0
kernel: [<00000000c99d94ec>] ___slab_alloc+0xaf4/0x13c8
kernel: [<00000000c99d9e38>] __slab_alloc.constprop.0+0x78/0xb8
kernel: [<00000000c99dc8dc>] __kmalloc+0x434/0x590
kernel: [<00000000c9b4c0ce>] ext4_htree_store_dirent+0x4e/0x1c0
kernel: [<00000000c9b908a2>] htree_dirblock_to_tree+0x17a/0x3f0
kernel: [<00000000c9b919dc>] ext4_htree_fill_tree+0x134/0x400
kernel: [<00000000c9b4b3d0>] ext4_dx_readdir+0x160/0x2f0
kernel: [<00000000c9b4bedc>] ext4_readdir+0x5f4/0x760
kernel: [<00000000c9a7efc4>] iterate_dir+0xb4/0x280
kernel: [<00000000c9a7f1ea>] __do_sys_getdents64+0x5a/0x120
kernel: [<00000000ca5d6946>] __do_syscall+0x256/0x310
kernel: [<00000000ca5eea10>] system_call+0x70/0x98
kernel: INFO: lockdep is turned off.
kernel: FIX kmalloc-96: Restoring Poison 0x00000000885a7512-0x00000000885a7513=0x6b
kernel: FIX kmalloc-96: Marking all objects used
The fix is simple: Before use of the queue not only the queue object
but also the card object needs to increase it's reference count
with a call to zcrypt_card_get(). Similar after use of the queue
not only the queue but also the card object's reference count is
decreased with zcrypt_card_put().
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Cc: [email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Various virtual vs physical address usage fixes
- Fix error handling in Processor Activity Instrumentation device
driver, and export number of counters with a sysfs file
- Allow for multiple events when Processor Activity Instrumentation
counters are monitored in system wide sampling
- Change multiplier and shift values of the Time-of-Day clock source to
improve steering precision
- Remove a couple of unneeded GFP_DMA flags from allocations
- Disable mmap alignment if randomize_va_space is also disabled, to
avoid a too small heap
- Various changes to allow s390 to be compiled with LLVM=1, since
ld.lld and llvm-objcopy will have proper s390 support witch clang 19
- Add __uninitialized macro to Compiler Attributes. This is helpful
with s390's FPU code where some users have up to 520 byte stack
frames. Clearing such stack frames (if INIT_STACK_ALL_PATTERN or
INIT_STACK_ALL_ZERO is enabled) before they are used contradicts the
intention (performance improvement) of such code sections.
- Convert switch_to() to an out-of-line function, and use the generic
switch_to header file
- Replace the usage of s390's debug feature with pr_debug() calls
within the zcrypt device driver
- Improve hotplug support of the Adjunct Processor device driver
- Improve retry handling in the zcrypt device driver
- Various changes to the in-kernel FPU code:
- Make in-kernel FPU sections preemptible
- Convert various larger inline assemblies and assembler files to
C, mainly by using singe instruction inline assemblies. This
increases readability, but also allows makes it easier to add
proper instrumentation hooks
- Cleanup of the header files
- Provide fast variants of csum_partial() and
csum_partial_copy_nocheck() based on vector instructions
- Introduce and use a lock to synchronize accesses to zpci device data
structures to avoid inconsistent states caused by concurrent accesses
- Compile the kernel without -fPIE. This addresses the following
problems if the kernel is compiled with -fPIE:
- It uses dynamic symbols (.dynsym), for which the linker refuses
to allow more than 64k sections. This can break features which
use '-ffunction-sections' and '-fdata-sections', including
kpatch-build and function granular KASLR
- It unnecessarily uses GOT relocations, adding an extra layer of
indirection for many memory accesses
- Fix shared_cpu_list for CPU private L2 caches, which incorrectly were
reported as globally shared
* tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (117 commits)
s390/tools: handle rela R_390_GOTPCDBL/R_390_GOTOFF64
s390/cache: prevent rebuild of shared_cpu_list
s390/crypto: remove retry loop with sleep from PAES pkey invocation
s390/pkey: improve pkey retry behavior
s390/zcrypt: improve zcrypt retry behavior
s390/zcrypt: introduce retries on in-kernel send CPRB functions
s390/ap: introduce mutex to lock the AP bus scan
s390/ap: rework ap_scan_bus() to return true on config change
s390/ap: clarify AP scan bus related functions and variables
s390/ap: rearm APQNs bindings complete completion
s390/configs: increase number of LOCKDEP_BITS
s390/vfio-ap: handle hardware checkstop state on queue reset operation
s390/pai: change sampling event assignment for PMU device driver
s390/boot: fix minor comment style damages
s390/boot: do not check for zero-termination relocation entry
s390/boot: make type of __vmlinux_relocs_64_start|end consistent
s390/boot: sanitize kaslr_adjust_relocs() function prototype
s390/boot: simplify GOT handling
s390: vmlinux.lds.S: fix .got.plt assertion
s390/boot: workaround current 'llvm-objdump -t -j ...' behavior
...
|
|
Pull block updates from Jens Axboe:
- MD pull requests via Song:
- Cleanup redundant checks (Yu Kuai)
- Remove deprecated headers (Marc Zyngier, Song Liu)
- Concurrency fixes (Li Lingfeng)
- Memory leak fix (Li Nan)
- Refactor raid1 read_balance (Yu Kuai, Paul Luse)
- Clean up and fix for md_ioctl (Li Nan)
- Other small fixes (Gui-Dong Han, Heming Zhao)
- MD atomic limits (Christoph)
- NVMe pull request via Keith:
- RDMA target enhancements (Max)
- Fabrics fixes (Max, Guixin, Hannes)
- Atomic queue_limits usage (Christoph)
- Const use for class_register (Ricardo)
- Identification error handling fixes (Shin'ichiro, Keith)
- Improvement and cleanup for cached request handling (Christoph)
- Moving towards atomic queue limits. Core changes and driver bits so
far (Christoph)
- Fix UAF issues in aoeblk (Chun-Yi)
- Zoned fix and cleanups (Damien)
- s390 dasd cleanups and fixes (Jan, Miroslav)
- Block issue timestamp caching (me)
- noio scope guarding for zoned IO (Johannes)
- block/nvme PI improvements (Kanchan)
- Ability to terminate long running discard loop (Keith)
- bdev revalidation fix (Li)
- Get rid of old nr_queues hack for kdump kernels (Ming)
- Support for async deletion of ublk (Ming)
- Improve IRQ bio recycling (Pavel)
- Factor in CPU capacity for remote vs local completion (Qais)
- Add shared_tags configfs entry for null_blk (Shin'ichiro
- Fix for a regression in page refcounts introduced by the folio
unification (Tony)
- Misc fixes and cleanups (Arnd, Colin, John, Kunwu, Li, Navid,
Ricardo, Roman, Tang, Uwe)
* tag 'for-6.9/block-20240310' of git://git.kernel.dk/linux: (221 commits)
block: partitions: only define function mac_fix_string for CONFIG_PPC_PMAC
block/swim: Convert to platform remove callback returning void
cdrom: gdrom: Convert to platform remove callback returning void
block: remove disk_stack_limits
md: remove mddev->queue
md: don't initialize queue limits
md/raid10: use the atomic queue limit update APIs
md/raid5: use the atomic queue limit update APIs
md/raid1: use the atomic queue limit update APIs
md/raid0: use the atomic queue limit update APIs
md: add queue limit helpers
md: add a mddev_is_dm helper
md: add a mddev_add_trace_msg helper
md: add a mddev_trace_remap helper
bcache: move calculation of stripe_size and io_opt into bcache_device_init
virtio_blk: Do not use disk_set_max_open/active_zones()
aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts
block: move capacity validation to blkpg_do_ioctl()
block: prevent division by zero in blk_rq_stat_sum()
drbd: atomically update queue limits in drbd_reconsider_queue_parameters
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull block handle updates from Christian Brauner:
"Last cycle we changed opening of block devices, and opening a block
device would return a bdev_handle. This allowed us to implement
support for restricting and forbidding writes to mounted block
devices. It was accompanied by converting and adding helpers to
operate on bdev_handles instead of plain block devices.
That was already a good step forward but ultimately it isn't necessary
to have special purpose helpers for opening block devices internally
that return a bdev_handle.
Fundamentally, opening a block device internally should just be
equivalent to opening files. So now all internal opens of block
devices return files just as a userspace open would. Instead of
introducing a separate indirection into bdev_open_by_*() via struct
bdev_handle bdev_file_open_by_*() is made to just return a struct
file. Opening and closing a block device just becomes equivalent to
opening and closing a file.
This all works well because internally we already have a pseudo fs for
block devices and so opening block devices is simple. There's a few
places where we needed to be careful such as during boot when the
kernel is supposed to mount the rootfs directly without init doing it.
Here we need to take care to ensure that we flush out any asynchronous
file close. That's what we already do for opening, unpacking, and
closing the initramfs. So nothing new here.
The equivalence of opening and closing block devices to regular files
is a win in and of itself. But it also has various other advantages.
We can remove struct bdev_handle completely. Various low-level helpers
are now private to the block layer. Other helpers were simply
removable completely.
A follow-up series that is already reviewed build on this and makes it
possible to remove bdev->bd_inode and allows various clean ups of the
buffer head code as well. All places where we stashed a bdev_handle
now just stash a file and use simple accessors to get to the actual
block device which was already the case for bdev_handle"
* tag 'vfs-6.9.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
block: remove bdev_handle completely
block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access
bdev: remove bdev pointer from struct bdev_handle
bdev: make struct bdev_handle private to the block layer
bdev: make bdev_{release, open_by_dev}() private to block layer
bdev: remove bdev_open_by_path()
reiserfs: port block device access to file
ocfs2: port block device access to file
nfs: port block device access to files
jfs: port block device access to file
f2fs: port block device access to files
ext4: port block device access to file
erofs: port device access to file
btrfs: port device access to file
bcachefs: port block device access to file
target: port block device access to file
s390: port block device access to file
nvme: port block device access to file
block2mtd: port device access to files
bcache: port block device access to files
...
|
|
This patch reworks and improves the pkey retry behavior for the
pkey_ep11key2pkey() function. In contrast to the pkey_skey2pkey()
function which is used to trigger a protected key derivation from an
CCA secure data or cipher key the EP11 counterpart function had no
proper retry loop implemented. This patch now introduces code which
acts similar to the retry already done for CCA keys for this function
used for EP11 keys.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
This patch reworks and improves the zcrypt retry behavior:
- The zcrypt_rescan_req counter has been removed. This
counter variable has been increased on some transport
errors and was used as a gatekeeper for AP bus rescans.
- Rework of the zcrypt_process_rescan() function to not
use the above counter variable any more. Instead now
always the ap_bus_force_rescan() function is called
(as this has been improved with a previous patch).
- As the zcrpyt_process_rescan() function is called in
all cprb send functions in case of the first attempt
to send failed with ENODEV now before the next attempt
to send an cprb is started.
- Introduce a define ZCRYPT_WAIT_BINDINGS_COMPLETE_MS
for the amount of milliseconds to have the zcrypt API
wait for AP bindings complete. This amount has been
reduced to 30s (was 60s). Some playing around showed
that 30s is a really fair limit.
The result of the above together with the patches to
improve the AP scan bus functions is that after the
first loop of cprb send retries when the result is a
ENODEV the AP bus scan is always triggered (synchronous).
If the AP bus scan detects changes in the configuration,
all the send functions now retry when the first attempt
was failing with ENODEV in the hope that now a suitable
device has appeared.
About concurrency: The ap_bus_force_rescan() uses a mutex
to ensure only one active AP bus scan is running. Another
caller of this function is blocked as long as the scan is
running but does not cause yet another scan. Instead the
result of the 'other' scan is used. This affects only tasks
which run into an initial ENODEV. Tasks with successful
delivery of cprbs will never invoke the bus scan and thus
never get blocked by the mutex.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
The both functions zcrypt_send_cprb() and zcrypt_send_ep11_cprb()
are used to send CPRBs in-kernel from different sources. For
example the pkey module may call one of the functions in
zcrypt_ep11misc.c to trigger a derive of a protected key from
a secure key blob via an existing crypto card. These both
functions are then the internal API to send the CPRB and
receive the response.
All the ioctl functions to send an CPRB down to the addressed
crypto card use some kind of retry mechanism. When the first
attempt fails with ENODEV, a bus rescan is triggered and a
loop with retries is carried out.
For the both named internal functions there was never any
retry attempt made. This patch now introduces the retry code
even for this both internal functions to have effectively
same behavior on sending an CPRB from an in-kernel source
and sending an CPRB from userspace via ioctl.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Rework the invocations around ap_scan_bus():
- Protect ap_scan_bus() with a mutex to make sure only one
scan at a time is running.
- The workqueue invocation which is triggered by either the
module init or via AP bus scan timer expiration uses this
mutex and if there is already a scan running, the work
is simple aborted (as the job is done by another task).
- The ap_bus_force_rescan() which is invoked by higher level
layers mostly on failures which indicate a bus scan may
help is reworked to call ap_scan_bus() direct instead of
enqueuing work into a system workqueue and waiting for that
to finish. Of course the mutex is respected and in case of
another task already running a bus scan the shortcut of
waiting for this scan to finish and reusing the scan result
is taken.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
The AP scan bus function now returns true if there have
been any config changes detected. This will become
important in a follow up patch which will exploit this
hint for further actions. This also required to have
the AP scan bus timer callback reworked as the function
signature has changed to bool ap_scan_bus(void).
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
This patch tries to clarify the functions and variables
around the AP scan bus job. All these variables and
functions start with ap_scan_bus and are declared in
one place now.
No functional changes in this patch - only renaming and
move of code or declarations.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
The APQN bindings complete completion was used to reflect
that 1st the AP bus initial scan is done and 2nd all the
detected APQNs have been bound to a device driver.
This was a single-shot action. However, as the AP bus
supports hot-plug it may be that new APQNs appear reflected
as new AP queue and card devices which need to be bound
to appropriate device drivers. So the condition that
all existing AP queue devices are bound to device drivers
may go away for a certain time.
This patch now checks during AP bus scan for maybe new AP
devices appearing and does a re-init of the internal completion
variable. So the AP bus function ap_wait_apqn_bindings_complete()
now may block on this condition variable even later after
initial scan is through when new APQNs appear which need to
get bound.
This patch also moves the check for binding complete invocation
from the probe function to the end of the AP bus scan function.
This change also covers some weird scenarios where during a
card hotplug the binding of the card device was sufficient for
binding complete but the queue devices where still in the
process of being discovered.
As of now this change has no impact on existing code. The
behavior change in the now later bindings complete should not
impact any code (and has been tested so far). The only
exploiter is the zcrypt function zcrypt_wait_api_operational()
which only initial calls ap_wait_apqn_bindings_complete().
However, this new behavior of the AP bus wait for APQNs bindings
complete function will be used in a later patch exploiting
this for the zcrypt API layer.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Update vfio_ap_mdev_reset_queue() to handle an unexpected checkstop (hardware error) the
same as the deconfigured case. This prevents unexpected and unhelpful warnings in the
event of a hardware error.
We also stop lying about a queue's reset response code. This was originally done so we
could force vfio_ap_mdev_filter_matrix to pass a deconfigured device through to the guest
for the hotplug scenario. vfio_ap_mdev_filter_matrix is instead modified to allow
passthrough for all queues with reset state normal, deconfigured, or checkstopped. In the
checkstopped case we choose to pass the device through and let the error state be
reflected at the guest level.
Signed-off-by: "Jason J. Herne" <[email protected]>
Reviewed-by: Anthony Krowiak <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Found with git grep 'MODULE_AUTHOR(".*([^)]*@'
Fixed with
sed -i '/MODULE_AUTHOR(".*([^)]*@/{s/ (/ </g;s/)"/>"/;s/)and/> and/}' \
$(git grep -l 'MODULE_AUTHOR(".*([^)]*@')
Also:
in drivers/media/usb/siano/smsusb.c normalise ", INC" to ", Inc";
this is what every other MODULE_AUTHOR for this company says,
and it's what the header says
in drivers/sbus/char/openprom.c normalise a double-spaced separator;
this is clearly copied from the copyright header,
where the names are aligned on consecutive lines thusly:
* Linux/SPARC PROM Configuration Driver
* Copyright (C) 1996 Thomas K. Dyas ([email protected])
* Copyright (C) 1996 Eddie C. Dost ([email protected])
but the authorship branding is single-line
Link: https://lkml.kernel.org/r/mk3geln4azm5binjjlfsgjepow4o73domjv6ajybws3tz22vb3@tarta.nabijaczleweli.xyz
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Pass the constant limits directly to blk_mq_alloc_disk, set the nonrot
flag there as well, and then use the commit API to change the transfer
size and logical block size dependent values.
This relies on the assumption that no I/O can be pending before the
devices moves into the ready state and doesn't need extra freezing
for changes to the queue limits.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Stefan Haberland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Most of the code in setup_blk_queue is shared between all disciplines.
Move it to common code and leave a method to query the maximum number
of transferable blocks, and a flag to indicate discard support.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Stefan Haberland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Reflow dasd_state_basic_to_ready a bit to make it easier to modify.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Stefan Haberland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Fix invalid -EBUSY on ccw_device_start() which can lead to failing
device initialization
- Add missing multiplication by 8 in __iowrite64_copy() to get the
correct byte length before calling zpci_memcpy_toio()
- Various config updates
* tag 's390-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: fix invalid -EBUSY on ccw_device_start
s390: use the correct count for __iowrite64_copy()
s390/configs: update default configurations
s390/configs: enable INIT_STACK_ALL_ZERO in all configurations
s390/configs: provide compat topic configuration target
|
|
In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of dcssblk
dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures.
Considering that s390 is not a data cache aliasing architecture,
and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP
from alloc_dax() should make dcssblk_add_store() fail.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Acked-by: Heiko Carstens <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Mikulas Patocka <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Sclafani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The s390 common I/O layer (CIO) returns an unexpected -EBUSY return code
when drivers try to start I/O while a path-verification (PV) process is
pending. This can lead to failed device initialization attempts with
symptoms like broken network connectivity after boot.
Fix this by replacing the -EBUSY return code with a deferred condition
code 1 reply to make path-verification handling consistent from a
driver's point of view.
The problem can be reproduced semi-regularly using the following process,
while repeating steps 2-3 as necessary (example assumes an OSA device
with bus-IDs 0.0.a000-0.0.a002 on CHPID 0.02):
1. echo 0.0.a000,0.0.a001,0.0.a002 >/sys/bus/ccwgroup/drivers/qeth/group
2. echo 0 > /sys/bus/ccwgroup/devices/0.0.a000/online
3. echo 1 > /sys/bus/ccwgroup/devices/0.0.a000/online ; \
echo on > /sys/devices/css0/chp0.02/status
Background information:
The common I/O layer starts path-verification I/Os when it receives
indications about changes in a device path's availability. This occurs
for example when hardware events indicate a change in channel-path
status, or when a manual operation such as a CHPID vary or configure
operation is performed.
If a driver attempts to start I/O while a PV is running, CIO reports a
successful I/O start (ccw_device_start() return code 0). Then, after
completion of PV, CIO synthesizes an interrupt response that indicates
an asynchronous status condition that prevented the start of the I/O
(deferred condition code 1).
If a PV indication arrives while a device is busy with driver-owned I/O,
PV is delayed until after I/O completion was reported to the driver's
interrupt handler. To ensure that PV can be started eventually, CIO
reports a device busy condition (ccw_device_start() return code -EBUSY)
if a driver tries to start another I/O while PV is pending.
In some cases this -EBUSY return code causes device drivers to consider
a device not operational, resulting in failed device initialization.
Note: The code that introduced the problem was added in 2003. Symptoms
started appearing with the following CIO commit that causes a PV
indication when a device is removed from the cio_ignore list after the
associated parent subchannel device was probed, but before online
processing of the CCW device has started:
2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")
During boot, the cio_ignore list is modified by the cio_ignore dracut
module [1] as well as Linux vendor-specific systemd service scripts[2].
When combined, this commit and boot scripts cause a frequent occurrence
of the problem during boot.
[1] https://github.com/dracutdevs/dracut/tree/master/modules.d/81cio_ignore
[2] https://github.com/SUSE/s390-tools/blob/master/cio_ignore.service
Cc: [email protected] # v5.15+
Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")
Tested-By: Thorsten Winkler <[email protected]>
Reviewed-by: Thorsten Winkler <[email protected]>
Signed-off-by: Peter Oberparleiter <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
MEM_PREPARE_ONLINE memory notifier makes memory block physical
accessible via sclp assign command. The notifier ensures self-contained
memory maps are accessible and hence enabling the "memmap on memory" on
s390.
MEM_FINISH_OFFLINE memory notifier shifts the memory block to an
inaccessible state via sclp unassign command.
Implementation considerations:
* When MHP_MEMMAP_ON_MEMORY is disabled, the system retains the old
behavior. This means the memory map is allocated from default memory.
* If MACHINE_HAS_EDAT1 is unavailable, MHP_MEMMAP_ON_MEMORY is
automatically disabled. This ensures that vmemmap pagetables do not
consume additional memory from the default memory allocator.
* The MEM_GOING_ONLINE notifier has been modified to perform no
operation, as MEM_PREPARE_ONLINE already executes the sclp assign
command.
* The MEM_CANCEL_ONLINE/MEM_OFFLINE notifier now performs no operation, as
MEM_FINISH_OFFLINE already executes the sclp unassign command.
Link: https://lkml.kernel.org/r/[email protected]
Reviewed-by: Gerald Schaefer <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Sumanth Korikkar <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Remove memory notifier types which are unhandled by s390. Unhandled
memory notifier types are covered by default case.
Link: https://lkml.kernel.org/r/[email protected]
Suggested-by: Alexander Gordeev <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Sumanth Korikkar <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The ap_bus is using inline functions of the ultravisor (uv) in-kernel
API. The related header file is implicitly included via several other
headers. Replace this by an explicit include of the ultravisor header
in the ap_bus file.
Signed-off-by: Holger Dengler <[email protected]>
Reviewed-by: Harald Freudenberger <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Pass the few limits scm_block imposes directly to blk_mq_alloc_disk
instead of setting them one at a time.
Signed-off-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Pass the queue limits directly to blk_alloc_disk instead of setting them
one at a time.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Pass a queue_limits to blk_alloc_disk and apply it if non-NULL. This
will allow allocating queues with valid queue limits instead of setting
the values one at a time later.
Also change blk_alloc_disk to return an ERR_PTR instead of just NULL
which can't distinguish errors.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch introduces dynamic debug hexdump invocation
possibilities to be able to:
- dump an CCA or EP11 CPRB request as early as possible
when received via ioctl from userspace but after the
ap message has been collected together.
- dump an CCA or EP11 CPRB reply short before it is
transferred via ioctl into userspace.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
This patch introduces two dynamic debug hexdump
invocation possibilities to be able to a) dump an
AP message immediately before it goes into the
firmware queue and b) dump a fresh from the
firmware queue received AP message.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
This patch replaces all the s390 debug feature calls with
debug level by dynamic debug calls pr_debug. These calls
are much more flexible and each single invocation can get
enabled/disabled at runtime wheres the s390 debug feature
debug calls have only one knob - enable or disable all in
one bunch.
This patch follows a similar change for the AP bus and
zcrypt device driver code. All this code uses dynamic
debugging with pr_debug and friends for emitting debug
traces now.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Cleanup and harmonize the s390 debug feature calls
and defines for the pkey module to be similar to
the debug feature as it is used in the zcrypt device
driver and AP bus.
More or less only renaming but no functional changes.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
This patch replaces all the s390 debug feature calls with
debug level by dynamic debug calls pr_debug. These calls
are much more flexible and each single invocation can get
enabled/disabled at runtime wheres the s390 debug feature
debug calls have only one knob - enable or disable all in
one bunch. The benefit is especially significant with
high frequency called functions like the AP bus scan. In
most debugging scenarios you don't want and need them, but
sometimes it is crucial to know exactly when and how long
the AP bus scan took.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
This patch harmonizes the calls and defines around the
s390 debug feature as it is used in the AP bus and
zcrypt device driver code.
More or less cleanup and renaming, no functional changes.
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Pass a queue_limits to blk_mq_alloc_disk and apply it if non-NULL. This
will allow allocating queues with valid queue limits instead of setting
the values one at a time later.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Move the switch_to() implementation to process.c and use the generic
switch_to.h header file instead, like some other architectures.
This addresses also the oddity that the old switch_to() implementation
assigns the return value of __switch_to() to 'prev' instead of 'last',
like it should.
Remove also all includes of switch_to.h from C files, except process.c.
Acked-by: Alexander Gordeev <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Once the discipline is associated with the device, deleting the device
takes care of decrementing the module's refcount. Doing it manually on
this error path causes refcount to artificially decrease on each error
while it should just stay the same.
Fixes: c020d722b110 ("s390/dasd: fix panic during offline processing")
Signed-off-by: Miroslav Franc <[email protected]>
Signed-off-by: Jan Höppner <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Some ERP errors still share the same message format and only add
different reason codes to it. These reason codes don't have any meaning
anymore.
Make the individual error messages more explicit and remove the reason
codes altogether. Comments around the error messages are also removed as
they provide no additional value anymore with more explicit messages.
Signed-off-by: Jan Höppner <[email protected]>
Reviewed-by: Stefan Haberland <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the matrix_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: "Ricardo B. Marliere" <[email protected]>
Acked-by: Halil Pasic <[email protected]>
Reviewed-by: Anthony Krowiak <[email protected]>
Reviewed-by: "Jason J. Herne" <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the ap_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: "Ricardo B. Marliere" <[email protected]>
Reviewed-by: Harald Freudenberger <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the scm_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: "Ricardo B. Marliere" <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the ccw_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: "Ricardo B. Marliere" <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the css_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: "Ricardo B. Marliere" <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the ccwgroup_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: "Ricardo B. Marliere" <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
The measurement block origin address is an absolute address; therefore
add a missing virt_to_phys() translation to the cmf_activate() inline
assembly.
This doesn't fix a bug, since virtual and physical addresses are
currently identical.
Acked-by: Alexander Gordeev <[email protected]>
Acked-by: Vineeth Vijayan <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
The address of the measurement block can be anywhere in 64 bit
absolute space. See description of the schm instruction in the
Principles of Operation.
Therefore remove the GFP_DMA flag when allocating the block.
Acked-by: Alexander Gordeev <[email protected]>
Reviewed-by: Vineeth Vijayan <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Remove GFP_DMA flag when allocating memory to be used for CHSC control
blocks. The CHSC instruction can access memory beyond the DMA zone.
Suggested-by: Heiko Carstens <[email protected]>
Reviewed-by: Vineeth Vijayan <[email protected]>
Signed-off-by: Peter Oberparleiter <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Add missing virt_to_phys() / phys_to_virt() translation to
alloc_chan_prog() and free_chan_prog().
This doesn't fix a bug since virtual and physical addresses
are currently the same.
Signed-off-by: Thomas Richter <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|