aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-11virtio_ring: packed: extract the logic of vring initXuan Zhuo1-11/+17
Separate the logic of initializing vring, and subsequent patches will call it separately. This function completes the variable initialization of packed vring. It together with the logic of atatch constitutes the initialization of vring. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: packed: extract the logic of alloc state and extraXuan Zhuo1-14/+34
Separate the logic for alloc desc_state and desc_extra, which will be called separately by subsequent patches. Use struct vring_packed to pass desc_state, desc_extra. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: packed: extract the logic of alloc queueXuan Zhuo1-29/+51
Separate the logic of packed to create vring queue. This feature is required for subsequent virtuqueue reset vring. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: packed: introduce vring_free_packedXuan Zhuo1-0/+22
Free the structure struct vring_vritqueue_packed. Subsequent patches require it. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: introduce virtqueue_resize_split()Xuan Zhuo1-0/+34
virtio ring split supports resize. Only after the new vring is successfully allocated based on the new num, we will release the old vring. In any case, an error is returned, indicating that the vring still points to the old vring. In the case of an error, re-initialize(virtqueue_reinit_split()) the virtqueue to ensure that the vring can be used. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: reserve vring_align, may_reduce_numXuan Zhuo1-0/+10
In vring_alloc_queue_split() save vring_align, may_reduce_num to structure vring_virtqueue_split. Used to create a new vring when implementing resize. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: introduce virtqueue_reinit_split()Xuan Zhuo1-0/+23
Introduce a function to initialize vq without allocating new ring, desc_state, desc_extra. Subsequent patches will call this function after reset vq to reinitialize vq. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: extract the logic of attach vringXuan Zhuo1-13/+10
Separate the logic of attach vring, subsequent patches will call it separately. virtqueue_vring_init_split() completes the initialization of other variables of vring split. We can directly use vq->split = *vring_split to complete attach. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: extract the logic of vring initXuan Zhuo1-10/+21
Separate the logic of initializing vring, and subsequent patches will call it separately. This function completes the variable initialization of split vring. It together with the logic of atatch constitutes the initialization of vring. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: extract the logic of alloc state and extraXuan Zhuo1-16/+36
Separate the logic of creating desc_state, desc_extra, and subsequent patches will call it independently. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: extract the logic of alloc queueXuan Zhuo1-25/+40
Separate the logic of split to create vring queue. This feature is required for subsequent virtuqueue reset vring. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: introduce vring_free_split()Xuan Zhuo1-0/+11
Free the structure struct vring_vritqueue_split. Subsequent patches require it. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: __vring_new_virtqueue() accept struct vring_virtqueue_splitXuan Zhuo1-14/+15
__vring_new_virtqueue() instead accepts struct vring_virtqueue_split. The purpose of this is to pass more information into __vring_new_virtqueue() to make the code simpler and the structure cleaner. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split: stop __vring_new_virtqueue as export symbolXuan Zhuo3-21/+18
There is currently only one place to reference __vring_new_virtqueue() directly from the outside of virtio core. And here vring_new_virtqueue() can be used instead. Subsequent patches will modify __vring_new_virtqueue, so stop it as an export symbol for now. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: introduce virtqueue_init()Xuan Zhuo1-16/+22
Separate the logic of virtqueue initialization. These variables should be reset during reset. This logic can be called independently when implementing resize/reset later. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: split vring_virtqueueXuan Zhuo1-56/+60
Separate the two inline structures(split and packed) from the structure vring_virtqueue. In this way, we can use these two structures later to pass parameters and retain temporary variables. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: extract the logic of freeing vringXuan Zhuo1-5/+13
Introduce vring_free() to free the vring of vq. Subsequent patches will use vring_free() alone. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: update the document of the virtqueue_detach_unused_buf for ↵Xuan Zhuo1-2/+2
queue reset Added documentation for virtqueue_detach_unused_buf, allowing it to be called on queue reset. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio: struct virtio_config_ops add callbacks for queue_resetXuan Zhuo1-0/+14
reset can be divided into the following four steps (example): 1. transport: notify the device to reset the queue 2. vring: recycle the buffer submitted 3. vring: reset/resize the vring (may re-alloc) 4. transport: mmap vring to device, and enable the queue In order to support queue reset, add two callbacks in struct virtio_config_ops to implement steps 1 and 4. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio: record the maximum queue num supported by the device.Xuan Zhuo9-0/+18
virtio-net can display the maximum (supported by hardware) ring size in ethtool -g eth0. When the subsequent patch implements vring reset, it can judge whether the ring size passed by the driver is legal based on this. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11drivers/virtio: Clarify CONFIG_VIRTIO_MEM for unsupported architecturesDavid Hildenbrand1-3/+5
Let's make it clearer that simply unlocking CONFIG_VIRTIO_MEM on an architecture is most probably not sufficient to have it working as expected. Cc: "Michael S. Tsirkin" <[email protected]> Cc: Jason Wang <[email protected]> Cc: Gavin Shan <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_mmio: add support to set IRQ of a virtio device as wakeup sourceMinghao Xue1-0/+3
According to virtio_mmio wakeup flag in device trees, set its IRQ as wakeup source in virtqueue initialization. Signed-off-by: Minghao Xue <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11dt-bindings: virtio: mmio: add optional wakeup-source propertyMinghao Xue1-0/+4
Some systems want to set the interrupt of virtio_mmio device as a wakeup source. On such systems, we'll use the existence of the "wakeup-source" property as a signal of requirement. Signed-off-by: Minghao Xue <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11vdpa: Use device_iommu_capable()Robin Murphy1-1/+1
Use the new interface to check the capability for our device specifically. Signed-off-by: Robin Murphy <[email protected]> Message-Id: <548e316fa282ce513fabb991a4c4d92258062eb5.1654688822.git.robin.murphy@arm.com> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-08-11virtio: VIRTIO_HARDEN_NOTIFICATION is brokenMichael S. Tsirkin1-1/+2
This option doesn't really work and breaks too many drivers. Not yet sure what's the right thing to do, for now let's make sure randconfig isn't broken by this. Fixes: c346dae4f3fb ("virtio: disable notification hardening by default") Cc: "Jason Wang" <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-08-11virtio_pmem: set device ready in probe()Jason Wang1-0/+7
The NVDIMM region could be available before the virtio_device_ready() that is called by virtio_dev_probe(). This means the driver tries to use device before DRIVER_OK which violates the spec, fixing this by set device ready before the nvdimm_pmem_region_create(). Note that this means the virtio_pmem_host_ack() could be triggered before the creation of the nd region, this is safe since the pmem_lock has been initialized and whether or not any available buffer is added before is validated by virtio_pmem_host_ack(). Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver") Acked-by: Pankaj Gupta <[email protected]> Signed-off-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_pmem: initialize provider_data through nd_region_descJason Wang1-1/+1
We used to initialize the provider_data manually after nvdimm_pemm_region_create(). This seems to be racy if the flush is issued before the initialization of provider_data[1]. Fixing this by initializing the provider_data through nd_region_desc to make sure the provider_data is ready after the pmem is created. [1]: [ 80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes [ 92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318 [ 92.394551] #PF: supervisor read access in kernel mode [ 92.394955] #PF: error_code(0x0000) - not-present page [ 92.395365] PGD 0 P4D 0 [ 92.395566] Oops: 0000 [#1] PREEMPT SMP PTI [ 92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453 [ 92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0 [ 92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c 00 00 [ 92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246 [ 92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000 [ 92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00 [ 92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000 [ 92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28 [ 92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0 [ 92.401849] FS: 00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000) knlGS:0000000000000000 [ 92.402423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0 [ 92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 92.404278] Call Trace: [ 92.404481] <TASK> [ 92.404654] ? mempool_alloc+0x5d/0x160 [ 92.404939] ? terminate_walk+0x5f/0xf0 [ 92.405226] ? bio_alloc_bioset+0xbb/0x3f0 [ 92.405525] async_pmem_flush+0x17/0x80 [ 92.405806] nvdimm_flush+0x11/0x30 [ 92.406067] pmem_submit_bio+0x1e9/0x200 [ 92.406354] __submit_bio+0x80/0x120 [ 92.406621] submit_bio_noacct_nocheck+0xdc/0x2a0 [ 92.406958] submit_bio_wait+0x4e/0x80 [ 92.407234] blkdev_issue_flush+0x31/0x50 [ 92.407526] ? punt_bios_to_rescuer+0x230/0x230 [ 92.407852] blkdev_fsync+0x1e/0x30 [ 92.408112] do_fsync+0x33/0x70 [ 92.408354] __x64_sys_fsync+0xb/0x10 [ 92.408625] do_syscall_64+0x43/0x90 [ 92.408895] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 92.409257] RIP: 0033:0x7f3cd76c6c44 Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver") Acked-by: Pankaj Gupta <[email protected]> Reviewed-by: Dan Williams <[email protected]> Signed-off-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11vringh: iterate on iotlb_translate to handle large translationsStefano Garzarella1-22/+56
iotlb_translate() can return -ENOBUFS if the bio_vec is not big enough to contain all the ranges for translation. This can happen for example if the VMM maps a large bounce buffer, without using hugepages, that requires more than 16 ranges to translate the addresses. To handle this case, let's extend iotlb_translate() to also return the number of bytes successfully translated. In copy_from_iotlb()/copy_to_iotlb() loops by calling iotlb_translate() several times until we complete the translation. Signed-off-by: Stefano Garzarella <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11virtio_ring: remove the arg vq of vring_alloc_desc_extra()Xuan Zhuo1-4/+3
The parameter vq of vring_alloc_desc_extra() is useless. This patch removes this parameter. Subsequent patches will call this function to avoid passing useless arguments. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-11remoteproc: rename len of rpoc_vring to numXuan Zhuo3-9/+9
Rename the member len in the structure rpoc_vring to num. And remove 'in bytes' from the comment of it. This is misleading. Because this actually refers to the size of the virtio vring to be created. The unit is not bytes. Signed-off-by: Xuan Zhuo <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-08-10bpf: Shut up kern_sys_bpf warning.Alexei Starovoitov1-0/+8
Shut up this warning: kernel/bpf/syscall.c:5089:5: warning: no previous prototype for function 'kern_sys_bpf' [-Wmissing-prototypes] int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size) Reported-by: Jakub Kicinski <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2022-08-11KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability tableBagas Sanjaya1-4/+4
There is unexpected warning on KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table, which cause the table to be rendered as paragraph text instead. The warning is due to missing colon at capability name and returns keyword, as well as improper alignment on multi-line returns field. Fix the warning by adding missing colons and aligning the field. Link: https://lore.kernel.org/lkml/[email protected]/ Fixes: 084cc29f8bbb03 ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis") Reported-by: Stephen Rothwell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: David Matlack <[email protected]> Cc: Ben Gardon <[email protected]> Cc: Peter Xu <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Bagas Sanjaya <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-08-11Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underlineBagas Sanjaya1-1/+1
Extend heading underline for KVM_CAP_VM_DISABLE_NX_HUGE_PAGE to match the heading text length. Link: https://lore.kernel.org/lkml/[email protected]/ Fixes: 084cc29f8bbb03 ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis") Reported-by: Stephen Rothwell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: David Matlack <[email protected]> Cc: Ben Gardon <[email protected]> Cc: Peter Xu <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Bagas Sanjaya <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-08-10net/tls: Use RCU API to access tls_ctx->netdevMaxim Mikityanskiy6-15/+54
Currently, tls_device_down synchronizes with tls_device_resync_rx using RCU, however, the pointer to netdev is stored using WRITE_ONCE and loaded using READ_ONCE. Although such approach is technically correct (rcu_dereference is essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store NULL), using special RCU helpers for pointers is more valid, as it includes additional checks and might change the implementation transparently to the callers. Mark the netdev pointer as __rcu and use the correct RCU helpers to access it. For non-concurrent access pass the right conditions that guarantee safe access (locks taken, refcount value). Also use the correct helper in mlx5e, where even READ_ONCE was missing. The transition to RCU exposes existing issues, fixed by this commit: 1. bond_tls_device_xmit could read netdev twice, and it could become NULL the second time, after the NULL check passed. 2. Drivers shouldn't stop processing the last packet if tls_device_down just set netdev to NULL, before tls_dev_del was called. This prevents a possible packet drop when transitioning to the fallback software mode. Fixes: 89df6a810470 ("net/bonding: Implement TLS TX device offload") Fixes: c55dcdd435aa ("net/tls: Fix use-after-free after the TLS device goes down and up") Signed-off-by: Maxim Mikityanskiy <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10tls: rx: device: don't try to copy too much on detachJakub Kicinski1-1/+1
Another device offload bug, we use the length of the output skb as an indication of how much data to copy. But that skb is sized to offset + record length, and we start from offset. So we end up double-counting the offset which leads to skb_copy_bits() returning -EFAULT. Reported-by: Tariq Toukan <[email protected]> Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Tested-by: Ran Rozenstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10tls: rx: device: bound the frag walkJakub Kicinski1-1/+7
We can't do skb_walk_frags() on the input skbs, because the input skbs is really just a pointer to the tcp read queue. We need to bound the "is decrypted" check by the amount of data in the message. Note that the walk in tls_device_reencrypt() is after a CoW so the skb there is safe to walk. Actually in the current implementation it can't have frags at all, but whatever, maybe one day it will. Reported-by: Tariq Toukan <[email protected]> Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Tested-by: Ran Rozenstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10net_sched: cls_route: remove from list when handle is 0Thadeu Lima de Souza Cascardo1-1/+1
When a route filter is replaced and the old filter has a 0 handle, the old one won't be removed from the hashtable, while it will still be freed. The test was there since before commit 1109c00547fc ("net: sched: RCU cls_route"), when a new filter was not allocated when there was an old one. The old filter was reused and the reinserting would only be necessary if an old filter was replaced. That was still wrong for the same case where the old handle was 0. Remove the old filter from the list independently from its handle value. This fixes CVE-2022-2588, also reported as ZDI-CAN-17440. Reported-by: Zhenpeng Lin <[email protected]> Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Reviewed-by: Kamal Mostafa <[email protected]> Cc: <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-11ALSA: hda: Fix crash due to jack poll in suspendMohan Kumar1-5/+9
With jackpoll_in_suspend flag set, there is a possibility that jack poll worker thread will run even after system suspend was completed. Any register access after system pm callback flow will result in kernel crash as still jack poll worker thread tries to access registers. To fix the crash issue during system flow, cancel the jack poll worker thread during system pm prepare callback and cancel the worker thread at start of runtime suspend callback and re-schedule at last to avoid any unwarranted access of register by worker thread during suspend flow. Signed-off-by: Mohan Kumar <[email protected]> Fixes: b33115bd05af ("ALSA: hda: Jack detection poll in suspend state") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-08-11ALSA: hda/cirrus - support for iMac 12,1 modelAllen Ballway1-0/+1
The 12,1 model requires the same configuration as the 12,2 model to enable headphones but has a different codec SSID. Adds 12,1 SSID for matching quirk. [ re-sorted in SSID order by tiwai ] Signed-off-by: Allen Ballway <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/20220810152701.1.I902c2e591bbf8de9acb649d1322fa1f291849266@changeid Signed-off-by: Takashi Iwai <[email protected]>
2022-08-10selftests: forwarding: Fix failing tests with old libnetIdo Schimmel3-24/+48
The custom multipath hash tests use mausezahn in order to test how changes in various packet fields affect the packet distribution across the available nexthops. The tool uses the libnet library for various low-level packet construction and injection. The library started using the "SO_BINDTODEVICE" socket option for IPv6 sockets in version 1.1.6 and for IPv4 sockets in version 1.2. When the option is not set, packets are not routed according to the table associated with the VRF master device and tests fail. Fix this by prefixing the command with "ip vrf exec", which will cause the route lookup to occur in the VRF routing table. This makes the tests pass regardless of the libnet library version. Fixes: 511e8db54036 ("selftests: forwarding: Add test for custom multipath hash") Fixes: 185b0c190bb6 ("selftests: forwarding: Add test for custom multipath hash with IPv4 GRE") Fixes: b7715acba4d3 ("selftests: forwarding: Add test for custom multipath hash with IPv6 GRE") Reported-by: Ivan Vecera <[email protected]> Tested-by: Ivan Vecera <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Amit Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski19-35/+424
Daniel Borkmann says: ==================== bpf 2022-08-10 We've added 23 non-merge commits during the last 7 day(s) which contain a total of 19 files changed, 424 insertions(+), 35 deletions(-). The main changes are: 1) Several fixes for BPF map iterator such as UAFs along with selftests, from Hou Tao. 2) Fix BPF syscall program's {copy,strncpy}_from_bpfptr() to not fault, from Jinghao Jia. 3) Reject BPF syscall programs calling BPF_PROG_RUN, from Alexei Starovoitov and YiFei Zhu. 4) Fix attach_btf_obj_id info to pick proper target BTF, from Stanislav Fomichev. 5) BPF design Q/A doc update to clarify what is not stable ABI, from Paul E. McKenney. 6) Fix BPF map's prealloc_lru_pop to not reinitialize, from Kumar Kartikeya Dwivedi. 7) Fix bpf_trampoline_put to avoid leaking ftrace hash, from Jiri Olsa. 8) Fix arm64 JIT to address sparse errors around BPF trampoline, from Xu Kuohai. 9) Fix arm64 JIT to use kvcalloc instead of kcalloc for internal program address offset buffer, from Aijun Sun. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits) selftests/bpf: Ensure sleepable program is rejected by hash map iter selftests/bpf: Add write tests for sk local storage map iterator selftests/bpf: Add tests for reading a dangling map iter fd bpf: Only allow sleepable program for resched-able iterator bpf: Check the validity of max_rdwr_access for sock local storage map iterator bpf: Acquire map uref in .init_seq_private for sock{map,hash} iterator bpf: Acquire map uref in .init_seq_private for sock local storage map iterator bpf: Acquire map uref in .init_seq_private for hash map iterator bpf: Acquire map uref in .init_seq_private for array map iterator bpf: Disallow bpf programs call prog_run command. bpf, arm64: Fix bpf trampoline instruction endianness selftests/bpf: Add test for prealloc_lru_pop bug bpf: Don't reinit map value in prealloc_lru_pop bpf: Allow calling bpf_prog_test kfuncs in tracing programs bpf, arm64: Allocate program buffer using kvcalloc instead of kcalloc selftests/bpf: Excercise bpf_obj_get_info_by_fd for bpf2bpf bpf: Use proper target btf when exporting attach_btf_obj_id mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled bpf: Cleanup ftrace hash in bpf_trampoline_put BPF: Fix potential bad pointer dereference in bpf_sys_bpf() ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10Merge branch 'net-enhancements-to-sk_user_data-field'Jakub Kicinski4-28/+56
Hawkins Jiawei says: ==================== net: enhancements to sk_user_data field This patchset fixes refcount bug by adding SK_USER_DATA_PSOCK flag bit in sk_user_data field. The bug cause following info: WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19 Modules linked in: CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0 <TASK> __refcount_add_not_zero include/linux/refcount.h:163 [inline] __refcount_inc_not_zero include/linux/refcount.h:227 [inline] refcount_inc_not_zero include/linux/refcount.h:245 [inline] sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682 sk_backlog_rcv include/net/sock.h:1061 [inline] __release_sock+0x134/0x3b0 net/core/sock.c:2849 release_sock+0x54/0x1b0 net/core/sock.c:3404 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909 __sys_shutdown_sock net/socket.c:2331 [inline] __sys_shutdown_sock net/socket.c:2325 [inline] __sys_shutdown+0xf1/0x1b0 net/socket.c:2343 __do_sys_shutdown net/socket.c:2351 [inline] __se_sys_shutdown net/socket.c:2349 [inline] __x64_sys_shutdown+0x50/0x70 net/socket.c:2349 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> To improve code maintainability, this patchset refactors sk_user_data flags code to be more generic. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10net: refactor bpf_sk_reuseport_detach()Hawkins Jiawei1-6/+3
Refactor sk_user_data dereference using more generic function __rcu_dereference_sk_user_data_with_flags(), which improve its maintainability Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Hawkins Jiawei <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10net: fix refcount bug in sk_psock_get (2)Hawkins Jiawei3-22/+53
Syzkaller reports refcount bug as follows: ------------[ cut here ]------------ refcount_t: saturated; leaking memory. WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19 Modules linked in: CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0 <TASK> __refcount_add_not_zero include/linux/refcount.h:163 [inline] __refcount_inc_not_zero include/linux/refcount.h:227 [inline] refcount_inc_not_zero include/linux/refcount.h:245 [inline] sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682 sk_backlog_rcv include/net/sock.h:1061 [inline] __release_sock+0x134/0x3b0 net/core/sock.c:2849 release_sock+0x54/0x1b0 net/core/sock.c:3404 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909 __sys_shutdown_sock net/socket.c:2331 [inline] __sys_shutdown_sock net/socket.c:2325 [inline] __sys_shutdown+0xf1/0x1b0 net/socket.c:2343 __do_sys_shutdown net/socket.c:2351 [inline] __se_sys_shutdown net/socket.c:2349 [inline] __x64_sys_shutdown+0x50/0x70 net/socket.c:2349 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> During SMC fallback process in connect syscall, kernel will replaces TCP with SMC. In order to forward wakeup smc socket waitqueue after fallback, kernel will sets clcsk->sk_user_data to origin smc socket in smc_fback_replace_callbacks(). Later, in shutdown syscall, kernel will calls sk_psock_get(), which treats the clcsk->sk_user_data as psock type, triggering the refcnt warning. So, the root cause is that smc and psock, both will use sk_user_data field. So they will mismatch this field easily. This patch solves it by using another bit(defined as SK_USER_DATA_PSOCK) in PTRMASK, to mark whether sk_user_data points to a psock object or not. This patch depends on a PTRMASK introduced in commit f1ff5ce2cd5e ("net, sk_msg: Clear sk_user_data pointer on clone if tagged"). For there will possibly be more flags in the sk_user_data field, this patch also refactor sk_user_data flags code to be more generic to improve its maintainability. Reported-and-tested-by: [email protected] Suggested-by: Jakub Kicinski <[email protected]> Acked-by: Wen Gu <[email protected]> Signed-off-by: Hawkins Jiawei <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-10riscv: implement Zicbom-based CMO instructions + the t-head variantPalmer Dabbelt15-9/+297
This series is based on the alternatives changes done in my svpbmt series and thus also depends on Atish's isa-extension parsing series. It implements using the cache-management instructions from the Zicbom- extension to handle cache flush, etc actions on platforms needing them. SoCs using cpu cores from T-Head like the Allwinne D1 implement a different set of cache instructions. But while they are different, instructions they provide the same functionality, so a variant can easly hook into the existing alternatives mechanism on those. [Palmer: Some minor fixups, including a RISCV_ISA_ZICBOM dependency on MMU that's probably not strictly necessary. The Zicbom support will trip up sparse for users that have new toolchains, I just sent a patch.] Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/linux-sparse/[email protected]/T/#u * palmer/riscv-zicbom: riscv: implement cache-management errata for T-Head SoCs riscv: Add support for non-coherent devices using zicbom extension dt-bindings: riscv: document cbom-block-size of: also handle dma-noncoherent in of_dma_is_coherent()
2022-08-10cifs: Remove {cifs,nfs}_fscache_release_page()David Howells1-16/+0
Remove {cifs,nfs}_fscache_release_page() from fs/cifs/fscache.h. This functionality got built directly into cifs_release_folio() and will hopefully be replaced with netfs_release_folio() at some point. The "nfs_" version is a copy and paste error and should've been altered to read "cifs_". That can also be removed. Reported-by: Matthew Wilcox <[email protected]> Signed-off-by: David Howells <[email protected]> Reviewed-by: Jeff Layton <[email protected]> cc: Steve French <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Steve French <[email protected]>
2022-08-10x86: link vdso and boot with -z noexecstack --no-warn-rwx-segmentsNick Desaulniers3-2/+6
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple instances of a new warning when linking kernels in the form: ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions Generally, we would like to avoid the stack being executable. Because there could be a need for the stack to be executable, assembler sources have to opt-in to this security feature via explicit creation of the .note.GNU-stack feature (which compilers create by default) or command line flag --noexecstack. Or we can simply tell the linker the production of such sections is irrelevant and to link the stack as --noexecstack. LLVM's LLD linker defaults to -z noexecstack, so this flag isn't strictly necessary when linking with LLD, only BFD, but it doesn't hurt to be explicit here for all linkers IMO. --no-warn-rwx-segments is currently BFD specific and only available in the current latest release, so it's wrapped in an ld-option check. While the kernel makes extensive usage of ELF sections, it doesn't use permissions from ELF segments. Link: https://lore.kernel.org/linux-block/[email protected]/ Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107 Link: https://github.com/llvm/llvm-project/issues/57009 Reported-and-tested-by: Jens Axboe <[email protected]> Suggested-by: Fangrui Song <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-08-10Makefile: link with -z noexecstack --no-warn-rwx-segmentsNick Desaulniers1-0/+5
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple instances of a new warning when linking kernels in the form: ld: warning: vmlinux: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ld: warning: vmlinux has a LOAD segment with RWX permissions Generally, we would like to avoid the stack being executable. Because there could be a need for the stack to be executable, assembler sources have to opt-in to this security feature via explicit creation of the .note.GNU-stack feature (which compilers create by default) or command line flag --noexecstack. Or we can simply tell the linker the production of such sections is irrelevant and to link the stack as --noexecstack. LLVM's LLD linker defaults to -z noexecstack, so this flag isn't strictly necessary when linking with LLD, only BFD, but it doesn't hurt to be explicit here for all linkers IMO. --no-warn-rwx-segments is currently BFD specific and only available in the current latest release, so it's wrapped in an ld-option check. While the kernel makes extensive usage of ELF sections, it doesn't use permissions from ELF segments. Link: https://lore.kernel.org/linux-block/[email protected]/ Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107 Link: https://github.com/llvm/llvm-project/issues/57009 Reported-and-tested-by: Jens Axboe <[email protected]> Suggested-by: Fangrui Song <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-08-10crypto: blake2b: effectively disable frame size warningLinus Torvalds1-0/+1
It turns out that gcc-12.1 has some nasty problems with register allocation on a 32-bit x86 build for the 64-bit values used in the generic blake2b implementation, where the pattern of 64-bit rotates and xor operations ends up making gcc generate horrible code. As a result it ends up with a ridiculously large stack frame for all the spills it generates, resulting in the following build problem: crypto/blake2b_generic.c: In function ‘blake2b_compress_one_generic’: crypto/blake2b_generic.c:109:1: error: the frame size of 2640 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] on the same test-case, clang ends up generating a stack frame that is just 296 bytes (and older gcc versions generate a slightly bigger one at 428 bytes - still nowhere near that almost 3kB monster stack frame of gcc-12.1). The issue is fixed both in mainline and the GCC 12 release branch [1], but current release compilers end up failing the i386 allmodconfig build due to this issue. Disable the warning for now by simply raising the frame size for this one file, just to keep this issue from having people turn off WERROR. Link: https://lore.kernel.org/all/CAHk-=wjxqgeG2op+=W9sqgsWqCYnavC+SRfVyopu9-31S6xw+Q@mail.gmail.com/ Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930 [1] Signed-off-by: Linus Torvalds <[email protected]>
2022-08-10xfs: fix inode reservation space for removing transactionhexiaole1-1/+1
In 'fs/xfs/libxfs/xfs_trans_resv.c', the comment for transaction of removing a directory entry writes: /* fs/xfs/libxfs/xfs_trans_resv.c begin */ /* * For removing a directory entry we can modify: * the parent directory inode: inode size * the removed inode: inode size ... xfs_calc_remove_reservation( struct xfs_mount *mp) { return XFS_DQUOT_LOGRES(mp) + xfs_calc_iunlink_add_reservation(mp) + max((xfs_calc_inode_res(mp, 1) + ... /* fs/xfs/libxfs/xfs_trans_resv.c end */ There has 2 inode size of space to be reserverd, but the actual code for inode reservation space writes. There only count for 1 inode size to be reserved in 'xfs_calc_inode_res(mp, 1)', rather than 2. Signed-off-by: hexiaole <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> [djwong: remove redundant code citations] Signed-off-by: Darrick J. Wong <[email protected]>