|
In irdma_post_recv(), the function irdma_copy_sg_list() is not needed,
since struct irdma_sge and struct ib_sge have similar member
variables. The struct irdma_sge can be replaced entirely with
struct ib_sge.
This can increase the rx performance of irdma.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Zhu Yanjun <[email protected]>
Reviewed-by: Shiraz Saleem <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
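The redundancy described in the irdma change above can be sketched in plain C. This is only an illustration under stated assumptions: the toy_ types and functions are hypothetical userspace stand-ins, the ib_sge field names (addr, length, lkey) follow include/rdma/ib_verbs.h, and the irdma_sge field names are assumed rather than taken from this log.

```c
#include <stdint.h>

/* Stand-in for struct ib_sge (fields as in include/rdma/ib_verbs.h). */
struct toy_ib_sge {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* Assumed shape of the driver-private struct irdma_sge (hypothetical names). */
struct toy_irdma_sge {
	uint64_t tag_off;
	uint32_t len;
	uint32_t stag;
};

/* The kind of per-element conversion that becomes unnecessary. */
void toy_copy_sg_list(struct toy_irdma_sge *dst,
		      const struct toy_ib_sge *src, uint32_t n)
{
	for (uint32_t i = 0; i < n; i++) {
		dst[i].tag_off = src[i].addr;
		dst[i].len = src[i].length;
		dst[i].stag = src[i].lkey;
	}
}

/* Without the private type, a consumer reads the ib_sge entries directly. */
uint64_t toy_total_len(const struct toy_ib_sge *sg, uint32_t n)
{
	uint64_t total = 0;

	for (uint32_t i = 0; i < n; i++)
		total += sg[i].length;
	return total;
}
```

Because both structures carry the same three values, the copy loop only renames fields; dropping it removes one pass over the scatter/gather list per posted receive.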
|
|
Support the DECLARE_PHY_INTERFACE_MASK() macro that is used to declare
a bitmap by converting the macro to DECLARE_BITMAP(), as has been done
for the __ETHTOOL_DECLARE_LINK_MODE_MASK() macro.
This fixes a 'make htmldocs' warning:
include/linux/phylink.h:82: warning: Function parameter or member 'DECLARE_PHY_INTERFACE_MASK(supported_interfaces' not described in 'phylink_config'
that was introduced by commit
38c310eb46f5 ("net: phylink: add MAC phy_interface_t bitmap")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: Stephen Rothwell <[email protected]>
Cc: Russell King (Oracle) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
|
|
Translate Documentation/core-api/xarray.rst into Chinese
Signed-off-by: Yanteng Si <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
Link: https://lore.kernel.org/r/2a125bcb3220e7c1b72ae87bcad1b225dd950338.1634358018.git.siyanteng@loongson.cn
Signed-off-by: Jonathan Corbet <[email protected]>
|
|
Translate Documentation/core-api/assoc_array.rst into Chinese.
Signed-off-by: Yanteng Si <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
Link: https://lore.kernel.org/r/860ac85d9a2a83c2b63eb8d1be929ad64280d7b2.1634358018.git.siyanteng@loongson.cn
Signed-off-by: Jonathan Corbet <[email protected]>
|
|
Pull block inode sync updates from Jens Axboe:
"This contains improvements to how bdev inode syncing is handled,
unifying the API"
* tag 'for-5.16/inode-sync-2021-10-29' of git://git.kernel.dk/linux-block:
block: simplify the block device syncing code
ntfs3: use sync_blockdev_nowait
fat: use sync_blockdev_nowait
btrfs: use sync_blockdev
xen-blkback: use sync_blockdev
block: remove __sync_blockdev
fs: remove __sync_filesystem
|
|
There is a typo in the speakup documentation. Fix it.
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Samuel Thibault <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
|
|
Pull kiocb->ki_complete() cleanup from Jens Axboe:
"This removes the res2 argument from kiocb->ki_complete().
Only the USB gadget code used it, everybody else passes 0. The USB
guys checked the user gadget code they could find, and everybody just
uses res as expected for the async interface"
* tag 'for-5.16/ki_complete-2021-10-29' of git://git.kernel.dk/linux-block:
fs: get rid of the res2 iocb->ki_complete argument
usb: remove res2 argument from gadget code completions
|
|
git://git.kernel.dk/linux-block
Pull QUEUE_FLAG_SCSI_PASSTHROUGH removal from Jens Axboe:
"This contains a series leading to the removal of the
QUEUE_FLAG_SCSI_PASSTHROUGH queue flag"
* tag 'for-5.16/passthrough-flag-2021-10-29' of git://git.kernel.dk/linux-block:
block: remove blk_{get,put}_request
block: remove QUEUE_FLAG_SCSI_PASSTHROUGH
block: remove the initialize_rq_fn blk_mq_ops method
scsi: add a scsi_alloc_request helper
bsg-lib: initialize the bsg_job in bsg_transport_sg_io_fn
nfsd/blocklayout: use ->get_unique_id instead of sending SCSI commands
sd: implement ->get_unique_id
block: add a ->get_unique_id method
|
|
Pull CDROM updates from Jens Axboe:
"On behalf of Phillip, here are the CDROM updates for the 5.16-rc1
merge window:
- Add ioctl for improved media change detection (Lukas)
- Reformat some documentation (Phillip)
- Redundant variable removal (luo)"
* tag 'for-5.16/cdrom-2021-10-29' of git://git.kernel.dk/linux-block:
cdrom: Remove redundant variable and its assignment
cdrom: docs: reformat table in Documentation/userspace-api/ioctl/cdrom.rst
drivers/cdrom: improved ioctl for media change detection
|
|
Pull SCSI multi-actuator support from Jens Axboe:
"This adds SCSI support for the recently merged block multi-actuator
support. Since this was sitting on top of the block tree, the SCSI
side asked me to queue it up."
* tag 'for-5.16/scsi-ma-2021-10-29' of git://git.kernel.dk/linux-block:
doc: Fix typo in request queue sysfs documentation
doc: document sysfs queue/independent_access_ranges attributes
libata: support concurrent positioning ranges log
scsi: sd: add concurrent positioning ranges support
|
|
Pull bdev size cleanups from Jens Axboe:
"Clean up the bdev size handling with new bdev_nr_bytes() helper"
* tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-block: (34 commits)
partitions/ibm: use bdev_nr_sectors instead of open coding it
partitions/efi: use bdev_nr_bytes instead of open coding it
block/ioctl: use bdev_nr_sectors and bdev_nr_bytes
block: cache inode size in bdev
udf: use sb_bdev_nr_blocks
reiserfs: use sb_bdev_nr_blocks
ntfs: use sb_bdev_nr_blocks
jfs: use sb_bdev_nr_blocks
ext4: use sb_bdev_nr_blocks
block: add a sb_bdev_nr_blocks helper
block: use bdev_nr_bytes instead of open coding it in blkdev_fallocate
squashfs: use bdev_nr_bytes instead of open coding it
reiserfs: use bdev_nr_bytes instead of open coding it
pstore/blk: use bdev_nr_bytes instead of open coding it
ntfs3: use bdev_nr_bytes instead of open coding it
nilfs2: use bdev_nr_bytes instead of open coding it
nfs/blocklayout: use bdev_nr_bytes instead of open coding it
jfs: use bdev_nr_bytes instead of open coding it
hfsplus: use bdev_nr_sectors instead of open coding it
hfs: use bdev_nr_sectors instead of open coding it
...
|
|
In commit 324bda9e6c5a ("bpf: multi program support for cgroup+bpf")
the cgroup_bpf_*() functions were called from kernel/bpf/syscall.c, but
now they are only used in kernel/bpf/cgroup.c, so move these functions
to kernel/bpf/cgroup.c, like cgroup_bpf_replace().
Signed-off-by: He Fengqing <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
In account_guest_time in kernel/sched/cputime.c guest time is
attributed to both CPUTIME_NICE and CPUTIME_USER in addition to
CPUTIME_GUEST_NICE and CPUTIME_GUEST respectively. Therefore, adding
both to calculate usage results in double counting any guest time at
the rootcg.
Fixes: 936f2a70f207 ("cgroup: add cpu.stat file to root cgroup")
Signed-off-by: Dan Schatzberg <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
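The double counting described in the cputime fix above can be shown with a small model. This is a sketch, not the kernel code: the toy_ names are hypothetical, and the struct keeps only the four buckets the commit message mentions.

```c
#include <stdint.h>

/* Hypothetical subset of the per-CPU time buckets named above. */
struct toy_cpustat {
	uint64_t user;        /* CPUTIME_USER */
	uint64_t nice;        /* CPUTIME_NICE */
	uint64_t guest;       /* CPUTIME_GUEST */
	uint64_t guest_nice;  /* CPUTIME_GUEST_NICE */
};

/* Models the accounting described above: guest time lands in two buckets. */
void toy_account_guest_time(struct toy_cpustat *s, uint64_t delta, int niced)
{
	if (niced) {
		s->nice += delta;
		s->guest_nice += delta;
	} else {
		s->user += delta;
		s->guest += delta;
	}
}

/* Adding the guest buckets on top counts every guest tick twice. */
uint64_t toy_user_usage_double_counted(const struct toy_cpustat *s)
{
	return s->user + s->nice + s->guest + s->guest_nice;
}

/* user + nice already contain the guest time. */
uint64_t toy_user_usage(const struct toy_cpustat *s)
{
	return s->user + s->nice;
}
```

With 100 ticks of guest time and 50 ticks of niced guest time, the first sum reports 300 while the second reports the 150 ticks that actually elapsed.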
|
|
Pull io_uring updates from Jens Axboe:
"Light on new features - basically just the hybrid mode support.
Outside of that it's just fixes, cleanups, and performance
improvements.
In detail:
- Add ring related information to the fdinfo output (Hao)
- Hybrid async mode (Hao)
- Support for batched issue on block (me)
- sqe error trace improvement (me)
- IOPOLL efficiency improvements (Pavel)
- submit state cleanups and improvements (Pavel)
- Completion side improvements (Pavel)
- Drain improvements (Pavel)
- Buffer selection cleanups (Pavel)
- Fixed file node improvements (Pavel)
- io-wq setup cancelation fix (Pavel)
- Various other performance improvements and cleanups (Pavel)
- Misc fixes (Arnd, Bixuan, Changcheng, Hao, me, Noah)"
* tag 'for-5.16/io_uring-2021-10-29' of git://git.kernel.dk/linux-block: (97 commits)
io-wq: remove worker to owner tw dependency
io_uring: harder fdinfo sq/cq ring iterating
io_uring: don't assign write hint in the read path
io_uring: clusterise ki_flags access in rw_prep
io_uring: kill unused param from io_file_supports_nowait
io_uring: clean up timeout async_data allocation
io_uring: don't try io-wq polling if not supported
io_uring: check if opcode needs poll first on arming
io_uring: clean iowq submit work cancellation
io_uring: clean io_wq_submit_work()'s main loop
io-wq: use helper for worker refcounting
io_uring: implement async hybrid mode for pollable requests
io_uring: Use ERR_CAST() instead of ERR_PTR(PTR_ERR())
io_uring: split logic of force_nonblock
io_uring: warning about unused-but-set parameter
io_uring: inform block layer of how many requests we are submitting
io_uring: simplify io_file_supports_nowait()
io_uring: combine REQ_F_NOWAIT_{READ,WRITE} flags
io_uring: arm poll for non-nowait files
fs/io_uring: Prioritise checking faster conditions first in io_write
...
|
|
Pull block driver updates from Jens Axboe:
- paride driver cleanups (Christoph)
- Remove cryptoloop support (Christoph)
- null_blk poll support (me)
- Now that add_disk() supports proper error handling, add it to various
drivers (Luis)
- Make ataflop actually work again (Michael)
- s390 dasd fixes (Stefan, Heiko)
- nbd fixes (Yu, Ye)
- Remove redundant wq flush in mtip32xx (Christophe)
- NVMe updates
- fix a multipath partition scanning deadlock (Hannes Reinecke)
- generate uevent once a multipath namespace is operational again
(Hannes Reinecke)
- support unique discovery controller NQNs (Hannes Reinecke)
- fix use-after-free when a port is removed (Israel Rukshin)
- clear shadow doorbell memory on resets (Keith Busch)
- use struct_size (Len Baker)
- add error handling support for add_disk (Luis Chamberlain)
- limit the maximal queue size for RDMA controllers (Max Gurtovoy)
- use a few more symbolic names (Max Gurtovoy)
- fix error code in nvme_rdma_setup_ctrl (Max Gurtovoy)
- add support for ->map_queues on FC (Saurav Kashyap)
- support the current discovery subsystem entry (Hannes Reinecke)
- use flex_array_size and struct_size (Len Baker)
- bcache fixes (Christoph, Coly, Chao, Lin, Qing)
- MD updates (Christoph, Guoqing, Xiao)
- Misc fixes (Dan, Ding, Jiapeng, Shin'ichiro, Ye)
* tag 'for-5.16/drivers-2021-10-29' of git://git.kernel.dk/linux-block: (117 commits)
null_blk: Fix handling of submit_queues and poll_queues attributes
block: ataflop: Fix warning comparing pointer to 0
bcache: replace snprintf in show functions with sysfs_emit
bcache: move uapi header bcache.h to bcache code directory
nvmet: use flex_array_size and struct_size
nvmet: register discovery subsystem as 'current'
nvmet: switch check for subsystem type
nvme: add new discovery log page entry definitions
block: ataflop: more blk-mq refactoring fixes
block: remove support for cryptoloop and the xor transfer
mtd: add add_disk() error handling
rnbd: add error handling support for add_disk()
um/drivers/ubd_kern: add error handling support for add_disk()
m68k/emu/nfblock: add error handling support for add_disk()
xen-blkfront: add error handling support for add_disk()
bcache: add error handling support for add_disk()
dm: add add_disk() error handling
block: aoe: fixup coccinelle warnings
nvmet: use struct_size over open coded arithmetic
nvme: drop scan_lock and always kick requeue list when removing namespaces
...
|
|
Pull block updates from Jens Axboe:
- mq-deadline accounting improvements (Bart)
- blk-wbt timer fix (Andrea)
- Untangle the block layer includes (Christoph)
- Rework the poll support to be bio based, which will enable adding
support for polling for bio based drivers (Christoph)
- Block layer core support for multi-actuator drives (Damien)
- blk-crypto improvements (Eric)
- Batched tag allocation support (me)
- Request completion batching support (me)
- Plugging improvements (me)
- Shared tag set improvements (John)
- Concurrent queue quiesce support (Ming)
- Cache bdev in ->private_data for block devices (Pavel)
- bdev dio improvements (Pavel)
- Block device invalidation and block size improvements (Xie)
- Various cleanups, fixes, and improvements (Christoph, Jackie,
Masahiro, Tejun, Yu, Pavel, Zheng, me)
* tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block: (174 commits)
blk-mq-debugfs: Show active requests per queue for shared tags
block: improve readability of blk_mq_end_request_batch()
virtio-blk: Use blk_validate_block_size() to validate block size
loop: Use blk_validate_block_size() to validate block size
nbd: Use blk_validate_block_size() to validate block size
block: Add a helper to validate the block size
block: re-flow blk_mq_rq_ctx_init()
block: prefetch request to be initialized
block: pass in blk_mq_tags to blk_mq_rq_ctx_init()
block: add rq_flags to struct blk_mq_alloc_data
block: add async version of bio_set_polled
block: kill DIO_MULTI_BIO
block: kill unused polling bits in __blkdev_direct_IO()
block: avoid extra iter advance with async iocb
block: Add independent access ranges support
blk-mq: don't issue request directly in case that current is to be blocked
sbitmap: silence data race warning
blk-cgroup: synchronize blkg creation against policy deactivation
block: refactor bio_iov_bvec_set()
block: add single bio async direct IO helper
...
|
|
This patch is closely related to commit 6016df8fe874 ("selftests/bpf:
Fix broken riscv build"). When clang includes the system include
directories but targets a BPF program, __BITS_PER_LONG defaults to
32 unless explicitly set. Work around this problem by explicitly
setting __BITS_PER_LONG to __riscv_xlen.
Signed-off-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Add macros for 64-bit RISC-V PT_REGS to bpf_tracing.h.
Signed-off-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Add RISC-V to the HOSTARCH parsing, so that ARCH is "riscv", and not
"riscv32" or "riscv64".
This affects the perf and libbpf builds, so that arch specific
includes are correctly picked up for RISC-V.
Signed-off-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Now that BPF programs can be up to 1M instructions, it is not uncommon
that a program requires more than the current 16 iterations to
converge.
Bump it to 32, which is enough for selftests/bpf, and test_bpf.ko.
Signed-off-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
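Why a JIT can need several passes to converge, as the iteration-limit bump above implies, can be sketched with a toy model. Everything here is an assumption made for illustration: the toy_ names, the 4- and 8-byte instruction sizes, and the 16-byte "near branch" threshold are not taken from the RISC-V JIT.

```c
#define TOY_MAX_INSNS 8

/* Hypothetical toy instruction: target < 0 means "not a branch". */
struct toy_insn {
	int target; /* index of the branch target, or -1 */
};

/*
 * One sizing pass: recompute every instruction size from the offsets of
 * the previous pass, store the new offsets, and return the total size.
 */
static int toy_pass(const struct toy_insn *prog, int n, int *offset)
{
	int new_off[TOY_MAX_INSNS + 1];
	int i;

	new_off[0] = 0;
	for (i = 0; i < n; i++) {
		int size = 4;

		if (prog[i].target >= 0) {
			int dist = offset[prog[i].target] - offset[i];

			if (dist < 0)
				dist = -dist;
			/* Far branches need a longer sequence. */
			size = dist <= 16 ? 4 : 8;
		}
		new_off[i + 1] = new_off[i] + size;
	}
	for (i = 0; i <= n; i++)
		offset[i] = new_off[i];
	return offset[n];
}

/* Returns the number of passes until the size is stable, or -1. */
int toy_jit_converge(const struct toy_insn *prog, int n, int max_iter)
{
	int offset[TOY_MAX_INSNS + 1];
	int prev = -1;
	int i, pass;

	/* Start pessimistic: assume every instruction has maximum size. */
	for (i = 0; i <= n; i++)
		offset[i] = i * 8;

	for (pass = 1; pass <= max_iter; pass++) {
		int total = toy_pass(prog, n, offset);

		if (total == prev)
			return pass;
		prev = total;
	}
	return -1;
}
```

Each pass can shrink a branch, which moves later offsets and may let other branches shrink in the next pass. Larger programs therefore tend to need more passes, and a fixed limit that is too low makes the loop give up before the layout is stable.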
|
|
Add a test to check that sockmap with strparser works correctly.
Signed-off-by: Liu Jian <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
After "skmsg: lose offset info in sk_psock_skb_ingress", the test case
with ktls failed. This is because the ktls parser (tls_read_size) return
value is 285, not 256.
The case looks like this:
tls_sk1 --> redir_sk --> tls_sk2
tls_sk1 sent out 512 bytes of data; after TLS-related processing, redir_sk
received 570 bytes of data and redirected 512 (skb_use_parser) bytes of data
to tls_sk2. But tls_sk2 needs 285 * 2 bytes of data, so a receive timeout
occurred.
Signed-off-by: Liu Jian <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
If sockmap enables strparser, offset info is lost in
sk_psock_skb_ingress(). If the length determined by the parse_msg function is
not skb->len, the skb will be converted to sk_msg multiple times, and the
userspace app will get the data multiple times.
Fix this by getting the offset and length from strp_msg. And as Cong
suggested, add one bit in skb->_sk_redir to distinguish whether strparser is
enabled or disabled.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
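The idea behind the sockmap fix above, taking only the parsed message's offset and length instead of the whole buffer, can be sketched in userspace C. The toy_ names are hypothetical stand-ins and do not reproduce the kernel's strp_msg handling.

```c
#include <string.h>

/* Hypothetical stand-in for the parser result for one message. */
struct toy_strp_msg {
	int offset;   /* where the message starts in the buffer */
	int full_len; /* length of the message as decided by the parser */
};

/*
 * Copy exactly one parsed message out of buf. Returns the number of
 * bytes copied, or -1 if the message does not fit the buffers.
 */
int toy_ingress(const char *buf, int buf_len,
		const struct toy_strp_msg *msg, char *out, int out_cap)
{
	if (msg->offset < 0 || msg->full_len < 0 ||
	    msg->offset + msg->full_len > buf_len ||
	    msg->full_len > out_cap)
		return -1;

	memcpy(out, buf + msg->offset, msg->full_len);
	return msg->full_len;
}
```

If a buffer holds two 4-byte messages and each is delivered with its own offset and length, the receiver sees each message once. Delivering the whole buffer for every parsed message would hand the same bytes to userspace repeatedly, which is the symptom the commit describes.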
|
|
After the most recent nightly Clang update, the strobemeta selftests started
failing with the following error (relevant portion of assembly included):
1624: (85) call bpf_probe_read_user_str#114
1625: (bf) r1 = r0
1626: (18) r2 = 0xfffffffe
1628: (5f) r1 &= r2
1629: (55) if r1 != 0x0 goto pc+7
1630: (07) r9 += 104
1631: (6b) *(u16 *)(r9 +0) = r0
1632: (67) r0 <<= 32
1633: (77) r0 >>= 32
1634: (79) r1 = *(u64 *)(r10 -456)
1635: (0f) r1 += r0
1636: (7b) *(u64 *)(r10 -456) = r1
1637: (79) r1 = *(u64 *)(r10 -368)
1638: (c5) if r1 s< 0x1 goto pc+778
1639: (bf) r6 = r8
1640: (0f) r6 += r7
1641: (b4) w1 = 0
1642: (6b) *(u16 *)(r6 +108) = r1
1643: (79) r3 = *(u64 *)(r10 -352)
1644: (79) r9 = *(u64 *)(r10 -456)
1645: (bf) r1 = r9
1646: (b4) w2 = 1
1647: (85) call bpf_probe_read_user_str#114
R1 unbounded memory access, make sure to bounds check any such access
In the above code r0 and r1 are implicitly related. Clang knows that,
but the verifier isn't able to infer this relationship.
Yonghong Song narrowed down this "regression" in code generation to
a recent Clang optimization change ([0]), which for the BPF target generates
a code pattern that the BPF verifier can't handle, so it loses track of
register boundaries.
This patch works around the issue by adding a BPF assembly-based helper
that helps to prove to the verifier that the upper bound of the register is
a given constant by controlling the exact shape of the generated BPF
instruction sequence. This fixes the immediate issue for the strobemeta
selftest.
[0] https://github.com/llvm/llvm-project/commit/acabad9ff6bf13e00305d9d8621ee8eafc1f8b08
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"Most of this is just follow-on cleanup work of documentation and
comments from the mandatory locking removal in v5.15.
The only real functional change is that LOCK_MAND flock() support is
also being removed, as it has basically been non-functional since the
v2.5 days"
* tag 'locks-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs: remove leftover comments from mandatory locking removal
locks: remove changelog comments
docs: fs: locks.rst: update comment about mandatory file locking
Documentation: remove reference to now removed mandatory-locking doc
locks: remove LOCK_MAND flock lock support
|
|
Disabling unprivileged BPF would help prevent unprivileged users from
creating certain conditions required for potential speculative execution
side-channel attacks on unmitigated affected hardware.
A deep dive on such attacks and current mitigations is available here [0].
Sync with what many distros are currently applying already, and disable
unprivileged BPF by default. An admin can enable this at runtime, if
necessary, as described in 08389d888287 ("bpf: Add kconfig knob for
disabling unpriv bpf by default").
[0] "BPF and Spectre: Mitigating transient execution attacks", Daniel Borkmann, eBPF Summit '21
https://ebpf.io/summit-2021-slides/eBPF_Summit_2021-Keynote-Daniel_Borkmann-BPF_and_Spectre.pdf
Signed-off-by: Pawan Gupta <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/bpf/0ace9ce3f97656d5f62d11093ad7ee81190c3c25.1635535215.git.pawan.kumar.gupta@linux.intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"Only bug fixes"
* tag 'tpmdd-next-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm_tis_spi: Add missing SPI ID
tpm: fix Atmel TPM crash caused by too frequent queries
tpm: Check for integer overflow in tpm2_map_response_body()
tpm: tis: Kconfig: Add helper dependency on COMPILE_TEST
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v5.16
This is an unusually large set of updates, mostly a large crop of
unusually big drivers coupled with extensive overhauls of existing code.
There's a SH change here for the DAI format terminology, the change is
straightforward and the SH maintainers don't seem very active.
- A new version of the audio graph card which supports a wider range of
systems.
- Move of the Cirrus DSP framework into drivers/firmware to allow for
future use by non-audio DSPs.
- Several conversions to YAML DT bindings.
- Continuing cleanups to the SOF and Intel code.
- A very big overhaul of the cs42l42 driver, correcting many problems.
- Support for AMD Vangogh and Yellow Carp, Cirrus CS35L41, Maxim
MAX98520 and MAX98360A, Mediatek MT8195, Nuvoton NAU8821, nVidia
Tegra210, NXP i.MX8ULP, Qualcomm AudioReach, Realtek ALC5682I-VS,
RT5682S, and RT9120 and Rockchip RV1126 and RK3568.
|
|
Pull memory folios from Matthew Wilcox:
"Add memory folios, a new type to represent either order-0 pages or the
head page of a compound page. This should be enough infrastructure to
support filesystems converting from pages to folios.
The point of all this churn is to allow filesystems and the page cache
to manage memory in larger chunks than PAGE_SIZE. The original plan
was to use compound pages like THP does, but I ran into problems with
some functions expecting only a head page while others expect the
precise page containing a particular byte.
The folio type allows a function to declare that it's expecting only a
head page. Almost incidentally, this allows us to remove various calls
to VM_BUG_ON(PageTail(page)) and compound_head().
This converts just parts of the core MM and the page cache. For 5.17,
we intend to convert various filesystems (XFS and AFS are ready; other
filesystems may make it) and also convert more of the MM and page
cache to folios. For 5.18, multi-page folios should be ready.
The multi-page folios offer some improvement to some workloads. The
80% win is real, but appears to be an artificial benchmark (postgres
startup, which isn't a serious workload). Real workloads (eg building
the kernel, running postgres in a steady state, etc) seem to benefit
between 0-10%. I haven't heard of any performance losses as a result
of this series. Nobody has done any serious performance tuning; I
imagine that tweaking the readahead algorithm could provide some more
interesting wins. There are also other places where we could choose to
create large folios and currently do not, such as writes that are
larger than PAGE_SIZE.
I'd like to thank all my reviewers who've offered review/ack tags:
Christoph Hellwig, David Howells, Jan Kara, Jeff Layton, Johannes
Weiner, Kirill A. Shutemov, Michal Hocko, Mike Rapoport, Vlastimil
Babka, William Kucharski, Yu Zhao and Zi Yan.
I'd also like to thank those who gave feedback I incorporated but
haven't offered up review tags for this part of the series: Nick
Piggin, Mel Gorman, Ming Lei, Darrick Wong, Ted Ts'o, John Hubbard,
Hugh Dickins, and probably a few others who I forget"
* tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache: (90 commits)
mm/writeback: Add folio_write_one
mm/filemap: Add FGP_STABLE
mm/filemap: Add filemap_get_folio
mm/filemap: Convert mapping_get_entry to return a folio
mm/filemap: Add filemap_add_folio()
mm/filemap: Add filemap_alloc_folio
mm/page_alloc: Add folio allocation functions
mm/lru: Add folio_add_lru()
mm/lru: Convert __pagevec_lru_add_fn to take a folio
mm: Add folio_evictable()
mm/workingset: Convert workingset_refault() to take a folio
mm/filemap: Add readahead_folio()
mm/filemap: Add folio_mkwrite_check_truncate()
mm/filemap: Add i_blocks_per_folio()
mm/writeback: Add folio_redirty_for_writepage()
mm/writeback: Add folio_account_redirty()
mm/writeback: Add folio_clear_dirty_for_io()
mm/writeback: Add folio_cancel_dirty()
mm/writeback: Add folio_account_cleaned()
mm/writeback: Add filemap_dirty_folio()
...
|
|
The commit 23efd0804c0a869dfb1e7 ("vsprintf: Make %pGp print
the hex value") changed the behavior of %pGp printk format.
Update the documentation accordingly.
Fixes: 23efd0804c0a869dfb1e7 ("vsprintf: Make %pGp print the hex value")
Reviewed-by: Yafang Shao <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/YXlKqCPY9suM4mfT@alley
|
|
There are a lot of warnings due to unused protocol constants, but I believe
it's good to leave them in the sources for documentation purposes and for
further development.
Switch them over from static consts to macros to avoid the warnings.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
The driver requires multicolor LED support; express that in Kconfig.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
Tony Lu says:
====================
Tracepoints for SMC
This patch set introduces tracepoints for SMC, including the basic
tracepoint code. The tracepoints help us track SMC's behavior with
automatic tools or other BPF tools, and have zero overhead if not enabled.
Compared with kprobes and other dynamic tools, tracepoints are
considered a stable API, with less tracing overhead and an easy-to-use
API.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The SMC-R link down event is important for finding link issues, so we
should track it, especially in single-NIC mode, where it means the
upper-layer connection will be shut down. This lets us find the direct
link-down reason in time: not only is the counter increased, but the
location of the code that triggered the event is also reported.
Signed-off-by: Tony Lu <[email protected]>
Reviewed-by: Wen Gu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This introduces two tracepoints for smc tx and rx msg to help us
diagnose data path issues. These two tracepoints don't cover the
path of CORK or MSG_MORE in tx, just the top half of the data path.
Signed-off-by: Tony Lu <[email protected]>
Reviewed-by: Wen Gu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This introduces a tracepoint for smc fallback to TCP, so that we can track
which connection falls back and why, and map the clcsock pointers to
/proc/net/tcp to find more details about the TCP connections. Compared with
kprobes or other dynamic tracing, tracepoints are stable and easy to use.
Signed-off-by: Tony Lu <[email protected]>
Reviewed-by: Wen Gu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Taehee Yoo says:
====================
amt: add initial driver for Automatic Multicast Tunneling (AMT)
This is an implementation of AMT (Automatic Multicast Tunneling), RFC 7450.
https://datatracker.ietf.org/doc/html/rfc7450
This implementation supports IGMPv2, IGMPv3, MLDv1, MLDv2, and IPv4
underlay.
Summary of RFC 7450
The purpose of this protocol is to provide multicast tunneling.
The main use case of this protocol is to deliver multicast traffic
from a multicast-enabled network to sites that lack multicast
connectivity to the source network.
There are two roles in the AMT protocol: Gateway and Relay.
The main purpose of Gateway mode is to forward multicast listening
information (IGMP, MLD) to the source.
The main purpose of Relay mode is to forward multicast data to listeners.
This multicast traffic (IGMP, MLD, and multicast data packets) is tunneled.
Listeners are located behind the Gateway endpoint, but the gateway itself
can be a listener too.
Senders are located behind the Relay endpoint.
 ___________       _________       _______       ________
|           |     |         |     |       |     |        |
| Listeners <-----> Gateway <-----> Relay <-----> Source |
|___________|     |_________|     |_______|     |________|
   IGMP/MLD---------(encap)----------->
      <-------------(decap)--------(encap)------Multicast Data
Usage of AMT interface
1. Create gateway interface
ip link add amtg type amt mode gateway local 10.0.0.1 discovery 10.0.0.2 \
dev gw1_rt gateway_port 2268 relay_port 2268
2. Create Relay interface
ip link add amtr type amt mode relay local 10.0.0.2 dev relay_rt \
relay_port 2268 max_tunnels 4
v1 -> v2:
- Eliminate sparse warnings.
- Use bool type instead of __be16 for identifying v4/v6 protocol.
v2 -> v3:
- Fix compile warning due to unused variable.
- Add missing spinlock comment.
- Update help message of amt in Kconfig.
v3 -> v4:
- Split patch.
- Use CHECKSUM_NONE instead of CHECKSUM_UNNECESSARY.
- Fix compile error.
v4 -> v5:
- Remove unnecessary rcu_read_lock().
- Remove unnecessary amt_change_mtu().
- Change netlink error message.
- Add validation for IFLA_AMT_LOCAL_IP and IFLA_AMT_DISCOVERY_IP.
- Add comments in amt.h.
- Add missing dev_put() in error path of amt_newlink().
- Fix typo.
- Add BUILD_BUG_ON() in amt_smb_cb().
- Use macro instead of magic values.
- Use kzalloc() instead of kmalloc().
- Add selftest script.
v5 -> v6:
- Reset remote_ip in amt_dev_stop().
v6 -> v7:
- Fix compile error.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
This is a selftest script for the amt interface.
This script includes a basic forwarding scenario and a torture scenario.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the previous patch, igmp report handler was added.
That handler can be used for mld too.
So, it uses that common code to parse mld report message.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The amt 'Relay' interface manages multicast groups (igmp/mld) and sources.
To do so, it needs to parse igmp/mld report messages. So, this adds the
logic for parsing igmp report messages and saving them in their own data
structures.
struct amt_group_node means one group(igmp/mld).
struct amt_source_node means one source.
The same source can't exist in the same group.
The same group can exist in the same tunnel because it manages
the host address too.
The group information is used when forwarding multicast data.
If there are no groups in the specific tunnel, Relay doesn't forward it.
Although Relay manages sources, it doesn't support the source filtering
feature: sources are tracked only in order to manage groups more
accurately.
In the next patch, MLD part will be added.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Before forwarding multicast traffic, the amt interface establishes a tunnel
between gateway and relay. To establish it, amt defines several message
types, and the message flow looks like the below.
                   Gateway                  Relay
                   -------                  -----
                      :        Request        :
                  [1] |           N           |
                      |---------------------->|
                      |   Membership Query    | [2]
                      |   N,MAC,gADDR,gPORT   |
                      |<======================|
                  [3] |   Membership Update   |
                      |  ({G:INCLUDE({S})})   |
                      |======================>|
                      |                       |
 ---------------------:-----------------------:---------------------
|                     |                       |                     |
|                     |    *Multicast Data    |  *IP Packet(S,G)    |
|                     |      gADDR,gPORT      |<-----------------() |
|    *IP Packet(S,G)  |<======================|                     |
| ()<-----------------|                       |                     |
|                     |                       |                     |
 ---------------------:-----------------------:---------------------
                      ~                       ~
                      ~        Request        ~
                  [4] |           N'          |
                      |---------------------->|
                      |   Membership Query    | [5]
                      | N',MAC',gADDR',gPORT' |
                      |<======================|
                  [6] |                       |
                      |       Teardown        |
                      |   N,MAC,gADDR,gPORT   |
                      |---------------------->|
                      |                       | [7]
                      |   Membership Update   |
                      |  ({G:INCLUDE({S})})   |
                      |======================>|
                      |                       |
 ---------------------:-----------------------:---------------------
|                     |                       |                     |
|                     |    *Multicast Data    |  *IP Packet(S,G)    |
|                     |     gADDR',gPORT'     |<-----------------() |
|   *IP Packet (S,G)  |<======================|                     |
| ()<-----------------|                       |                     |
|                     |                       |                     |
 ---------------------:-----------------------:---------------------
                      |                       |
                      :                       :
1. Discovery
- Sent by Gateway to Relay
- Used to find the Relay's unique IP address
2. Advertisement
- Sent by Relay to Gateway
- Contains the unique IP address
3. Request
- Sent by Gateway to Relay
- Solicits a 'Query' message
4. Query
- Sent by Relay to Gateway
- Contains a General Query message
5. Update
- Sent by Gateway to Relay
- Contains a report message
6. Multicast Data
- Sent by Relay to Gateway
- Contains encapsulated multicast traffic
7. Teardown
- Not supported at this time
All messages except Teardown are supported.
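The message list above can be sketched as a small C table. This is only an
illustration: the identifiers (amt_msg_sketch, amt_sk_sender, and so on) are
hypothetical names chosen here, not the names used by the patch, and the
values simply follow the numbering of the list.

```c
#include <stdbool.h>

/* Message types, numbered as in the list above. Hypothetical names. */
enum amt_msg_sketch {
	AMT_SK_DISCOVERY = 1,
	AMT_SK_ADVERTISEMENT,
	AMT_SK_REQUEST,
	AMT_SK_QUERY,
	AMT_SK_UPDATE,
	AMT_SK_MULTICAST_DATA,
	AMT_SK_TEARDOWN,
};

enum amt_sender_sketch { AMT_FROM_GATEWAY, AMT_FROM_RELAY };

/*
 * Report which side sends a given message. Returns false for messages
 * that are not supported (Teardown) or unknown; *from is left untouched
 * in that case.
 */
static bool amt_sk_sender(enum amt_msg_sketch type,
			  enum amt_sender_sketch *from)
{
	switch (type) {
	case AMT_SK_DISCOVERY:
	case AMT_SK_REQUEST:
	case AMT_SK_UPDATE:
		*from = AMT_FROM_GATEWAY;
		return true;
	case AMT_SK_ADVERTISEMENT:
	case AMT_SK_QUERY:
	case AMT_SK_MULTICAST_DATA:
		*from = AMT_FROM_RELAY;
		return true;
	case AMT_SK_TEARDOWN:	/* not supported at this time */
	default:
		return false;
	}
}
```

A receive path could use such a helper to drop messages that arrive from the
wrong side before doing any further parsing.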
In the next patch, IGMP/MLD logic will be added.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This adds definitions and control plane code for AMT.
It is very similar to UDP tunneling interfaces such as gtp and vxlan.
In the next patch, data plane code will be added.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jakub Kicinski says:
====================
netdevsim: improve separation between device and bus
VF config falls strangely in between device and bus
responsibilities today. Because of this bus.c sticks fingers
directly into struct nsim_dev and we look at nsim_bus_dev
in many more places than necessary.
Make bus.c contain pure interface code, and move
the particulars of the logic (which touch on eswitch,
devlink reloads etc) to dev.c. Rename the functions
at the boundary of the interface to make the separation
clearer.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Rename functions serving as driver entry points
from nsim_dev_... to nsim_drv_...; this makes the
API boundary between bus and dev clearer.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
max_vfs is a strange little beast because the file
hangs off of nsim's debugfs, but it configures a field
in the bus device. Move it to dev.c, let's look at it
as if the device driver was imposing VF limit based
on FW info (like pci_sriov_set_totalvfs()).
Again, while moving, refactor the function so that it does not
hold the vfs lock pointlessly while parsing the input.
Wrap the access from the read side in READ_ONCE()
to appease concurrency checkers. Do not check if
return value from snprintf() is negative...
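The locking change described above can be sketched in userspace C: parse the
input first, take the lock only for the update, and read the field with a
single volatile load. Everything here is an assumption made for illustration:
struct sketch_dev, the sketch_* helpers, the flag standing in for the vfs
lock, the -EBUSY rule, and READ_ONCE_SKETCH (a simplified imitation of the
kernel's READ_ONCE() that relies on the GNU __typeof__ extension).

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's READ_ONCE(); GNU C extension. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* A flag stands in for the vfs lock so the sketch needs no threads. */
struct sketch_lock { int held; };
static void sketch_lock_acquire(struct sketch_lock *l) { l->held = 1; }
static void sketch_lock_release(struct sketch_lock *l) { l->held = 0; }

struct sketch_dev {
	struct sketch_lock vfs_lock;
	unsigned int max_vfs;
	unsigned int num_vfs;
};

static int sketch_max_vfs_write(struct sketch_dev *dev, const char *buf)
{
	unsigned long val;
	char *end;
	int ret = 0;

	/* Parse first: inspecting the input needs no lock. */
	if (!isdigit((unsigned char)buf[0]))
		return -EINVAL;
	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || (*end != '\0' && *end != '\n') || val > UINT_MAX)
		return -EINVAL;

	/* Hold the lock only around the state change itself. */
	sketch_lock_acquire(&dev->vfs_lock);
	if (dev->num_vfs)	/* assumed rule: no change while VFs exist */
		ret = -EBUSY;
	else
		dev->max_vfs = (unsigned int)val;
	sketch_lock_release(&dev->vfs_lock);

	return ret;
}

/* Lockless read side: one load that the compiler may not split or repeat. */
static unsigned int sketch_max_vfs_read(struct sketch_dev *dev)
{
	return READ_ONCE_SKETCH(dev->max_vfs);
}
```

Rejecting bad input before the lock is taken keeps the critical section down
to the comparison and the store.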
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Since "eswitch" configuration was added, bus.c contains
a lot of device details which really belong to dev.c.
Restructure the code while moving it.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When netdevsim got split into the faux bus, vfconfig ended
up in the bus device (think pci_dev), which is strange because
it contains very networky, not to say netdevy, information.
Move it to nsim_dev, which is the driver "priv" structure
for the device.
To make sure we don't race with probe/remove, take
the device lock (much like PCI).
While at it remove the NULL-checking of vfconfigs.
It appears to be pointless.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Legacy VF NDOs look at num_vfs and then, based on that,
index into vfconfig. If we don't take rtnl_lock(), num_vfs
may get set to 0 and vfconfig freed/replaced while
the NDO is running.
We don't need to protect replacing vfconfig since it's
only done when num_vfs is 0.
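The race described above can be illustrated with a single-threaded sketch.
All names (sk_dev, sk_set_vf_vlan, sk_set_num_vfs) are hypothetical, and an
integer flag stands in for rtnl_lock(); the point is only the shape: the NDO
bounds-checks against num_vfs and indexes vfconfigs under one lock, and the
VF count changes under that same lock.

```c
#include <errno.h>
#include <stdlib.h>

struct sk_vfconfig { int vlan; };

struct sk_dev {
	int rtnl_held;			/* stand-in for rtnl_lock() state */
	unsigned int num_vfs;
	struct sk_vfconfig *vfconfigs;
};

/* Stand-in for a legacy VF NDO; it expects the caller to hold the lock. */
static int sk_set_vf_vlan(struct sk_dev *dev, unsigned int vf, int vlan)
{
	if (!dev->rtnl_held)
		return -EPERM;
	if (vf >= dev->num_vfs)		/* checked and used under one lock */
		return -EINVAL;
	dev->vfconfigs[vf].vlan = vlan;
	return 0;
}

/*
 * Change the VF count under the same lock, so a running NDO never sees
 * num_vfs and vfconfigs disagree.
 */
static int sk_set_num_vfs(struct sk_dev *dev, unsigned int n)
{
	struct sk_vfconfig *fresh = NULL;

	if (n) {
		fresh = calloc(n, sizeof(*fresh));
		if (!fresh)
			return -ENOMEM;
	}

	dev->rtnl_held = 1;
	dev->num_vfs = 0;		/* array is replaced only at 0 VFs */
	free(dev->vfconfigs);
	dev->vfconfigs = fresh;
	dev->num_vfs = n;
	dev->rtnl_held = 0;
	return 0;
}
```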
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jakub Kicinski says:
====================
improve ethtool/rtnl vs devlink locking
During ethtool netlink development we decided to move some of
the commands to devlink. Since we don't want drivers to implement
both devlink and ethtool versions of the commands, the ethtool ioctl
falls back to calling devlink. Unfortunately devlink locks must
be taken before rtnl_lock. This results in a questionable
dev_hold() / rtnl_unlock() / devlink / rtnl_lock() / dev_put()
pattern.
This method "works", but its working depends on the drivers in question
not doing much in ethtool_ops->begin / complete, and on the netdev
not having needs_free_netdev set.
Since commit 437ebfd90a25 ("devlink: Count struct devlink consumers")
we can hold a reference on a devlink instance and prevent it from
going away (sort of like netdev with dev_hold()). We can use this
to create a more natural reference nesting where we get a ref on
the devlink instance and make the devlink call entirely outside
of the rtnl_lock section.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
devlink compat code needs to drop rtnl_lock to take
devlink->lock to ensure correct lock ordering.
This is problematic because we're not strictly guaranteed
that the netdev will not disappear after we re-lock.
It may open a possibility of nested ->begin / ->complete
calls.
Instead of calling into devlink under rtnl_lock take
a ref on the devlink instance and make the call after
we've dropped rtnl_lock.
We (continue to) assume that netdevs have an implicit
reference on the devlink returned from ndo_get_devlink_port.
Note that ndo_get_devlink_port will now get called
under rtnl_lock. That should be fine since none of
the drivers seem to be taking serious locks inside
ndo_get_devlink_port.
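The new ordering (take a reference under rtnl_lock, drop rtnl_lock, then call
into devlink) can be sketched as follows. This is a single-threaded
illustration with hypothetical names (sk_devlink, sk_netdev,
sk_ethtool_compat); plain integers stand in for the locks and the reference
count, and the assert() encodes the rule that devlink's lock is never taken
while rtnl is held.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sk_devlink {
	int refcount;
	int lock_held;
	int calls;
};

struct sk_netdev {
	struct sk_devlink *devlink;	/* NULL if the netdev has none */
};

static int sk_rtnl_held;		/* stand-in for rtnl_lock() state */

/* A devlink operation: takes the devlink lock, so rtnl must not be held. */
static int sk_devlink_op(struct sk_devlink *dl)
{
	assert(!sk_rtnl_held);
	dl->lock_held = 1;
	dl->calls++;
	dl->lock_held = 0;
	return 0;
}

static int sk_ethtool_compat(struct sk_netdev *dev)
{
	struct sk_devlink *dl;
	int ret;

	sk_rtnl_held = 1;
	dl = dev->devlink;		/* lookup happens under rtnl */
	if (dl)
		dl->refcount++;		/* keep the instance alive */
	sk_rtnl_held = 0;

	if (!dl)
		return -EOPNOTSUPP;

	ret = sk_devlink_op(dl);	/* entirely outside the rtnl section */
	dl->refcount--;
	return ret;
}
```

Because the reference pins the devlink instance, the netdev does not need to
be held across the call and rtnl_lock does not need to be re-taken afterwards.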
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|