aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2022-06-20rcu-tasks: Merge state into .b.need_qs and atomically updatePaul E. McKenney2-8/+11
This commit gets rid of the task_struct structure's ->trc_reader_checked field, making it instead be a bit within the task_struct structure's existing ->trc_reader_special.b.need_qs field. This commit also atomically loads, stores, and checks the resulting combination of the reader-checked and need-quiescent state flags. This will in turn allow significant simplification of the rcu_tasks_trace_postgp() function as well as elimination of the trc_n_readers_need_end counter in later commits. These changes will in turn simplify later elimination of the RCU Tasks Trace scan of the task list, which will make RCU Tasks Trace grace periods less CPU-intensive. [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu: Provide a get_completed_synchronize_rcu() functionPaul E. McKenney1-0/+1
It is currently up to the caller to handle stale return values from get_state_synchronize_rcu(). If poll_state_synchronize_rcu() returned true once, a grace period has elapsed, regardless of the fact that counter wrap might cause some future poll_state_synchronize_rcu() invocation to return false. For example, the caller might store a separate flag that indicates whether some previous call to poll_state_synchronize_rcu() determined that the relevant grace period had already ended. This approach works, but it requires extra storage and is easy to get wrong. This commit therefore introduces a get_completed_synchronize_rcu() that returns a cookie that causes poll_state_synchronize_rcu() to always return true. This already-completed cookie can be stored in place of the cookie that previously caused poll_state_synchronize_rcu() to return true. It can also be used to flag a given structure as not having been exposed to readers, and thus not requiring a grace period to elapse. This commit is in preparation for polled expedited grace periods. Link: https://lore.kernel.org/all/[email protected]/ Link: https://docs.google.com/document/d/1RNKWW9jQyfjxw2E8dsXVTdvZYh0HnYeSHDKog9jhdN8/edit?usp=sharing Cc: Brian Foster <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Al Viro <[email protected]> Cc: Ian Kent <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-20Merge drm/drm-next into drm-misc-nextThomas Zimmermann335-2968/+7786
Backmerging to get new regmap APIs of v5.19-rc1. Signed-off-by: Thomas Zimmermann <[email protected]>
2022-06-20net: Introduce a new proto_ops ->read_skb()Cong Wang1-0/+4
Currently both splice() and sockmap use ->read_sock() to read skb from receive queue, but for sockmap we only read one entire skb at a time, so ->read_sock() is too conservative to use. Introduce a new proto_ops ->read_skb() which supports this sematic, with this we can finally pass the ownership of skb to recv actors. For non-TCP protocols, all ->read_sock() can be simply converted to ->read_skb(). Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-20wifi: ieee80211: add definitions for multi-link elementJohannes Berg1-0/+222
Add the definitions necessary to build and parse some of the multi-link element, the per-STA profile isn't fully included. Signed-off-by: Johannes Berg <[email protected]>
2022-06-20wifi: cfg80211: do some rework towards MLO link APIsJohannes Berg1-0/+3
In order to support multi-link operation with multiple links, start adding some APIs. The notable addition here is to have the link ID in a new nl80211 attribute, that will be used to differentiate the links in many nl80211 operations. So far, this patch adds the netlink NL80211_ATTR_MLO_LINK_ID attribute (as well as the NL80211_ATTR_MLO_LINKS attribute) and plugs it through the system in some places, checking the validity etc. along with other infrastructure needed for it. For now, I've decided to include only the over-the-air link ID in the API. I know we discussed that we eventually need to have to have other ways of identifying a link, but for local AP mode and auth/assoc commands as well as set_key etc. we'll use the OTA ID. Also included in this patch is some refactoring of the data structures in struct wireless_dev, splitting for the first time the data into type dependent pieces, to make reasoning about these things easier. Signed-off-by: Johannes Berg <[email protected]>
2022-06-20KVM: Rename/refactor kvm_is_reserved_pfn() to kvm_pfn_to_refcounted_page()Sean Christopherson1-1/+1
Rename and refactor kvm_is_reserved_pfn() to kvm_pfn_to_refcounted_page() to better reflect what KVM is actually checking, and to eliminate extra pfn_to_page() lookups. The kvm_release_pfn_*() an kvm_try_get_pfn() helpers in particular benefit from "refouncted" nomenclature, as it's not all that obvious why KVM needs to get/put refcounts for some PG_reserved pages (ZERO_PAGE and ZONE_DEVICE). Add a comment to call out that the list of exceptions to PG_reserved is all but guaranteed to be incomplete. The list has mostly been compiled by people throwing noodles at KVM and finding out they stick a little too well, e.g. the ZERO_PAGE's refcount overflowed and ZONE_DEVICE pages didn't get freed. No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-06-20KVM: Take a 'struct page', not a pfn in kvm_is_zone_device_page()Sean Christopherson1-1/+1
Operate on a 'struct page' instead of a pfn when checking if a page is a ZONE_DEVICE page, and rename the helper accordingly. Generally speaking, KVM doesn't actually care about ZONE_DEVICE memory, i.e. shouldn't do anything special for ZONE_DEVICE memory. Rather, KVM wants to treat ZONE_DEVICE memory like regular memory, and the need to identify ZONE_DEVICE memory only arises as an exception to PG_reserved pages. In other words, KVM should only ever check for ZONE_DEVICE memory after KVM has already verified that there is a struct page associated with the pfn. No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-06-20KVM: Remove kvm_vcpu_gfn_to_page() and kvm_vcpu_gpa_to_page()Sean Christopherson1-7/+0
Drop helpers to convert a gfn/gpa to a 'struct page' in the context of a vCPU. KVM doesn't require that guests be backed by 'struct page' memory, thus any use of helpers that assume 'struct page' is bound to be flawed, as was the case for the recently removed last user in x86's nested VMX. No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-06-20KVM: Avoid pfn_to_page() and vice versa when releasing pagesSean Christopherson1-1/+0
Invert the order of KVM's page/pfn release helpers so that the "inner" helper operates on a page instead of a pfn. As pointed out by Linus[*], converting between struct page and a pfn isn't necessarily cheap, and that's not even counting the overhead of is_error_noslot_pfn() and kvm_is_reserved_pfn(). Even if the checks were dirt cheap, there's no reason to convert from a page to a pfn and back to a page, just to mark the page dirty/accessed or to put a reference to the page. Opportunistically drop a stale declaration of kvm_set_page_accessed() from kvm_host.h (there was no implementation). No functional change intended. [*] https://lore.kernel.org/all/CAHk-=wifQimj2d6npq-wCi5onYPjzQg4vyO4tFcPJJZr268cRw@mail.gmail.com Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-06-20Merge v5.19-rc3 into usb-nextGreg Kroah-Hartman20-407/+145
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-06-20Merge tag 'v5.19-rc3' into tty-nextGreg Kroah-Hartman20-407/+145
We need the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-06-20Merge 5.19-rc3 into staging-nextGreg Kroah-Hartman20-407/+145
This resolves the merge issue with: drivers/staging/r8188eu/os_dep/ioctl_linux.c Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-06-19block: remove queue from struct blk_independent_access_rangeDamien Le Moal1-1/+0
The request queue pointer in struct blk_independent_access_range is unused. Remove it. Signed-off-by: Damien Le Moal <[email protected]> Fixes: 41e46b3c2aa2 ("block: Fix potential deadlock in blk_ia_range_sysfs_show()") Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-20ata: make ata_port::fastdrain_cnt *unsigned int*Sergey Shtylyov1-1/+1
*unsigned long* ata_port::fastdrain_cnt (64-bit value in a 64-bit kernel) is always assigned from the 32-bit *unsigned int* variables, thus could also be made just *unsigned int*... Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2022-06-19random: quiet urandom warning ratelimit suppression messageJason A. Donenfeld1-4/+8
random.c ratelimits how much it warns about uninitialized urandom reads using __ratelimit(). When the RNG is finally initialized, it prints the number of missed messages due to ratelimiting. It has been this way since that functionality was introduced back in 2018. Recently, cc1e127bfa95 ("random: remove ratelimiting for in-kernel unseeded randomness") put a bit more stress on the urandom ratelimiting, which teased out a bug in the implementation. Specifically, when under pressure, __ratelimit() will print its own message and reset the count back to 0, making the final message at the end less useful. Secondly, it does so as a pr_warn(), which apparently is undesirable for people's CI. Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly for this purpose, so we set the flag. Fixes: 4e00b339e264 ("random: rate limit unseeded randomness warnings") Cc: [email protected] Reported-by: Jon Hunter <[email protected]> Reported-by: Ron Economos <[email protected]> Tested-by: Ron Economos <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-06-19Merge tag 'objtool-urgent-2022-06-19' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull build tooling updates from Thomas Gleixner: - Remove obsolete CONFIG_X86_SMAP reference from objtool - Fix overlapping text section failures in faddr2line for real - Remove OBJECT_FILES_NON_STANDARD usage from x86 ftrace and replace it with finegrained annotations so objtool can validate that code correctly. * tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usage faddr2line: Fix overlapping text section failures, the sequel objtool: Fix obsolete reference to CONFIG_X86_SMAP
2022-06-19net: mii: add mii_bmcr_encode_fixed()Russell King (Oracle)1-0/+35
Add a function to encode a fixed speed/duplex to a BMCR value. Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-18iio: trigger: warn about non-registered iio trigger getting attemptDmitry Rokosov1-0/+5
As a part of patch series about wrong trigger register() and get() calls order in the some IIO drivers trigger initialization path: https://lore.kernel.org/all/[email protected]/ runtime WARN_ONCE() is added to alarm IIO driver authors who make such a mistake. When an IIO driver allocates a new IIO trigger, it should register it before calling the get() operation. In other words, each IIO driver must abide by IIO trigger alloc()/register()/get() calls order. Signed-off-by: Dmitry Rokosov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jonathan Cameron <[email protected]>
2022-06-18Merge branch 'immutable-qcom-spmi-rradc' into togregJonathan Cameron1-0/+3
Immutable branch used to allow changes to SPMI and MFD subsystems needed by this driver to be pulled into those trees as well if relevant.
2022-06-18spmi: add a helper to look up an SPMI device from a device nodeCaleb Connolly1-0/+3
The helper function spmi_device_from_of() takes a device node and returns the SPMI device associated with it. This is like of_find_device_by_node but for SPMI devices. Signed-off-by: Caleb Connolly <[email protected]> Acked-by: Stephen Boyd <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jonathan Cameron <[email protected]>
2022-06-17Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski5-38/+132
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-06-17 We've added 72 non-merge commits during the last 15 day(s) which contain a total of 92 files changed, 4582 insertions(+), 834 deletions(-). The main changes are: 1) Add 64 bit enum value support to BTF, from Yonghong Song. 2) Implement support for sleepable BPF uprobe programs, from Delyan Kratunov. 3) Add new BPF helpers to issue and check TCP SYN cookies without binding to a socket especially useful in synproxy scenarios, from Maxim Mikityanskiy. 4) Fix libbpf's internal USDT address translation logic for shared libraries as well as uprobe's symbol file offset calculation, from Andrii Nakryiko. 5) Extend libbpf to provide an API for textual representation of the various map/prog/attach/link types and use it in bpftool, from Daniel Müller. 6) Provide BTF line info for RV64 and RV32 JITs, and fix a put_user bug in the core seen in 32 bit when storing BPF function addresses, from Pu Lehui. 7) Fix libbpf's BTF pointer size guessing by adding a list of various aliases for 'long' types, from Douglas Raillard. 8) Fix bpftool to readd setting rlimit since probing for memcg-based accounting has been unreliable and caused a regression on COS, from Quentin Monnet. 9) Fix UAF in BPF cgroup's effective program computation triggered upon BPF link detachment, from Tadeusz Struk. 10) Fix bpftool build bootstrapping during cross compilation which was pointing to the wrong AR process, from Shahab Vahedi. 11) Fix logic bug in libbpf's is_pow_of_2 implementation, from Yuze Chi. 12) BPF hash map optimization to avoid grabbing spinlocks of all CPUs when there is no free element. Also add a benchmark as reproducer, from Feng Zhou. 13) Fix bpftool's codegen to bail out when there's no BTF, from Michael Mullin. 14) Various minor cleanup and improvements all over the place. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits) bpf: Fix bpf_skc_lookup comment wrt. return type bpf: Fix non-static bpf_func_proto struct definitions selftests/bpf: Don't force lld on non-x86 architectures selftests/bpf: Add selftests for raw syncookie helpers in TC mode bpf: Allow the new syncookie helpers to work with SKBs selftests/bpf: Add selftests for raw syncookie helpers bpf: Add helpers to issue and check SYN cookies in XDP bpf: Allow helpers to accept pointers with a fixed size bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie selftests/bpf: add tests for sleepable (uk)probes libbpf: add support for sleepable uprobe programs bpf: allow sleepable uprobe programs to attach bpf: implement sleepable uprobes by chaining gps bpf: move bpf_prog to bpf.h libbpf: Fix internal USDT address translation logic for shared libraries samples/bpf: Check detach prog exist or not in xdp_fwd selftests/bpf: Avoid skipping certain subtests selftests/bpf: Fix test_varlen verification failure with latest llvm bpftool: Do not check return value from libbpf_set_strict_mode() Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK" ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-17Merge tag 'printk-for-5.19-rc3' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fixes from Petr Mladek: "Make the global console_sem available for CPU that is handling panic() or shutdown. This is an old problem when an existing console lock owner might block console output, but it became more visible with the kthreads" * tag 'printk-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: Wait for the global console lock when the system is going down printk: Block console kthreads when direct printing will be required
2022-06-17Merge tag 'block-5.19-2022-06-16' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+2
Pull block fixes from Jens Axboe: - NVMe pull request from Christoph - Quirks, quirks, quirks to work around buggy consumer grade devices (Keith Bush, Ning Wang, Stefan Reiter, Rasheed Hsueh) - Better kernel messages for devices that need quirking (Keith Bush) - Make a kernel message more useful (Thomas Weißschuh) - MD pull request from Song, with a few fixes - blk-mq sysfs locking fixes (Ming) - BFQ stats fix (Bart) - blk-mq offline queue fix (Bart) - blk-mq flush request tag fix (Ming) * tag 'block-5.19-2022-06-16' of git://git.kernel.dk/linux-block: block/bfq: Enable I/O statistics blk-mq: don't clear flush_rq from tags->rqs[] blk-mq: avoid to touch q->elevator without any protection blk-mq: protect q->elevator by ->sysfs_lock in blk_mq_elv_switch_none block: Fix handling of offline queues in blk_mq_alloc_request_hctx() md/raid5-ppl: Fix argument order in bio_alloc_bioset() Revert "md: don't unregister sync_thread with reconfig_mutex held" nvme-pci: disable write zeros support on UMIC and Samsung SSDs nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs nvme-pci: sk hynix p31 has bogus namespace ids nvme-pci: smi has bogus namespace ids nvme-pci: phison e12 has bogus namespace ids nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50 nvme-pci: add trouble shooting steps for timeouts nvme: add bug report info for global duplicate id nvme: add device name to warning in uuid_show()
2022-06-17Merge tag 'fs_for_v5.19-rc3' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull writeback and ext2 fixes from Jan Kara: "A fix for writeback bug which prevented machines with kdevtmpfs from booting and also one small ext2 bugfix in IO error handling" * tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: init: Initialize noop_backing_dev_info early ext2: fix fs corruption when trying to remove a non-empty directory with IO error
2022-06-17Merge tag 'staging-5.19-rc3' of ↵Linus Torvalds1-344/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are some small staging driver fixes for 5.19-rc3 that resolve reported issues: - remove visorbus.h which was forgotten in the -rc1 merge where the code that used it was removed - olpc_dcon: mark as broken to allow the DRM developers to evolve the fbdev api properly without having to deal with this obsolete driver. It will be removed soon if no one steps up to adopt it and fix the issues with it. - rtl8723bs driver fix - r8188eu driver fix to resolve many reports of the driver being broken with -rc1. All of these have been in linux-next for a while with no reported issues" * tag 'staging-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: Also remove the Unisys visorbus.h staging: rtl8723bs: Allocate full pwep structure staging: olpc_dcon: mark driver as broken staging: r8188eu: Fix warning of array overflow in ioctl_linux.c staging: r8188eu: fix rtw_alloc_hwxmits error detection for now
2022-06-17Merge tag 'tty-5.19-rc3' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small tty and serial driver fixes for 5.19-rc3 to resolve some reported problems: - 8250 lsr read bugfix - n_gsm line discipline allocation fix - qcom serial driver fix for reported lockups that happened in -rc1 - goldfish tty driver fix All have been in linux-next for a while now with no reported issues" * tag 'tty-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: 8250: Store to lsr_save_flags after lsr read tty: goldfish: Fix free_irq() on remove tty: serial: qcom-geni-serial: Implement start_rx callback serial: core: Introduce callback for start_rx and do stop_rx in suspend only if this callback implementation is present. tty: n_gsm: Debug output allocation must use GFP_ATOMIC
2022-06-17Merge branch 'rework/kthreads' into for-linusPetr Mladek1-0/+5
2022-06-17bpf: Fix non-static bpf_func_proto struct definitionsJoanne Koong1-3/+0
This patch does two things: 1) Marks the dynptr bpf_func_proto structs that were added in [1] as static, as pointed out by the kernel test robot in [2]. 2) There are some bpf_func_proto structs marked as extern which can instead be statically defined. [1] https://lore.kernel.org/bpf/[email protected]/ [2] https://lore.kernel.org/bpf/62ab89f2.Pko7sI08RAKdF8R6%[email protected]/ Reported-by: kernel test robot <[email protected]> Signed-off-by: Joanne Koong <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-17soc: mediatek: mutex: add functions that operate registers by CMDQMoudy Ho1-0/+2
Due to HW limitations, MDP3 is necessary to enable MUTEX in each frame for SOF triggering and cooperate with CMDQ control to reduce the amount of interrupts generated(also, reduce frame latency). In response to the above situation, a new interface "mtk_mutex_enable_by_cmdq" has been added to achieve the purpose. Signed-off-by: Moudy Ho <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Reviewed-by: Rex-BC Chen <[email protected]> Reviewed-by: CK Hu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matthias Brugger <[email protected]>
2022-06-17soc: mediatek: mutex: add common interface for modules settingMoudy Ho1-0/+25
In order to allow multiple modules to operate MUTEX hardware through a common interfrace, two flexible indexes "mtk_mutex_mod_index" and "mtk_mutex_sof_index" need to be added to replace original component ID so that like DDP and MDP can add their own MOD table or SOF settings independently. In addition, 2 generic interface "mtk_mutex_write_mod" and "mtk_mutex_write_sof" have been added, which is expected to replace the "mtk_mutex_add_comp" and "mtk_mutex_remove_comp" pair originally dedicated to DDP in the future. Signed-off-by: Moudy Ho <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Reviewed-by: Rex-BC Chen <[email protected]> Reviewed-by: CK Hu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matthias Brugger <[email protected]>
2022-06-17block: serialize all debugfs operations using q->debugfs_mutexChristoph Hellwig1-4/+4
Various places like I/O schedulers or the QOS infrastructure try to register debugfs files on demans, which can race with creating and removing the main queue debugfs directory. Use the existing debugfs_mutex to serialize all debugfs operations that rely on q->debugfs_dir or the directories hanging off it. To make the teardown code a little simpler declare all debugfs dentry pointers and not just the main one uncoditionally in blkdev.h. Move debugfs_mutex next to the dentries that it protects and document what it is used for. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-17net: pcs: xpcs: add CL37 1000BASE-X AN supportOng Boon Leong1-0/+1
For CL37 1000BASE-X AN, DW xPCS does not support C22 method but offers C45 vendor-specific MII MMD for programming. We also add the ability to disable Autoneg (through ethtool for certain network switch that supports 1000BASE-X (1000Mbps and Full-Duplex) but not Autoneg capability. v4: Fixes to comment from Russell King. Thanks! https://patchwork.kernel.org/comment/24894239/ Make xpcs_modify_changed() as private, change to use mdiodev_modify_changed() for cleaner code. v3: Fixes to issues spotted by Russell King. Thanks! https://patchwork.kernel.org/comment/24890210/ Use phylink_mii_c22_pcs_decode_state(), remove unnecessary interrupt clearing and skip speed & duplex setting if AN is enabled. v2: Fixes to issues spotted by Russell King in v1. Thanks! https://patchwork.kernel.org/comment/24826650/ Use phylink_mii_c22_pcs_encode_advertisement() and implement C45 MII ADV handling since IP only support C45 access. Tested-by: Emilio Riva <[email protected]> Signed-off-by: Ong Boon Leong <[email protected]> Reviewed-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-17net: make xpcs_do_config to accept advertising for pcs-xpcs and sja1105Ong Boon Leong1-1/+1
xpcs_config() has 'advertising' input that is required for C37 1000BASE-X AN in later patch series. So, we prepare xpcs_do_config() for it. For sja1105, xpcs_do_config() is used for xpcs configuration without depending on advertising input, so set to NULL. Reported-by: kernel test robot <[email protected]> Signed-off-by: Ong Boon Leong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-17crypto: hisilicon/qm - modify event irq processingWeili Qian1-1/+7
When the driver receives an event interrupt, the driver will enable the event interrupt after handling all completed tasks on the function, tasks on the function are parsed through only one thread. If the task's user callback takes time, other tasks on the function will be blocked. Therefore, the event irq processing is modified as follows: 1. Obtain the ID of the queue that completes the task. 2. Enable event interrupt. 3. Parse the completed tasks in the queue and call the user callback. Enabling event interrupt in advance can quickly report pending event interrupts and process tasks in multiple threads. Signed-off-by: Weili Qian <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2022-06-17ata: make transfer mode masks *unsigned int*Sergey Shtylyov1-25/+24
The packed transfer mode masks and also the {pio|mwdma|udma}_mask fields of *struct*s ata_device and ata_port_info are declared as *unsigned long* (which is a 64-bit type on 64-bit architectures) but actually the packed masks occupy only 20 bits (7 PIO modes, 5 MWDMA modes, and 8 UDMA modes) and the PIO/MWDMA/UDMA masks easily fit into just 8 bits each, so we can safely use (always 32-bit) *unsigned int* variables instead. This saves 745 bytes of object code in libata-core.o alone, not to mention LLDDs... Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2022-06-16bpf: Allow helpers to accept pointers with a fixed sizeMaxim Mikityanskiy1-0/+13
Before this commit, the BPF verifier required ARG_PTR_TO_MEM arguments to be followed by ARG_CONST_SIZE holding the size of the memory region. The helpers had to check that size in runtime. There are cases where the size expected by a helper is a compile-time constant. Checking it in runtime is an unnecessary overhead and waste of BPF registers. This commit allows helpers to accept pointers to memory without the corresponding ARG_CONST_SIZE, given that they define the memory region size in struct bpf_func_proto and use ARG_PTR_TO_FIXED_SIZE_MEM type. arg_size is unionized with arg_btf_id to reduce the kernel image size, and it's valid because they are used by different argument types. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16linux/phy.h: add phydev_err_probe() wrapper for dev_err_probe()Rasmus Villemoes1-0/+3
The dev_err_probe() function is quite useful to avoid boilerplate related to -EPROBE_DEFER handling. Add a phydev_err_probe() helper to simplify making use of that from phy drivers which otherwise use the phydev_* helpers. Signed-off-by: Rasmus Villemoes <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski11-30/+98
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-16fs/kernel_read_file: allow to read files up-to ssize_tPasha Tatashin2-16/+17
Patch series "Allow to kexec with initramfs larger than 2G", v2. Currently, the largest initramfs that is supported by kexec_file_load() syscall is 2G. This is because kernel_read_file() returns int, and is limited to INT_MAX or 2G. On the other hand, there are kexec based boot loaders (i.e. u-root), that may need to boot netboot images that might be larger than 2G. The first patch changes the return type from int to ssize_t in kernel_read_file* functions. The second patch increases the maximum initramfs file size to 4G. Tested: verified that can kexec_file_load() works with 4G initramfs on x86_64. This patch (of 2): Currently, the maximum file size that is supported is 2G. This may be too small in some cases. For example, kexec_file_load() system call loads initramfs. In some netboot cases initramfs can be rather large. Allow to use up-to ssize_t bytes. The callers still can limit the maximum file size via buf_size. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Pasha Tatashin <[email protected]> Cc: Al Viro <[email protected]> Cc: Baoquan He <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Sasha Levin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16include/linux/rbtree.h: replace kernel.h with the necessary inclusionsAndy Shevchenko1-1/+1
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16net: set proper memcg for net_init hooks allocationsVasily Averin1-1/+46
__register_pernet_operations() executes init hook of registered pernet_operation structure in all existing net namespaces. Typically, these hooks are called by a process associated with the specified net namespace, and all __GFP_ACCOUNT marked allocation are accounted for corresponding container/memcg. However __register_pernet_operations() calls the hooks in the same context, and as a result all marked allocations are accounted to one memcg for all processed net namespaces. This patch adjusts active memcg for each net namespace and helps to account memory allocated inside ops_init() into the proper memcg. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Shakeel Butt <[email protected]> Cc: Michal Koutný <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Florian Westphal <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Linux Kernel Functional Testing <[email protected]> Cc: Muchun Song <[email protected]> Cc: Naresh Kamboju <[email protected]> Cc: Qian Cai <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm: kmem: make mem_cgroup_from_obj() vmalloc()-safeRoman Gushchin1-0/+6
Currently mem_cgroup_from_obj() is not working properly with objects allocated using vmalloc(). It creates problems in some cases, when it's called for static objects belonging to modules or generally allocated using vmalloc(). This patch makes mem_cgroup_from_obj() safe to be called on objects allocated using vmalloc(). It also introduces mem_cgroup_from_slab_obj(), which is a faster version to use in places when we know the object is either a slab object or a generic slab page (e.g. when adding an object to a lru list). Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Kefeng Wang <[email protected]> Signed-off-by: Roman Gushchin <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Acked-by: Shakeel Butt <[email protected]> Tested-by: Vasily Averin <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Muchun Song <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Naresh Kamboju <[email protected]> Cc: Qian Cai <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: David S. Miller <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Florian Westphal <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Michal Koutný <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm: kmemleak: remove kmemleak_not_leak_phys() and the min_count argument to ↵Patrick Wang1-6/+2
kmemleak_alloc_phys() Patch series "mm: kmemleak: store objects allocated with physical address separately and check when scan", v4. The kmemleak_*_phys() interface uses "min_low_pfn" and "max_low_pfn" to check address. But on some architectures, kmemleak_*_phys() is called before those two variables initialized. The following steps will be taken: 1) Add OBJECT_PHYS flag and rbtree for the objects allocated with physical address 2) Store physical address in objects if allocated with OBJECT_PHYS 3) Check the boundary when scan instead of in kmemleak_*_phys() This patch set will solve: https://lore.kernel.org/r/[email protected] https://lore.kernel.org/r/[email protected] v3: https://lore.kernel.org/r/[email protected] v2: https://lore.kernel.org/r/[email protected] v1: https://lore.kernel.org/r/[email protected] This patch (of 4): Remove the unused kmemleak_not_leak_phys() function. And remove the min_count argument to kmemleak_alloc_phys() function, assume it's 0. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Patrick Wang <[email protected]> Suggested-by: Catalin Marinas <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: Yee Lee <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm/highmem: delete memmove_page()Fabio M. De Francesco1-13/+0
Matthew Wilcox reported that, while he was looking at memmove_page(), he realized that it can't actually work. The reasons are hidden in its implementation, which makes use of memmove() on logical addresses provided by kmap_local_page(). memmove() does the wrong thing when it tests "if (dest <= src)". Therefore, delete memmove_page(). No need to change any other code because we have no call sites of memmove_page() across the whole kernel. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Fabio M. De Francesco <[email protected]> Reported-by: Matthew Wilcox <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm/damon: remove obsolete comments of kdamond_stopChengming Zhou1-9/+8
Since commit 0f91d13366a4 ("mm/damon: simplify stop mechanism") delete kdamond_stop and change to use kthread stop mechanism, these obsolete comments should be removed accordingly. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chengming Zhou <[email protected]> Reviewed-by: SeongJae Park <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm: avoid unnecessary page fault retires on shared memory typesPeter Xu1-0/+2
I observed that for each of the shared file-backed page faults, we're very likely to retry one more time for the 1st write fault upon no page. It's because we'll need to release the mmap lock for dirty rate limit purpose with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()). Then after that throttling we return VM_FAULT_RETRY. We did that probably because VM_FAULT_RETRY is the only way we can return to the fault handler at that time telling it we've released the mmap lock. However that's not ideal because it's very likely the fault does not need to be retried at all since the pgtable was well installed before the throttling, so the next continuous fault (including taking mmap read lock, walk the pgtable, etc.) could be in most cases unnecessary. It's not only slowing down page faults for shared file-backed, but also add more mmap lock contention which is in most cases not needed at all. To observe this, one could try to write to some shmem page and look at "pgfault" value in /proc/vmstat, then we should expect 2 counts for each shmem write simply because we retried, and vm event "pgfault" will capture that. To make it more efficient, add a new VM_FAULT_COMPLETED return code just to show that we've completed the whole fault and released the lock. It's also a hint that we should very possibly not need another fault immediately on this page because we've just completed it. This patch provides a ~12% perf boost on my aarch64 test VM with a simple program sequentially dirtying 400MB shmem file being mmap()ed and these are the time it needs: Before: 650.980 ms (+-1.94%) After: 569.396 ms (+-1.38%) I believe it could help more than that. We need some special care on GUP and the s390 pgfault handler (for gmap code before returning from pgfault), the rest changes in the page fault handlers should be relatively straightforward. Another thing to mention is that mm_account_fault() does take this new fault as a generic fault to be accounted, unlike VM_FAULT_RETRY. I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping them as-is. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Peter Xu <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Vineet Gupta <[email protected]> Acked-by: Guo Ren <[email protected]> Acked-by: Max Filippov <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Acked-by: Michael Ellerman <[email protected]> (powerpc) Acked-by: Catalin Marinas <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Acked-by: Russell King (Oracle) <[email protected]> [arm part] Acked-by: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Stafford Horne <[email protected]> Cc: David S. Miller <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Brian Cain <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Albert Ou <[email protected]> Cc: Anton Ivanov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: James Bottomley <[email protected]> Cc: Al Viro <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Will Deacon <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Simek <[email protected]> Cc: Matt Turner <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Stefan Kristiansson <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Rich Felker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Helge Deller <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16bpf: implement sleepable uprobes by chaining gpsDelyan Kratunov1-0/+52
uprobes work by raising a trap, setting a task flag from within the interrupt handler, and processing the actual work for the uprobe on the way back to userspace. As a result, uprobe handlers already execute in a might_fault/_sleep context. The primary obstacle to sleepable bpf uprobe programs is therefore on the bpf side. Namely, the bpf_prog_array attached to the uprobe is protected by normal rcu. In order for uprobe bpf programs to become sleepable, it has to be protected by the tasks_trace rcu flavor instead (and kfree() called after a corresponding grace period). Therefore, the free path for bpf_prog_array now chains a tasks_trace and normal grace periods one after the other. Users who iterate under tasks_trace read section would be safe, as would users who iterate under normal read sections (from non-sleepable locations). The downside is that the tasks_trace latency affects all perf_event-attached bpf programs (and not just uprobe ones). This is deemed safe given the possible attach rates for kprobe/uprobe/tp programs. Separately, non-sleepable programs need access to dynamically sized rcu-protected maps, so bpf_run_prog_array_sleepables now conditionally takes an rcu read section, in addition to the overarching tasks_trace section. Signed-off-by: Delyan Kratunov <[email protected]> Link: https://lore.kernel.org/r/ce844d62a2fd0443b08c5ab02e95bc7149f9aeb1.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16bpf: move bpf_prog to bpf.hDelyan Kratunov2-34/+36
In order to add a version of bpf_prog_run_array which accesses the bpf_prog->aux member, bpf_prog needs to be more than a forward declaration inside bpf.h. Given that filter.h already includes bpf.h, this merely reorders the type declarations for filter.h users. bpf.h users now have access to bpf_prog internals. Signed-off-by: Delyan Kratunov <[email protected]> Link: https://lore.kernel.org/r/3ed7824e3948f22d84583649ccac0ff0d38b6b58.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16mm/memory-failure: disable unpoison once hw error happenszhenwei pi1-0/+1
Currently unpoison_memory(unsigned long pfn) is designed for soft poison(hwpoison-inject) only. Since 17fae1294ad9d, the KPTE gets cleared on a x86 platform once hardware memory corrupts. Unpoisoning a hardware corrupted page puts page back buddy only, the kernel has a chance to access the page with *NOT PRESENT* KPTE. This leads BUG during accessing on the corrupted KPTE. Suggested by David&Naoya, disable unpoison mechanism when a real HW error happens to avoid BUG like this: Unpoison: Software-unpoisoned page 0x61234 BUG: unable to handle page fault for address: ffff888061234000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G M OE 5.18.0.bm.1-amd64 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ... RIP: 0010:clear_page_erms+0x7/0x10 Code: ... RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001 FS: 00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> prep_new_page+0x151/0x170 get_page_from_freelist+0xca0/0xe20 ? sysvec_apic_timer_interrupt+0xab/0xc0 ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 __alloc_pages+0x17e/0x340 __folio_alloc+0x17/0x40 vma_alloc_folio+0x84/0x280 __handle_mm_fault+0x8d4/0xeb0 handle_mm_fault+0xd5/0x2a0 do_user_addr_fault+0x1d0/0x680 ? kvm_read_and_reset_apf_flags+0x3b/0x50 exc_page_fault+0x78/0x170 asm_exc_page_fault+0x27/0x30 Link: https://lkml.kernel.org/r/[email protected] Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support") Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned") Signed-off-by: zhenwei pi <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: <[email protected]> [5.8+] Signed-off-by: Andrew Morton <[email protected]>