aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2021-12-10net: add netns refcount tracker to struct sockEric Dumazet1-0/+2
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-10net: add networking namespace refcount trackerEric Dumazet3-8/+53
We have 100+ syzbot reports about netns being dismantled too soon, still unresolved as of today. We think a missing get_net() or an extra put_net() is the root cause. In order to find the bug(s), and be able to spot future ones, this patch adds CONFIG_NET_NS_REFCNT_TRACKER and new helpers to precisely pair all put_net() with corresponding get_net(). To use these helpers, each data structure owning a refcount should also use a "netns_tracker" to pair the get and put. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-10arch: Make ARCH_STACKWALK independent of STACKTRACEPeter Zijlstra1-17/+18
Make arch_stack_walk() available for ARCH_STACKWALK architectures without it being entangled in STACKTRACE. Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Peter Zijlstra (Intel) <[email protected]> [Mark: rebase, drop unnecessary arm change] Signed-off-by: Mark Rutland <[email protected]> Cc: Albert Ou <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2021-12-10EDAC: Add RDDR5 and LRDDR5 memory typesYazen Ghannam1-0/+6
Include Registered-DDR5 and Load-Reduced DDR5 in the list of memory types. Signed-off-by: Yazen Ghannam <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-10Merge tag 'drm-misc-next-2021-12-09' of ↵Dave Airlie7-137/+168
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.17: UAPI Changes: Cross-subsystem Changes: * dma-buf: Make fences mandatory in dma_resv_add_excl_fence Core Changes: * Move hashtable to legacy code * Return error pointers from struct drm_driver.gem_create_object * cma-helper: Improve public interfaces; Remove CONFIG_DRM_KMS_CMA_HELPER option * mipi-dbi: Don't depend on CMA helpers * ttm: Don't include DRM hashtable; Stop prunning fences after wait; Documentation Driver Changes: * aspeed: Select CONFIG_DRM_GEM_CMA_HELPER * bridge/lontium-lt9611: Fix HDMI sensing * bridge/parade-ps8640: Fixes * bridge/sn65dsi86: Defer probe is no dsi host found * fsl-dcu: Select CONFIG_DRM_GEM_CMA_HELPER * i915: Remove dma_resv_prune * omapdrm: Fix scatterlist export; Support virtual planes; Fixes * panel: Boe-tv110c9m,Inx-hj110iz: Update init code * qxl: Use dma-resv iterator * rockchip: Use generic fbdev emulation * tidss: Fixes * vmwgfx: Fix leak on probe errors; Fail probing on broken hosts; New placement for MOB page tables; Hide internal BOs from userspace; Cleanups Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-09pktdvd: stop using bdi congestion framework.NeilBrown1-0/+2
The bdi congestion framework isn't widely used and should be deprecated. pktdvd makes use of it to track congestion, but this can be done entirely internally to pktdvd, so it doesn't need to use the framework. So introduce a "congested" flag. When waiting for bio_queue_size to drop, set this flag and a var_waitqueue() to wait for it. When bio_queue_size does drop and this flag is set, clear the flag and call wake_up_var(). We don't use a wait_var_event macro for the waiting as we need to set the flag and drop the spinlock before calling schedule() and while that is possible with __wait_var_event(), result is not easy to read. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: NeilBrown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-12-10Merge tag 'amd-drm-next-5.17-2021-12-02' of ↵Dave Airlie1-0/+108
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.17-2021-12-02: amdgpu: - Use generic drm fb helpers - PSR fixes - Rework DCN3.1 clkmgr - DPCD 1.3 fixes - Misc display fixes can cleanups - Clock query fixes for APUs - LTTPR fixes - DSC fixes - Misc PM fixes - RAS fixes - OLED backlight fix - SRIOV fixes - Add STB (Smart Trace Buffer) for supported dGPUs - IH rework - Enable seamless boot for DCN3.01 amdkfd: - Rework more stuff around IP discovery enumeration - Further clean up of interfaces with amdgpu - SVM fixes radeon: - Indentation fixes UAPI: - Add a new KFD header that defines some of the sysfs bitfields and enums that userspace has been using for a while The corresponding bit-fields and enums in user mode are defined in https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/master/include/hsakmttypes.h Signed-off-by: Dave Airlie <[email protected]> # Conflicts: # drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-09kcsan: Turn barrier instrumentation into macrosMarco Elver1-11/+13
Some architectures use barriers in 'extern inline' functions, from which we should not refer to static inline functions. For example, building Alpha with gcc and W=1 shows: ./include/asm-generic/barrier.h:70:30: warning: 'kcsan_rmb' is static but used in inline function 'pmd_offset' which is not static 70 | #define smp_rmb() do { kcsan_rmb(); __smp_rmb(); } while (0) | ^~~~~~~~~ ./arch/alpha/include/asm/pgtable.h:293:9: note: in expansion of macro 'smp_rmb' 293 | smp_rmb(); /* see above */ | ^~~~~~~ Which seems to warn about 6.7.4#3 of the C standard: "An inline definition of a function with external linkage shall not contain a definition of a modifiable object with static or thread storage duration, and shall not contain a reference to an identifier with internal linkage." Fix it by turning barrier instrumentation into macros, which matches definitions in <asm/barrier.h>. Perhaps we can revert this change in future, when there are no more 'extern inline' users left. Link: https://lkml.kernel.org/r/[email protected] Reported-by: kernel test robot <[email protected]> Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09kcsan: Support WEAK_MEMORY with Clang where no objtool support existsMarco Elver1-1/+12
Clang and GCC behave a little differently when it comes to the __no_sanitize_thread attribute, which has valid reasons, and depending on context either one could be right. Traditionally, user space ThreadSanitizer [1] still expects instrumented builtin atomics (to avoid false positives) and __tsan_func_{entry,exit} (to generate meaningful stack traces), even if the function has the attribute no_sanitize("thread"). [1] https://clang.llvm.org/docs/ThreadSanitizer.html#attribute-no-sanitize-thread GCC doesn't follow the same policy (for better or worse), and removes all kinds of instrumentation if no_sanitize is added. Arguably, since this may be a problem for user space ThreadSanitizer, we expect this may change in future. Since KCSAN != ThreadSanitizer, the likelihood of false positives even without barrier instrumentation everywhere, is much lower by design. At least for Clang, however, to fully remove all sanitizer instrumentation, we must add the disable_sanitizer_instrumentation attribute, which is available since Clang 14.0. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09compiler_attributes.h: Add __disable_sanitizer_instrumentationAlexander Potapenko1-0/+18
The new attribute maps to __attribute__((disable_sanitizer_instrumentation)), which will be supported by Clang >= 14.0. Future support in GCC is also possible. This attribute disables compiler instrumentation for kernel sanitizer tools, making it easier to implement noinstr. It is different from the existing __no_sanitize* attributes, which may still allow certain types of instrumentation to prevent false positives. Signed-off-by: Alexander Potapenko <[email protected]> Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09asm-generic/bitops, kcsan: Add instrumentation for barriersMarco Elver2-0/+6
Adds the required KCSAN instrumentation for barriers of atomic bitops. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09locking/atomics, kcsan: Add instrumentation for barriersMarco Elver1-1/+134
Adds the required KCSAN instrumentation for barriers of atomics. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09locking/barriers, kcsan: Support generic instrumentationMarco Elver1-0/+25
Thus far only smp_*() barriers had been defined by asm-generic/barrier.h based on __smp_*() barriers, because the !SMP case is usually generic. With the introduction of instrumentation, it also makes sense to have asm-generic/barrier.h assist in the definition of instrumented versions of mb(), rmb(), wmb(), dma_rmb(), and dma_wmb(). Because there is no requirement to distinguish the !SMP case, the definition can be simpler: we can avoid also providing fallbacks for the __ prefixed cases, and only check if `defined(__<barrier>)`, to finally define the KCSAN-instrumented versions. This also allows for the compiler to complain if an architecture accidentally defines both the normal and __ prefixed variant. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09locking/barriers, kcsan: Add instrumentation for barriersMarco Elver2-15/+16
Adds the required KCSAN instrumentation for barriers if CONFIG_SMP. KCSAN supports modeling the effects of: smp_mb() smp_rmb() smp_wmb() smp_store_release() Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09kcsan: Add core memory barrier instrumentation functionsMarco Elver1-2/+69
Add the core memory barrier instrumentation functions. These invalidate the current in-flight reordered access based on the rules for the respective barrier types and in-flight access type. To obtain barrier instrumentation that can be disabled via __no_kcsan with appropriate compiler-support (and not just with objtool help), barrier instrumentation repurposes __atomic_signal_fence(), instead of inserting explicit calls. Crucially, __atomic_signal_fence() normally does not map to any real instructions, but is still intercepted by fsanitize=thread. As a result, like any other instrumentation done by the compiler, barrier instrumentation can be disabled with __no_kcsan. Unfortunately Clang and GCC currently differ in their __no_kcsan aka __no_sanitize_thread behaviour with respect to builtin atomics (and __tsan_func_{entry,exit}) instrumentation. This is already reflected in Kconfig.kcsan's dependencies for KCSAN_WEAK_MEMORY. A later change will introduce support for newer versions of Clang that can implement __no_kcsan to also remove the additional instrumentation introduced by KCSAN_WEAK_MEMORY. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09kcsan: Add core support for a subset of weak memory modelingMarco Elver3-2/+21
Add support for modeling a subset of weak memory, which will enable detection of a subset of data races due to missing memory barriers. KCSAN's approach to detecting missing memory barriers is based on modeling access reordering, and enabled if `CONFIG_KCSAN_WEAK_MEMORY=y`, which depends on `CONFIG_KCSAN_STRICT=y`. The feature can be enabled or disabled at boot and runtime via the `kcsan.weak_memory` boot parameter. Each memory access for which a watchpoint is set up, is also selected for simulated reordering within the scope of its function (at most 1 in-flight access). We are limited to modeling the effects of "buffering" (delaying the access), since the runtime cannot "prefetch" accesses (therefore no acquire modeling). Once an access has been selected for reordering, it is checked along every other access until the end of the function scope. If an appropriate memory barrier is encountered, the access will no longer be considered for reordering. When the result of a memory operation should be ordered by a barrier, KCSAN can then detect data races where the conflict only occurs as a result of a missing barrier due to reordering accesses. Suggested-by: Dmitry Vyukov <[email protected]> Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09kcsan: Avoid checking scoped accesses from nested contextsMarco Elver1-0/+1
Avoid checking scoped accesses from nested contexts (such as nested interrupts or in scheduler code) which share the same kcsan_ctx. This is to avoid detecting false positive races of accesses in the same thread with currently scoped accesses: consider setting up a watchpoint for a non-scoped (normal) access that also "conflicts" with a current scoped access. In a nested interrupt (or in the scheduler), which shares the same kcsan_ctx, we cannot check scoped accesses set up in the parent context -- simply ignore them in this case. With the introduction of kcsan_ctx::disable_scoped, we can also clean up kcsan_check_scoped_accesses()'s recursion guard, and do not need to modify the list's prev pointer. Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski13-40/+61
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-09percpu_ref: Replace kernel.h with the necessary inclusionsAndy Shevchenko1-1/+1
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Dennis Zhou <[email protected]>
2021-12-09skbuff: Extract list pointers to silence compiler warningsKees Cook1-8/+10
Under both -Warray-bounds and the object_size sanitizer, the compiler is upset about accessing prev/next of sk_buff when the object it thinks it is coming from is sk_buff_head. The warning is a false positive due to the compiler taking a conservative approach, opting to warn at casting time rather than access time. However, in support of enabling -Warray-bounds globally (which has found many real bugs), arrange things for sk_buff so that the compiler can unambiguously see that there is no intention to access anything except prev/next. Introduce and cast to a separate struct sk_buff_list, which contains _only_ the first two fields, silencing the warnings: In file included from ./include/net/net_namespace.h:39, from ./include/linux/netdevice.h:37, from net/core/netpoll.c:17: net/core/netpoll.c: In function 'refill_skbs': ./include/linux/skbuff.h:2086:9: warning: array subscript 'struct sk_buff[0]' is partly outside array bounds of 'struct sk_buff_head[1]' [-Warray-bounds] 2086 | __skb_insert(newsk, next->prev, next, list); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/netpoll.c:49:28: note: while referencing 'skb_pool' 49 | static struct sk_buff_head skb_pool; | ^~~~~~~~ This change results in no executable instruction differences. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-09Merge branches 'doc.2021.11.30c', 'exp.2021.12.07a', 'fastnohz.2021.11.30c', ↵Paul E. McKenney5-45/+70
'fixes.2021.11.30c', 'nocb.2021.12.09a', 'nolibc.2021.11.30c', 'tasks.2021.12.09a', 'torture.2021.12.07a' and 'torturescript.2021.11.30c' into HEAD doc.2021.11.30c: Documentation updates. exp.2021.12.07a: Expedited-grace-period fixes. fastnohz.2021.11.30c: Remove CONFIG_RCU_FAST_NO_HZ. fixes.2021.11.30c: Miscellaneous fixes. nocb.2021.12.09a: No-CB CPU updates. nolibc.2021.11.30c: Tiny in-kernel library updates. tasks.2021.12.09a: RCU-tasks updates, including update-side scalability. torture.2021.12.07a: Torture-test in-kernel module updates. torturescript.2021.11.30c: Torture-test scripting updates.
2021-12-09Merge tag 'net-5.16-rc5' of ↵Linus Torvalds9-32/+38
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf, sockmap: re-evaluate proto ops when psock is removed from sockmap Current release - new code bugs: - bpf: fix bpf_check_mod_kfunc_call for built-in modules - ice: fixes for TC classifier offloads - vrf: don't run conntrack on vrf with !dflt qdisc Previous releases - regressions: - bpf: fix the off-by-two error in range markings - seg6: fix the iif in the IPv6 socket control block - devlink: fix netns refcount leak in devlink_nl_cmd_reload() - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports Previous releases - always broken: - ethtool: do not perform operations on net devices being unregistered - udp: use datalen to cap max gso segments - ice: fix races in stats collection - fec: only clear interrupt of handling queue in fec_enet_rx_queue() - m_can: pci: fix incorrect reference clock rate - m_can: disable and ignore ELO interrupt - mvpp2: fix XDP rx queues registering Misc: - treewide: add missing includes masked by cgroup -> bpf.h dependency" * tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports net: wwan: iosm: fixes unable to send AT command during mbim tx net: wwan: iosm: fixes net interface nonfunctional after fw flash net: wwan: iosm: fixes unnecessary doorbell send net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering MAINTAINERS: s390/net: remove myself as maintainer net/sched: fq_pie: prevent dismantle issue net: mana: Fix memory leak in mana_hwc_create_wq seg6: fix the iif in the IPv6 socket control block nfp: Fix memory leak in nfp_cpp_area_cache_add() nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done nfc: fix segfault in nfc_genl_dump_devices_done udp: using datalen to cap max gso segments net: dsa: mv88e6xxx: error handling for serdes_power functions can: kvaser_usb: get CAN clock frequency from device can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter net: mvpp2: fix XDP rx queues registering vmxnet3: fix minimum vectors alloc issue net, neigh: clear whole pneigh_entry at alloc time net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" ...
2021-12-09net: phylink: use legacy_pre_march2020Russell King (Oracle)1-0/+17
Use the legacy flag to indicate whether we should operate in legacy mode. This allows us to stop using the presence of a PCS as an indicator to the age of the phylink user, and make PCS presence optional. Legacy mode involves: 1) calling mac_config() whenever the link comes up 2) calling mac_config() whenever the inband advertisement changes, possibly followed by a call to mac_an_restart() 3) making use of mac_an_restart() 4) making use of mac_pcs_get_state() All the above functionality was moved to a seperate "PCS" block of operations in March 2020. Update the documents to indicate that the differences that this flag makes. Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-09net: phylink: add legacy_pre_march2020 indicatorRussell King (Oracle)1-0/+3
Add a boolean to phylink_config to indicate whether a driver has not been updated for the changes in commit 7cceb599d15d ("net: phylink: avoid mac_config calls"), and thus are reliant on the old behaviour. We were currently keying the phylink behaviour on the presence of a PCS, but this is sub-optimal for modern drivers that may not have a PCS. This commit merely introduces the new flag, but does not add any use, since we need all legacy drivers to set this flag before it can be used. Once these legacy drivers have been updated, we can remove this flag. Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-09fs_parse: allow parameter value to be emptyLukas Czerner1-1/+1
Allow parameter value to be empty by specifying fs_param_can_be_empty flag. Signed-off-by: Lukas Czerner <[email protected]> Cc: Al Viro <[email protected]> Reviewed-by: Carlos Maiolino <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-12-09Merge branch 'for-linus' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fixes for various drivers which assume that a HID device is on USB transport, but that might not necessarily be the case, as the device can be faked by uhid. (Greg, Benjamin Tissoires) - fix for spurious wakeups on certain Lenovo notebooks (Thomas Weißschuh) - a few other device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for Elan touchscreen on Asus UX550VE HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested HID: google: add eel USB id HID: add USB_HID dependancy to hid-prodikeys HID: add USB_HID dependancy to hid-chicony HID: bigbenff: prevent null pointer dereference HID: sony: fix error path in probe HID: add USB_HID dependancy on some USB HID drivers HID: check for valid USB device for many HID drivers HID: wacom: fix problems when device is not a valid USB device HID: add hid_is_usb() function to make it simpler for USB detection HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
2021-12-09aio: fix use-after-free due to missing POLLFREE handlingEric Biggers1-1/+1
signalfd_poll() and binder_poll() are special in that they use a waitqueue whose lifetime is the current task, rather than the struct file as is normally the case. This is okay for blocking polls, since a blocking poll occurs within one task; however, non-blocking polls require another solution. This solution is for the queue to be cleared before it is freed, by sending a POLLFREE notification to all waiters. Unfortunately, only eventpoll handles POLLFREE. A second type of non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't handle POLLFREE. This allows a use-after-free to occur if a signalfd or binder fd is polled with aio poll, and the waitqueue gets freed. Fix this by making aio poll handle POLLFREE. A patch by Ramji Jiyani <[email protected]> (https://lore.kernel.org/r/[email protected]) tried to do this by making aio_poll_wake() always complete the request inline if POLLFREE is seen. However, that solution had two bugs. First, it introduced a deadlock, as it unconditionally locked the aio context while holding the waitqueue lock, which inverts the normal locking order. Second, it didn't consider that POLLFREE notifications are missed while the request has been temporarily de-queued. The second problem was solved by my previous patch. This patch then properly fixes the use-after-free by handling POLLFREE in a deadlock-free way. It does this by taking advantage of the fact that freeing of the waitqueue is RCU-delayed, similar to what eventpoll does. Fixes: 2c14fa838cbe ("aio: implement IOCB_CMD_POLL") Cc: <[email protected]> # v4.18+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Eric Biggers <[email protected]>
2021-12-09wait: add wake_up_pollfree()Eric Biggers1-0/+26
Several ->poll() implementations are special in that they use a waitqueue whose lifetime is the current task, rather than the struct file as is normally the case. This is okay for blocking polls, since a blocking poll occurs within one task; however, non-blocking polls require another solution. This solution is for the queue to be cleared before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'. However, that has a bug: wake_up_poll() calls __wake_up() with nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters, and the wakeup function for the first one returns a positive value, only that one will be called. That's *not* what's needed for POLLFREE; POLLFREE is special in that it really needs to wake up everyone. Considering the three non-blocking poll systems: - io_uring poll doesn't handle POLLFREE at all, so it is broken anyway. - aio poll is unaffected, since it doesn't support exclusive waits. However, that's fragile, as someone could add this feature later. - epoll doesn't appear to be broken by this, since its wakeup function returns 0 when it sees POLLFREE. But this is fragile. Although there is a workaround (see epoll), it's better to define a function which always sends POLLFREE to all waiters. Add such a function. Also make it verify that the queue really becomes empty after all waiters have been woken up. Reported-by: Linus Torvalds <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Eric Biggers <[email protected]>
2021-12-09drm/vmwgfx: Allow checking for gl43 contextsZack Rusin1-0/+1
To make sure we're running on top of hardware that can support GL4.3 we need to add a way of querying for those capabilities. DRM_VMW_PARAM_GL43 allows userspace to check for presence of GL4.3 capable contexts. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-09KVM: Introduce CONFIG_HAVE_KVM_DIRTY_RINGDavid Woodhouse1-4/+4
I'd like to make the build include dirty_ring.c based on whether the arch wants it or not. That's a whole lot simpler if there's a config symbol instead of doing it implicitly on KVM_DIRTY_LOG_PAGE_OFFSET being set to something non-zero. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-12-09bus: mhi: core: Add support for forced PM resumeLoic Poulain1-0/+13
For whatever reason, some devices like QCA6390, WCN6855 using ath11k are not in M3 state during PM resume, but still functional. The mhi_pm_resume should then not fail in those cases, and let the higher level device specific stack continue resuming process. Add an API mhi_pm_resume_force(), to force resuming irrespective of the current MHI state. This fixes a regression with non functional ath11k WiFi after suspend/resume cycle on some machines. Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=214179 Link: https://lore.kernel.org/regressions/[email protected]/ Fixes: 020d3b26c07a ("bus: mhi: Early MHI resume failure in non M3 state") Cc: [email protected] #5.13 Reported-by: Kalle Valo <[email protected]> Reported-by: Pengyu Ma <[email protected]> Tested-by: Kalle Valo <[email protected]> Acked-by: Kalle Valo <[email protected]> Signed-off-by: Loic Poulain <[email protected]> [mani: Switched to API, added bug report, reported-by tags and CCed stable] Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-12-09mtd: Introduce an expert mode for forensics and debugging purposesMiquel Raynal1-0/+3
When developping NAND controller drivers or when debugging filesystem corruptions, it is quite common to need hacking locally into the MTD/NAND core in order to get access to the content of the bad blocks. Instead of having multiple implementations out there let's provide a simple yet effective specific MTD-wide debugfs entry to fully disable these checks on purpose. A warning is added to inform the user when this mode gets enabled. Signed-off-by: Miquel Raynal <[email protected]> Link: https://lore.kernel.org/linux-mtd/[email protected]
2021-12-09x86/sgx: Add an attribute for the amount of SGX memory in a NUMA nodeJarkko Sakkinen1-0/+4
== Problem == The amount of SGX memory on a system is determined by the BIOS and it varies wildly between systems. It can be as small as dozens of MB's and as large as many GB's on servers. Just like how applications need to know how much regular RAM is available, enclave builders need to know how much SGX memory an enclave can consume. == Solution == Introduce a new sysfs file: /sys/devices/system/node/nodeX/x86/sgx_total_bytes to enumerate the amount of SGX memory available in each NUMA node. This serves the same function for SGX as /proc/meminfo or /sys/devices/system/node/nodeX/meminfo does for normal RAM. 'sgx_total_bytes' is needed today to help drive the SGX selftests. SGX-specific swap code is exercised by creating overcommitted enclaves which are larger than the physical SGX memory on the system. They currently use a CPUID-based approach which can diverge from the actual amount of SGX memory available. 'sgx_total_bytes' ensures that the selftests can work efficiently and do not attempt stupid things like creating a 100,000 MB enclave on a system with 128 MB of SGX memory. == Implementation Details == Introduce CONFIG_HAVE_ARCH_NODE_DEV_GROUP opt-in flag to expose an arch specific attribute group, and add an attribute for the amount of SGX memory in bytes to each NUMA node: == ABI Design Discussion == As opposed to the per-node ABI, a single, global ABI was considered. However, this would prevent enclaves from being able to size themselves so that they fit on a single NUMA node. Essentially, a single value would rule out NUMA optimizations for enclaves. Create a new "x86/" directory inside each "nodeX/" sysfs directory. 'sgx_total_bytes' is expected to be the first of at least a few sgx-specific files to be placed in the new directory. Just scanning /proc/meminfo, these are the no-brainers that we have for RAM, but we need for SGX: MemTotal: xxxx kB // sgx_total_bytes (implemented here) MemFree: yyyy kB // sgx_free_bytes SwapTotal: zzzz kB // sgx_swapped_bytes So, at *least* three. I think we will eventually end up needing something more along the lines of a dozen. A new directory (as opposed to being in the nodeX/ "root") directory avoids cluttering the root with several "sgx_*" files. Place the new file in a new "nodeX/x86/" directory because SGX is highly x86-specific. It is very unlikely that any other architecture (or even non-Intel x86 vendor) will ever implement SGX. Using "sgx/" as opposed to "x86/" was also considered. But, there is a real chance this can get used for other arch-specific purposes. [ dhansen: rewrite changelog ] Signed-off-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Acked-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-12-09Merge drm/drm-next into drm-intel-nextJani Nikula19-91/+496
Get the dependencies for merging drm-privacy-screen support. Signed-off-by: Jani Nikula <[email protected]>
2021-12-09drm: Replace kernel.h with the necessary inclusionsAndy Shevchenko3-3/+5
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-09genirq/msi: Handle PCI/MSI allocation fail in core codeThomas Gleixner1-4/+1
Get rid of yet another irqdomain callback and let the core code return the already available information of how many descriptors could be allocated. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> # PCI Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Make pci_msi_domain_check_cap() staticThomas Gleixner1-2/+0
No users outside of that file. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Move msi_lock to struct pci_devThomas Gleixner2-2/+1
It's only required for PCI/MSI. So no point in having it in every struct device. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Sanitize MSI-X table map handlingThomas Gleixner1-0/+1
Unmapping the MSI-X base mapping in the loops which allocate/free MSI descriptors is daft and in the way of allowing runtime expansion of MSI-X descriptors. Store the mapping in struct pci_dev and free it after freeing the MSI-X descriptors. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Split out irqdomain codeThomas Gleixner1-11/+0
Move the irqdomain specific code into its own file. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Make arch_restore_msi_irqs() less horrible.Thomas Gleixner1-4/+3
Make arch_restore_msi_irqs() return a boolean which indicates whether the core code should restore the MSI message or not. Get rid of the indirection in x86. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> # PCI Link: https://lore.kernel.org/r/[email protected]
2021-12-09genirq/msi, treewide: Use a named struct for PCI/MSI attributesThomas Gleixner1-43/+41
The unnamed struct sucks and is in the way of further cleanups. Stick the PCI related MSI data into a real data structure and cleanup all users. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Acked-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Remove msi_desc_to_pci_sysdata()Thomas Gleixner1-5/+0
Last user is gone long ago. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09PCI/MSI: Make pci_msi_domain_write_msg() staticThomas Gleixner1-1/+0
There is no point to have this function public as it is set by the PCI core anyway when a PCI/MSI irqdomain is created. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> # PCI Link: https://lore.kernel.org/r/[email protected]
2021-12-09genirq/msi: Fixup includesThomas Gleixner1-1/+1
Remove the kobject.h include from msi.h as it's not required and add a sysfs.h include to the core code instead. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09genirq/msi: Remove unused domain callbacksThomas Gleixner1-7/+4
No users and there is no need to grow them. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected]
2021-12-09genirq/msi: Guard sysfs codeThomas Gleixner1-0/+10
No point in building unused code when CONFIG_SYSFS=n. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Juergen Gross <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-09Merge tag 'drm-misc-next-2021-11-29' of ↵Daniel Vetter2-6/+1
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.17: UAPI Changes: Cross-subsystem Changes: * Move 'nomodeset' kernel boot option into DRM subsystem Core Changes: * Replace several DRM_*() logging macros with drm_*() equivalents * panel: Add quirk for Lenovo Yoga Book X91F/L * ttm: Documentation fixes Driver Changes: * Cleanup nomodeset handling in drivers * Fixes * bridge/anx7625: Fix reading EDID; Fix error code * bridge/megachips: Probe both bridges before registering * vboxvideo: Fix ERR_PTR usage Signed-off-by: Daniel Vetter <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-09erofs: clean up erofs_map_blocks tracepointsGao Xiang1-2/+2
Since the new type of chunk-based files is introduced, there is no need to leave flatmode tracepoints. Rename to erofs_map_blocks instead. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Yue Hu <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2021-12-08net: wwan: make debugfs optionalSergey Ryazanov1-0/+7
Debugfs interface is optional for the regular modem use. Some distros and users will want to disable this feature for security or kernel size reasons. So add a configuration option that allows to completely disable the debugfs interface of the WWAN devices. A primary considered use case for this option was embedded firmwares. For example, in OpenWrt, you can not completely disable debugfs, as a lot of wireless stuff can only be configured and monitored with the debugfs knobs. At the same time, reducing the size of a kernel and modules is an essential task in the world of embedded software. Disabling the WWAN and IOSM debugfs interfaces allows us to save 50K (x86-64 build) of space for module storage. Not much, but already considerable when you only have 16MB of storage. So it is hard to just disable whole debugfs. Users need some fine grained set of options to control which debugfs interface is important and should be available and which is not. The new configuration symbol is enabled by default and is hidden under the EXPERT option. So a regular user would not be bothered by another one configuration question. While an embedded distro maintainer will be able to a little more reduce the final image size. Signed-off-by: Sergey Ryazanov <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Reviewed-by: Loic Poulain <[email protected]> Acked-by: M Chetan Kumar <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>