aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-05-02perf symbols: Fix return incorrect build_id size in elf_read_build_id()Yang Jihong1-1/+1
In elf_read_build_id(), if gnu build_id is found, should return the size of the actually copied data. If descsz is greater thanBuild_ID_SIZE, write_buildid data access may occur. Fixes: be96ea8ffa788dcc ("perf symbols: Fix issue with binaries using 16-bytes buildids (v2)") Reported-by: Will Ochowicz <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Tested-by: Will Ochowicz <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/lkml/CWLP265MB49702F7BA3D6D8F13E4B1A719C649@CWLP265MB4970.GBRP265.PROD.OUTLOOK.COM/T/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-02perf list: Modify the warning message about scandirat(3)Namhyung Kim1-1/+1
It should mention scandirat() instead of scandir(). Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-02perf list: Fix memory leaks in print_tracepoint_events()Namhyung Kim1-4/+8
It should free entries (not only the array) filled by scandirat() after use. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-02perf lock contention: Rework offset calculation with BPF CO-RENamhyung Kim1-7/+7
It seems BPF CO-RE reloc doesn't work well with the pattern that gets the field-offset only. Use offsetof() to make it explicit so that the compiler would generate the correct code. Fixes: 0c1228486befa3d6 ("perf lock contention: Support pre-5.14 kernels") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Co-developed-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-02perf lock contention: Fix struct rq lock accessNamhyung Kim1-4/+4
The BPF CO-RE's ignore suffix rule requires three underscores. Otherwise it'd fail like below: $ sudo perf lock contention -ab libbpf: prog 'collect_lock_syms': BPF program load failed: Invalid argument libbpf: prog 'collect_lock_syms': -- BEGIN PROG LOAD LOG -- reg type unsupported for arg#0 function collect_lock_syms#380 ; int BPF_PROG(collect_lock_syms) 0: (b7) r6 = 0 ; R6_w=0 1: (b7) r7 = 0 ; R7_w=0 2: (b7) r9 = 1 ; R9_w=1 3: <invalid CO-RE relocation> failed to resolve CO-RE relocation <byte_off> [381] struct rq__new.__lock (0:0 @ offset 0) Fixes: 0c1228486befa3d6 ("perf lock contention: Support pre-5.14 kernels") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-02crypto: api - Fix CRYPTO_USER checks for report functionOndrej Mosnacek9-9/+9
Checking the config via ifdef incorrectly compiles out the report functions when CRYPTO_USER is set to =m. Fix it by using IS_ENABLED() instead. Fixes: c0f9e01dd266 ("crypto: api - Check CRYPTO_USER instead of NET for report") Signed-off-by: Ondrej Mosnacek <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-05-02debugobject: Ensure pool refill (again)Thomas Gleixner1-6/+15
The recent fix to ensure atomicity of lookup and allocation inadvertently broke the pool refill mechanism. Prior to that change debug_objects_activate() and debug_objecs_assert_init() invoked debug_objecs_init() to set up the tracking object for statically initialized objects. That's not longer the case and debug_objecs_init() is now the only place which does pool refills. Depending on the number of statically initialized objects this can be enough to actually deplete the pool, which was observed by Ido via a debugobjects OOM warning. Restore the old behaviour by adding explicit refill opportunities to debug_objects_activate() and debug_objecs_assert_init(). Fixes: 63a759694eed ("debugobject: Prevent init race with static objects") Reported-by: Ido Schimmel <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Ido Schimmel <[email protected]> Link: https://lore.kernel.org/r/871qk05a9d.ffs@tglx
2023-05-01RISC-V: include cpufeature.h in cpufeature.cConor Dooley1-0/+1
Automation complains: warning: symbol '__pcpu_scope_misaligned_access_speed' was not declared. Should it be static? cpufeature.c doesn't actually include the header of the same name, as it had not previously used anything from it. The per-cpu variable is declared there, so include it to silence the complaints. Fixes: 62a31d6e38bd ("RISC-V: hwprobe: Support probing of misaligned access performance") Signed-off-by: Conor Dooley <[email protected]> Reviewed-by: Evan Green <[email protected]> Link: https://lore.kernel.org/r/20230420-wound-gizzard-2b2b589d9bea@spud Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2023-05-01Merge tag 'input-for-v6.4-rc0' of ↵Linus Torvalds37-324/+1067
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a new driver for Novatek touch controllers - a new driver for power button for NXP BBNSM - a skeleton KUnit tests for the input core - improvements to Xpad game controller driver to support more devices - improvements to edt-ft5x06, hideep and other drivers * tag 'input-for-v6.4-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (42 commits) Revert "Input: xpad - fix support for some third-party controllers" dt-bindings: input: pwm-beeper: convert to dt schema Input: xpad - fix PowerA EnWired Controller guide button Input: xpad - add constants for GIP interface numbers Input: synaptics-rmi4 - fix function name in kerneldoc Input: raspberrypi-ts - fix refcount leak in rpi_ts_probe Input: edt-ft5x06 - select REGMAP_I2C Input: melfas_mip4 - report palm touches Input: cma3000_d0x - remove unneeded code Input: edt-ft5x06 - calculate points data length only once Input: edt-ft5x06 - unify the crc check Input: edt-ft5x06 - convert to use regmap API Input: edt-ft5x06 - don't print error messages with dev_dbg() Input: edt-ft5x06 - remove code duplication Input: edt-ft5x06 - don't recalculate the CRC Input: edt-ft5x06 - add spaces to ensure format specification Input: edt-ft5x06 - remove unnecessary blank lines Input: edt-ft5x06 - fix indentation Input: tsc2007 - enable cansleep pendown GPIO Input: Add KUnit tests for some of the input core helper functions ...
2023-05-01riscv: Move .rela.dyn to the init sectionsAlexandre Ghiti1-6/+6
The recent introduction of relocatable kernels prepared the move of .rela.dyn to the init section, but actually forgot to do so, so do it here. Before this patch: "Freeing unused kernel image (initmem) memory: 2592K" After this patch: "Freeing unused kernel image (initmem) memory: 6288K" The difference corresponds to the size of the .rela.dyn section: "[42] .rela.dyn RELA ffffffff8197e798 0127f798 000000000039c660 0000000000000018 A 47 0 8" Fixes: 559d1e45a16d ("riscv: Use --emit-relocs in order to move .rela.dyn in init") Signed-off-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2023-05-01dt-bindings: riscv: explicitly mention assumption of Zicsr & Zifencei supportConor Dooley1-0/+6
The dt-binding was defined before the extraction of csr access and fence.i into their own extensions, and thus the presence of the I base extension implies Zicsr and Zifencei. There's no harm in adding them obviously, but for backwards compatibility with DTs that existed prior to that extraction, software is unable to differentiate between "i" and "i_zicsr_zifencei" without any further information. Signed-off-by: Conor Dooley <[email protected]> Acked-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/20230427-fence-blurred-c92fb69d4137@wendy Signed-off-by: Palmer Dabbelt <[email protected]>
2023-05-01riscv: compat_syscall_table: Fixup compile warningGuo Ren1-0/+1
../arch/riscv/kernel/compat_syscall_table.c:12:41: warning: initialized field overwritten [-Woverride-init] 12 | #define __SYSCALL(nr, call) [nr] = (call), | ^ ../include/uapi/asm-generic/unistd.h:567:1: note: in expansion of macro '__SYSCALL' 567 | __SYSCALL(__NR_semget, sys_semget) Fixes: 59c10c52f573 ("riscv: compat: syscall: Add compat_sys_call_table implementation") Reviewed-by: Conor Dooley <[email protected]> Reported-by: kernel test robot <[email protected]> Tested-by: Jisheng Zhang <[email protected]> Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Drew Fustini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2023-05-01Merge branch 'next' into for-linusDmitry Torokhov13677-495415/+634653
Prepare input updates for 6.4 merge window.
2023-05-01Revert "Input: xpad - fix support for some third-party controllers"Dmitry Torokhov1-23/+0
This reverts commit db7220c48d8d71476f881a7ae1285e1df4105409 because it causes crashes when trying to dereference xpad->dev->dev in xpad_probe() which has not been set up yet. Reported-by: [email protected] Reported-by: Dongliang Mu <[email protected]> Link: https://groups.google.com/g/syzkaller-bugs/c/iMhTgpGuIbM Signed-off-by: Dmitry Torokhov <[email protected]>
2023-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds111-1854/+4008
Pull kvm updates from Paolo Bonzini: "s390: - More phys_to_virt conversions - Improvement of AP management for VSIE (nested virtualization) ARM64: - Numerous fixes for the pathological lock inversion issue that plagued KVM/arm64 since... forever. - New framework allowing SMCCC-compliant hypercalls to be forwarded to userspace, hopefully paving the way for some more features being moved to VMMs rather than be implemented in the kernel. - Large rework of the timer code to allow a VM-wide offset to be applied to both virtual and physical counters as well as a per-timer, per-vcpu offset that complements the global one. This last part allows the NV timer code to be implemented on top. - A small set of fixes to make sure that we don't change anything affecting the EL1&0 translation regime just after having having taken an exception to EL2 until we have executed a DSB. This ensures that speculative walks started in EL1&0 have completed. - The usual selftest fixes and improvements. x86: - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled, and by giving the guest control of CR0.WP when EPT is enabled on VMX (VMX-only because SVM doesn't support per-bit controls) - Add CR0/CR4 helpers to query single bits, and clean up related code where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return as a bool - Move AMD_PSFD to cpufeatures.h and purge KVM's definition - Avoid unnecessary writes+flushes when the guest is only adding new PTEs - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations when emulating invalidations - Clean up the range-based flushing APIs - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle changed SPTE" overhead associated with writing the entire entry - Track the number of "tail" entries in a pte_list_desc to avoid having to walk (potentially) all descriptors during insertion and deletion, which gets quite expensive if the guest is spamming fork() - Disallow virtualizing legacy LBRs if architectural LBRs are available, the two are mutually exclusive in hardware - Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features - Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES - Apply PMU filters to emulated events and add test coverage to the pmu_event_filter selftest - AMD SVM: - Add support for virtual NMIs - Fixes for edge cases related to virtual interrupts - Intel AMX: - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is not being reported due to userspace not opting in via prctl() - Fix a bug in emulation of ENCLS in compatibility mode - Allow emulation of NOP and PAUSE for L2 - AMX selftests improvements - Misc cleanups MIPS: - Constify MIPS's internal callbacks (a leftover from the hardware enabling rework that landed in 6.3) Generic: - Drop unnecessary casts from "void *" throughout kvm_main.c - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct size by 8 bytes on 64-bit kernels by utilizing a padding hole Documentation: - Fix goof introduced by the conversion to rST" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits) KVM: s390: pci: fix virtual-physical confusion on module unload/load KVM: s390: vsie: clarifications on setting the APCB KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init() KVM: selftests: Test the PMU event "Instructions retired" KVM: selftests: Copy full counter values from guest in PMU event filter test KVM: selftests: Use error codes to signal errors in PMU event filter test KVM: selftests: Print detailed info in PMU event filter asserts KVM: selftests: Add helpers for PMC asserts in PMU event filter test KVM: selftests: Add a common helper for the PMU event filter guest code KVM: selftests: Fix spelling mistake "perrmited" -> "permitted" KVM: arm64: vhe: Drop extra isb() on guest exit KVM: arm64: vhe: Synchronise with page table walker on MMU update KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc() KVM: arm64: nvhe: Synchronise with page table walker on TLBI KVM: arm64: Handle 32bit CNTPCTSS traps KVM: arm64: nvhe: Synchronise with page table walker on vcpu run KVM: arm64: vgic: Don't acquire its_lock before config_lock KVM: selftests: Add test to verify KVM's supported XCR0 ...
2023-05-01Merge tag 'for-linus' of https://github.com/openrisc/linuxLinus Torvalds10-31/+101
Pull OpenRISC updates from Stafford Horne: "Two things for OpenRISC this cycle: - Small cleanup for device tree cpu iteration from Rob Herring - Add support for storing, restoring and accessing user space FPU state, to allow for libc to support the FPU on OpenRISC" * tag 'for-linus' of https://github.com/openrisc/linux: openrisc: Add floating point regset openrisc: Support floating point user api openrisc: Support storing and restoring fpu state openrisc: Properly store r31 to pt_regs on unhandled exceptions openrisc: Use common of_get_cpu_node() instead of open-coding
2023-05-01Merge tag 'rtc-6.4' of ↵Linus Torvalds65-197/+122
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Not much this cycle, there is the conversion to remove_new and many small fixes in drivers: Subsystem: - Convert to platform remove callback returning void Drivers: - meson-vrtc: fix a firmware display issue" * tag 'rtc-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (53 commits) rtc: armada38x: use devm_platform_ioremap_resource_byname() rtc: sunplus: use devm_platform_ioremap_resource_byname() rtc: jz4740: Make sure clock provider gets removed rtc: k3: handle errors while enabling wake irq rtc: meson-vrtc: Use ktime_get_real_ts64() to get the current time dt-bindings: rtc: Drop unneeded quotes rtc: pcf8523: remove unnecessary OR operation rtc: pcf8523: fix coding-style issues rtc: ds1390: mark OF related data as maybe unused rtc: omap: include header for omap_rtc_power_off_program prototype rtc: sun6i: Use of_property_present() for testing DT property presence rtc: mpfs: convert SOC_MICROCHIP_POLARFIRE to ARCH_MICROCHIP_POLARFIRE rtc: zynqmp: Convert to platform remove callback returning void rtc: xgene: Convert to platform remove callback returning void rtc: wm8350: Convert to platform remove callback returning void rtc: vt8500: Convert to platform remove callback returning void rtc: twl: Convert to platform remove callback returning void rtc: tps6586x: Convert to platform remove callback returning void rtc: tegra: Convert to platform remove callback returning void rtc: sunplus: Convert to platform remove callback returning void ...
2023-05-01Merge tag 'i3c/for-6.4' of ↵Linus Torvalds12-91/+779
git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux Pull i3c updates from Alexandre Belloni: "Subsystem: - OF alias bus numbering - convert to platform remove callback returning void New driver: - AST2600 controller, based on Synopsys DesignWare IP Driver update: - dw: add infrastructure to support different platform integrations" * tag 'i3c/for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: i3c: ast2600: set variable ast2600_i3c_ops storage-class-specifier to static i3c: ast2600: fix register setting for 545 ohm pullups i3c: ast2600: enable IBI support i3c: dw: Add a platform facility for IBI PEC workarounds i3c: dw: Add support for in-band interrupts i3c: dw: Turn DAT array entry into a struct i3c: dw: Create a generic fifo read function i3c: Allow OF-alias-based persistent bus numbering i3c: ast2600: Add AST2600 platform-specific driver dt-bindings: i3c: Add AST2600 i3c controller i3c: dw: Add infrastructure for platform-specific implementations i3c: dw: use bus mode rather than device reg for conditional tCAS setting i3c: dw: Return the length from a read priv_xfer i3c: svc: Convert to platform remove callback returning void i3c: mipi-i3c-hci: Convert to platform remove callback returning void i3c: cdns: Convert to platform remove callback returning void i3c: dw: Convert to platform remove callback returning void i3c: Make i3c_master_unregister() return void i3c: dw: drop of_match_ptr for ID table i3c: Correct reference to the I²C device data type
2023-05-01Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds4-36/+56
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Some ext4 regression and bug fixes" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: clean up error handling in __ext4_fill_super() ext4: reflect error codes from ext4_multi_mount_protect() to its callers ext4: fix lost error code reporting in __ext4_fill_super() ext4: fix unused iterator variable warnings ext4: fix use-after-free read in ext4_find_extent for bigalloc + inline ext4: fix i_disksize exceeding i_size problem in paritally written case
2023-05-01Merge tag '6.4-rc-smb3-client-fixes-part1' of ↵Linus Torvalds8-151/+109
git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: - deferred close fix for an important case when cached file should be closed immediately - two fixes for missing locks - eight minor cleanup * tag '6.4-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number for cifs.ko smb3: move some common open context structs to smbfs_common smb3: make query_on_disk_id open context consistent and move to common code SMB3.1.1: add new tree connect ShareFlags cifs: missing lock when updating session status SMB3: Close deferred file handles in case of handle lease break SMB3: Add missing locks to protect deferred close file list cifs: Avoid a cast in add_lease_context() cifs: Simplify SMB2_open_init() cifs: Simplify SMB2_open_init() cifs: Simplify SMB2_open_init()
2023-05-01Merge tag 'tpmdd-v6.4-rc1-fix-v2' of ↵Linus Torvalds4-11/+28
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fix from Jarkko Sakkinen: "This fixes a critical bug in my first pull request. I fixed the cherry pick issue and tested with real hardare and libvirt/qemu plus swtpm" * tag 'tpmdd-v6.4-rc1-fix-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Re-enable TPM chip boostrapping non-tpm_tis TPM drivers
2023-05-01tools/perf: Add basic support for LoongArchHuacai Chen21-3/+518
Add basic support for LoongArch, which is very similar to the MIPS version. Signed-off-by: Ming Wang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: ftrace: Add direct call trampoline samples supportYouling Tang6-0/+152
The ftrace samples need per-architecture trampoline implementations to save and restore argument registers around the calls to my_direct_func* and to restore polluted registers (e.g: ra). Signed-off-by: Qing Zhang <[email protected]> Signed-off-by: Youling Tang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: ftrace: Add direct call supportYouling Tang4-1/+33
Select the HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the register_ftrace_direct[_multi] interfaces allowing users to register the customed trampoline (direct_caller) as the mcount for one or more target functions. And modify_ftrace_direct[_multi] are also provided for modifying direct_caller. There are a few cases to distinguish: - If a direct call ops is the only one tracing a function AND the direct called trampoline is within the reach of a 'bl' instruction -> the ftrace patchsite jumps to the trampoline - Else -> the ftrace patchsite jumps to the ftrace_regs_caller trampoline points to ftrace_list_ops so it iterates over all registered ftrace ops, including the direct call ops and calls its call_direct_funcs handler which stores the direct called trampoline's address in the ftrace_regs and the ftrace_regs_caller trampoline will return to that address instead of returning to the traced function Signed-off-by: Qing Zhang <[email protected]> Signed-off-by: Youling Tang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: ftrace: Implement ftrace_find_callable_addr() to simplify codeYouling Tang1-59/+57
In the module processing functions, the same logic can be reused by implementing ftrace_find_callable_addr(). Signed-off-by: Youling Tang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: ftrace: Fix build error if DYNAMIC_FTRACE_WITH_REGS is not setYouling Tang1-3/+1
We can see the following build error if CONFIG_DYNAMIC_FTRACE_WITH_REGS is not set on LoongArch: arch/loongarch/kernel/ftrace_dyn.c: In function ‘ftrace_make_call’: arch/loongarch/kernel/ftrace_dyn.c:167:23: error: implicit declaration of function ‘__get_mod’ 167 | ret = __get_mod(&mod, pc); | ^~~~~~~~~ arch/loongarch/kernel/ftrace_dyn.c:171:24: error: implicit declaration of function ‘get_plt_addr’ 171 | addr = get_plt_addr(mod, addr); | ^~~~~~~~~~~~ The reason is that the __get_mod() and get_plt_addr() may be called in ftrace_make_{call,nop}. Signed-off-by: Youling Tang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: ftrace: Abstract DYNAMIC_FTRACE_WITH_ARGS accessesQing Zhang1-0/+25
Add new ftrace_regs_{get,set}_*() helpers which can be used to manipulate ftrace_regs. When CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y, these can always be used on any ftrace_regs, and when CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS =n these can be used when regs are available. Signed-off-by: Qing Zhang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Add support for function error injectionTiezhu Yang4-0/+18
Inspired by the commit 42d038c4fb00f ("arm64: Add support for function error injection") and the commit ee55ff803b383 ("riscv: Add support for function error injection"), this patch supports function error injection for LoongArch. Mainly implement two functions: (1) regs_set_return_value() which is used to overwrite the return value, (2) override_function_with_return() which is used to override the probed function returning and jump to its caller. Here is a simple test under CONFIG_FUNCTION_ERROR_INJECTION and CONFIG_FAIL_FUNCTION: # echo sys_clone > /sys/kernel/debug/fail_function/inject # echo 100 > /sys/kernel/debug/fail_function/probability # dmesg bash: fork: Invalid argument # dmesg ... FAULT_INJECTION: forcing a failure. name fail_function, interval 1, probability 100, space 0, times 1 ... Call Trace: [<90000000002238f4>] show_stack+0x5c/0x180 [<90000000012e384c>] dump_stack_lvl+0x60/0x88 [<9000000000b1879c>] should_fail_ex+0x1b0/0x1f4 [<900000000032ead4>] fei_kprobe_handler+0x28/0x6c [<9000000000230970>] kprobe_breakpoint_handler+0xf0/0x118 [<90000000012e3e60>] do_bp+0x2c4/0x358 [<9000000002241924>] exception_handlers+0x1924/0x10000 [<900000000023b7d0>] sys_clone+0x0/0x4 [<90000000012e4744>] do_syscall+0x7c/0x94 [<9000000000221e44>] handle_syscall+0xc4/0x160 Tested-by: Hengqi Chen <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Tiezhu Yang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Add ARCH_HAS_FORTIFY_SOURCE selectionQing Zhang1-0/+1
FORTIFY_SOURCE could detect various overflows at compile and run time. ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and run with CONFIG_FORTIFY_SOURCE. So select it in LoongArch. See more about this feature from commit 6974f0c4555e285 ("include/linux/ string.h: add the option of fortified string.h functions"). Signed-off-by: Qing Zhang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: crypto: Add crc32 and crc32c hw accelerationMin Zhou5-0/+329
With a blatant copy of some MIPS bits we introduce the crc32 and crc32c hw accelerated module to LoongArch. LoongArch has provided these instructions to calculate crc32 and crc32c: * crc.w.b.w crcc.w.b.w * crc.w.h.w crcc.w.h.w * crc.w.w.w crcc.w.w.w * crc.w.d.w crcc.w.d.w So we can make use of these instructions to improve the performance of calculation for crc32(c) checksums. As can be seen from the following test results, crc32(c) instructions can improve the performance by 58%. Software implemention Hardware acceleration Buffer size time cost (seconds) time cost (seconds) Accel. 100 KB 0.000845 0.000534 59.1% 1 MB 0.007758 0.004836 59.4% 10 MB 0.076593 0.047682 59.4% 100 MB 0.756734 0.479126 58.5% 1000 MB 7.563841 4.778266 58.5% Signed-off-by: Min Zhou <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Add checksum optimization for 64-bit systemBibo Mao3-1/+208
LoongArch platform is 64-bit system, which supports 8-bytes memory accessing, but generic checksum functions use 4-byte memory access. So add 8-bytes memory access optimization for checksum functions on LoongArch. And the code comes from arm64 system. When network hw checksum is disabled, iperf performance improves about 10% with this patch. Signed-off-by: Bibo Mao <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Optimize memory ops (memset/memcpy/memmove)WANG Rui5-167/+603
To optimize memset()/memcpy()/memmove() and so on, we use a jump table to dispatch cases for short data lengths; and for long data lengths, we split the destination into head part (first 8 bytes), tail part (last 8 bytes) and middle part. The head part and tail part may be at unaligned addresses, while the middle part is always aligned (the middle part is allowed to overlap the head/tail part). In this way, the first and last 8 bytes may be unaligned accesses, but we can make sure the data in the middle is processed at an aligned destination address. We have tested micro-bench[1] on a Loongson-3C5000 16-core machine (2.2GHz): 1. memset | length | src offset | dst offset | speed before | speed after | % | |--------|------------|------------|--------------|-------------|---------| | 8 | 0 | 0 | 696.191 | 1518.785 | 118.16% | | 8 | 0 | 1 | 696.325 | 1518.937 | 118.14% | | 50 | 0 | 0 | 969.976 | 8053.902 | 730.32% | | 50 | 0 | 1 | 970.034 | 8058.475 | 730.74% | | 300 | 0 | 0 | 5876.612 | 16544.703 | 181.53% | | 300 | 0 | 1 | 5030.849 | 16549.011 | 228.95% | | 1200 | 0 | 0 | 11797.077 | 16752.137 | 42.00% | | 1200 | 0 | 1 | 5687.141 | 16645.233 | 192.68% | | 4000 | 0 | 0 | 15723.27 | 16761.557 | 6.60% | | 4000 | 0 | 1 | 5906.114 | 16732.316 | 183.30% | | 8000 | 0 | 0 | 16751.403 | 16770.002 | 0.11% | | 8000 | 0 | 1 | 5995.449 | 16754.07 | 179.45% | 2. memcpy | length | src offset | dst offset | speed before | speed after | % | |--------|------------|------------|--------------|-------------|---------| | 8 | 0 | 0 | 696.2 | 1670.605 | 139.96% | | 8 | 0 | 1 | 696.325 | 1671.138 | 139.99% | | 50 | 0 | 0 | 969.974 | 8724.999 | 799.51% | | 50 | 0 | 1 | 970.032 | 8730.138 | 799.98% | | 300 | 0 | 0 | 5564.662 | 16272.652 | 192.43% | | 300 | 0 | 1 | 4670.436 | 14972.842 | 220.59% | | 1200 | 0 | 0 | 10740.23 | 16751.728 | 55.97% | | 1200 | 0 | 1 | 5027.741 | 14874.564 | 195.85% | | 4000 | 0 | 0 | 15122.367 | 16737.642 | 10.68% | | 4000 | 0 | 1 | 5536.918 | 14890.397 | 168.93% | | 8000 | 0 | 0 | 16505.453 | 16553.543 | 0.29% | | 8000 | 0 | 1 | 5821.619 | 14841.804 | 154.94% | 3. memmove | length | src offset | dst offset | speed before | speed after | % | |--------|------------|------------|--------------|-------------|---------| | 8 | 0 | 0 | 982.693 | 1670.568 | 70.00% | | 8 | 0 | 1 | 983.023 | 1671.174 | 70.00% | | 50 | 0 | 0 | 1230.87 | 8727.625 | 609.06% | | 50 | 0 | 1 | 1232.515 | 8730.138 | 608.32% | | 300 | 0 | 0 | 6490.375 | 16296.993 | 151.09% | | 300 | 0 | 1 | 4282.687 | 14972.842 | 249.61% | | 1200 | 0 | 0 | 11742.755 | 16752.546 | 42.66% | | 1200 | 0 | 1 | 5039.338 | 14872.951 | 195.14% | | 4000 | 0 | 0 | 15467.786 | 16737.09 | 8.21% | | 4000 | 0 | 1 | 5009.905 | 14890.542 | 197.22% | | 8000 | 0 | 0 | 16489.664 | 16553.273 | 0.39% | | 8000 | 0 | 1 | 5823.786 | 14858.646 | 155.14% | * speed: MB/s * length: byte [1] https://github.com/heiher/mem-bench Signed-off-by: WANG Rui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Provide kernel fpu functionsHuacai Chen3-1/+47
Provide kernel_fpu_begin()/kernel_fpu_end() to allow the kernel itself to use fpu. They can be used by some other kernel components, e.g., the AMDGPU graphic driver for DCN. Reported-by: WANG Xuerui <[email protected]> Tested-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Relay BCE exceptions to userland as SIGSEGV with si_code=SEGV_BNDERRWANG Xuerui3-0/+119
SEGV_BNDERR was introduced initially for supporting the Intel MPX, but fell into disuse after the MPX support was removed. The LoongArch bounds-checking instructions behave very differently than MPX, but overall the interface is still kind of suitable for conveying the information to userland when bounds-checking assertions trigger, so we wouldn't have to invent more UAPI. Specifically, when the BCE triggers, a SEGV_BNDERR is sent to userland, with si_addr set to the out-of-bounds address or value (in asrt{gt,le}'s case), and one of si_lower or si_upper set to the configured bound depending on the faulting instruction. The other bound is set to either 0 or ULONG_MAX to resemble a range with both lower and upper bounds. Note that it is possible to have si_addr == si_lower in case of a failing asrtgt or {ld,st}gt, because those instructions test for strict greater-than relationship. This should not pose a problem for userland, though, because the faulting PC is available for the application to associate back to the exact instruction for figuring out the expectation. Example exception context generated by a faulting `asrtgt.d t0, t1` (assert t0 > t1 or BCE) with t0=100 and t1=200: > pc 00005555558206a4 ra 00007ffff2d854fc tp 00007ffff2f2f180 sp 00007ffffbf9fb80 > a0 0000000000000002 a1 00007ffffbf9fce8 a2 00007ffffbf9fd00 a3 00007ffff2ed4558 > a4 0000000000000000 a5 00007ffff2f044c8 a6 00007ffffbf9fce0 a7 fffffffffffff000 > t0 0000000000000064 t1 00000000000000c8 t2 00007ffffbfa2d5e t3 00007ffff2f12aa0 > t4 00007ffff2ed6158 t5 00007ffff2ed6158 t6 000000000000002e t7 0000000003d8f538 > t8 0000000000000005 u0 0000000000000000 s9 0000000000000000 s0 00007ffffbf9fce8 > s1 0000000000000002 s2 0000000000000000 s3 00007ffff2f2c038 s4 0000555555820610 > s5 00007ffff2ed5000 s6 0000555555827e38 s7 00007ffffbf9fd00 s8 0000555555827e38 > ra: 00007ffff2d854fc > ERA: 00005555558206a4 > CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) > PRMD: 00000007 (PPLV3 +PIE -PWE) > EUEN: 00000000 (-FPE -SXE -ASXE -BTE) > ECFG: 0007181c (LIE=2-4,11-12 VS=7) > ESTAT: 000a0000 [BCE] (IS= ECode=10 EsubCode=0) > PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Tweak the BADV and CPUCFG.PRID lines in show_regs()WANG Xuerui1-3/+3
Use ISA manual names for BADV and CPUCFG.PRID lines in show_regs(), for stylistic consistency with the other lines already touched. While at it, also include current CPU's full name in show_regs() output. It may be more helpful for developers looking at the resulting dumps, because multiple distinct CPU models may share the same PRID. Not having this info available may hide problems only found on some but not all of the models sharing one specific PRID. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Humanize the ESTAT line when showing registersWANG Xuerui1-7/+75
Example output looks like: [ xx.xxxxxx] ESTAT: 00001000 [INT] (IS=12 ECode=0 EsubCode=0) Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Humanize the ECFG line when showing registersWANG Xuerui1-1/+14
Example output looks like: [ xx.xxxxxx] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Humanize the EUEN line when showing registersWANG Xuerui1-1/+11
Example output looks like: [ xx.xxxxxx] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Humanize the PRMD line when showing registersWANG Xuerui1-1/+10
Example output looks like: [ xx.xxxxxx] PRMD: 00000004 (PPLV0 +PIE -PWE) Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Humanize the CRMD line when showing registersWANG Xuerui1-1/+50
Example output looks like: [ xx.xxxxxx] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) Some initial machinery for this pretty-printing format has been included in this patch as well. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Fix format of CSR lines during show_regs()WANG Xuerui1-10/+6
Use uppercase CSR names throughout for consistency with the manual wording, and right-align the keys. The "CSR" part is inferrable from context, hence dropped for more horizontal space. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Print symbol info for $ra and CSR.ERA only for kernel-mode contextsWANG Xuerui1-5/+8
Otherwise the addresses wouldn't make sense at all. While at it, align the "map keys" to maintain right-alignment with the "estat:" line too; also swap the ERA and ra lines so all CSRs are shown together. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Print GPRs with ABI names when showing registersWANG Xuerui1-13/+23
Show PC (CSR.ERA) in place of $zero, and also show the syscall restart flag (conveniently stuffed in regs[0]) if non-zero. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Define regular names for BCE/WATCH/HVC/GSPR exceptionsWANG Xuerui1-4/+6
Define them according to the ISA manual, in order to enable matching the sub-exceptions for humanization purposes later. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01LoongArch: Clean up the architectural interrupt definitionsWANG Xuerui5-26/+29
While interrupts are assigned ECodes `64 + interrupt number`, all existing use sites of interrupt numbers want the 64 subtracted. Re-arrange the definitions so that the actual interrupt number is used everywhere, and make EXCCODE_INT_END inclusive as it is more intuitive that way. While at it, according to the asm/loongarch.h definitions, the total number of architectural interrupts should be 14, but various other places indicate otherwise (13 or 15). Those places have been adjusted to 14 as well for consistency. Signed-off-by: WANG Xuerui <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2023-05-01Merge branch 'rxrpc-timeout-fixes'David S. Miller8-22/+36
David Howells says: ==================== rxrpc: Timeout handling fixes Here are three patches to fix timeouts handling in AF_RXRPC: (1) The hard call timeout should be interpreted in seconds, not milliseconds. (2) Allow a waiting call to be aborted (thereby cancelling the call) in the case a signal interrupts sendmsg() and leaves it hanging until it is granted a channel on a connection. (3) Kernel-generated calls get the timer started on them even if they're still waiting to be attached to a connection. If the timer expires before the wait is complete and a conn is attached, an oops will occur. ==================== Signed-off-by: David S. Miller <[email protected]>
2023-05-01rxrpc: Fix timeout of a call that hasn't yet been granted a channelDavid Howells8-19/+30
afs_make_call() calls rxrpc_kernel_begin_call() to begin a call (which may get stalled in the background waiting for a connection to become available); it then calls rxrpc_kernel_set_max_life() to set the timeouts - but that starts the call timer so the call timer might then expire before we get a connection assigned - leading to the following oops if the call stalled: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... CPU: 1 PID: 5111 Comm: krxrpcio/0 Not tainted 6.3.0-rc7-build3+ #701 RIP: 0010:rxrpc_alloc_txbuf+0xc0/0x157 ... Call Trace: <TASK> rxrpc_send_ACK+0x50/0x13b rxrpc_input_call_event+0x16a/0x67d rxrpc_io_thread+0x1b6/0x45f ? _raw_spin_unlock_irqrestore+0x1f/0x35 ? rxrpc_input_packet+0x519/0x519 kthread+0xe7/0xef ? kthread_complete_and_exit+0x1b/0x1b ret_from_fork+0x22/0x30 Fix this by noting the timeouts in struct rxrpc_call when the call is created. The timer will be started when the first packet is transmitted. It shouldn't be possible to trigger this directly from userspace through AF_RXRPC as sendmsg() will return EBUSY if the call is in the waiting-for-conn state if it dropped out of the wait due to a signal. Fixes: 9d35d880e0e4 ("rxrpc: Move client call connection to the I/O thread") Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]> cc: "David S. Miller" <[email protected]> cc: Eric Dumazet <[email protected]> cc: Jakub Kicinski <[email protected]> cc: Paolo Abeni <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2023-05-01rxrpc: Make it so that a waiting process can be abortedDavid Howells1-2/+5
When sendmsg() creates an rxrpc call, it queues it to wait for a connection and channel to be assigned and then waits before it can start shovelling data as the encrypted DATA packet content includes a summary of the connection parameters. However, sendmsg() may get interrupted before a connection gets assigned and further sendmsg() calls will fail with EBUSY until an assignment is made. Fix this so that the call can at least be aborted without failing on EBUSY. We have to be careful here as sendmsg() mustn't be allowed to start the call timer if the call doesn't yet have a connection assigned as an oops may follow shortly thereafter. Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]> cc: "David S. Miller" <[email protected]> cc: Eric Dumazet <[email protected]> cc: Jakub Kicinski <[email protected]> cc: Paolo Abeni <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2023-05-01rxrpc: Fix hard call timeout unitsDavid Howells1-1/+1
The hard call timeout is specified in the RXRPC_SET_CALL_TIMEOUT cmsg in seconds, so fix the point at which sendmsg() applies it to the call to convert to jiffies from seconds, not milliseconds. Fixes: a158bdd3247b ("rxrpc: Fix timeout of a call that hasn't yet been granted a channel") Signed-off-by: David Howells <[email protected]> cc: Marc Dionne <[email protected]> cc: "David S. Miller" <[email protected]> cc: Eric Dumazet <[email protected]> cc: Jakub Kicinski <[email protected]> cc: Paolo Abeni <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2023-05-01net: atlantic: Define aq_pm_ops conditionally on CONFIG_PMTom Rix1-0/+2
For s390, gcc with W=1 reports drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:458:32: error: 'aq_pm_ops' defined but not used [-Werror=unused-const-variable=] 458 | static const struct dev_pm_ops aq_pm_ops = { | ^~~~~~~~~ The only use of aq_pm_ops is conditional on CONFIG_PM. The definition of aq_pm_ops and its functions should also be conditional on CONFIG_PM. Signed-off-by: Tom Rix <[email protected]> Signed-off-by: David S. Miller <[email protected]>