aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2019-11-28powerpc: Ultravisor: Add PPC_UV config optionAnshuman Khandual1-0/+17
CONFIG_PPC_UV adds support for ultravisor. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> [ Update config help and commit message ] Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com> Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Support reset of secure guestBharata B Rao5-0/+109
Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF. This ioctl will be issued by QEMU during reset and includes the the following steps: - Release all device pages of the secure guest. - Ask UV to terminate the guest via UV_SVM_TERMINATE ucall - Unpin the VPA pages so that they can be migrated back to secure side when guest becomes secure again. This is required because pinned pages can't be migrated. - Reinit the partition scoped page tables After these steps, guest is ready to issue UV_ESM call once again to switch to secure mode. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [Implementation of uv_svm_terminate() and its call from guest shutdown path] Signed-off-by: Ram Pai <linuxram@us.ibm.com> [Unpinning of VPA pages] Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Handle memory plug/unplug to secure VMBharata B Rao6-0/+76
Register the new memslot with UV during plug and unregister the memslot during unplug. In addition, release all the device pages during unplug. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Radix changes for secure guestBharata B Rao5-0/+66
- After the guest becomes secure, when we handle a page fault of a page belonging to SVM in HV, send that page to UV via UV_PAGE_IN. - Whenever a page is unmapped on the HV side, inform UV via UV_PAGE_INVAL. - Ensure all those routines that walk the secondary page tables of the guest don't do so in case of secure VM. For secure guest, the active secondary page tables are in secure memory and the secondary page tables in HV are freed when guest becomes secure. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Shared pages support for secure guestsBharata B Rao2-4/+84
A secure guest will share some of its pages with hypervisor (Eg. virtio bounce buffers etc). Support sharing of pages between hypervisor and ultravisor. Shared page is reachable via both HV and UV side page tables. Once a secure page is converted to shared page, the device page that represents the secure page is unmapped from the HV side page tables. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Support for running secure guestsBharata B Rao8-0/+769
A pseries guest can be run as secure guest on Ultravisor-enabled POWER platforms. On such platforms, this driver will be used to manage the movement of guest pages between the normal memory managed by hypervisor (HV) and secure memory managed by Ultravisor (UV). HV is informed about the guest's transition to secure mode via hcalls: H_SVM_INIT_START: Initiate securing a VM H_SVM_INIT_DONE: Conclude securing a VM As part of H_SVM_INIT_START, register all existing memslots with the UV. H_SVM_INIT_DONE call by UV informs HV that transition of the guest to secure mode is complete. These two states (transition to secure mode STARTED and transition to secure mode COMPLETED) are recorded in kvm->arch.secure_guest. Setting these states will cause the assembly code that enters the guest to call the UV_RETURN ucall instead of trying to enter the guest directly. Migration of pages betwen normal and secure memory of secure guest is implemented in H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls. H_SVM_PAGE_IN: Move the content of a normal page to secure page H_SVM_PAGE_OUT: Move the content of a secure page to normal page Private ZONE_DEVICE memory equal to the amount of secure memory available in the platform for running secure guests is created. Whenever a page belonging to the guest becomes secure, a page from this private device memory is used to represent and track that secure page on the HV side. The movement of pages between normal and secure memory is done via migrate_vma_pages() using UV_PAGE_IN and UV_PAGE_OUT ucalls. In order to prevent the device private pages (that correspond to pages of secure guest) from participating in KSM merging, H_SVM_PAGE_IN calls ksm_madvise() under read version of mmap_sem. However ksm_madvise() needs to be under write lock. Hence we call kvmppc_svm_page_in with mmap_sem held for writing, and it then downgrades to a read lock after calling ksm_madvise. [paulus@ozlabs.org - roll in patch "KVM: PPC: Book3S HV: Take write mmap_sem when calling ksm_madvise"] Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-27Merge tag 'trace-v5.5' of ↵Linus Torvalds8-6/+86
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New tracing features: - New PERMANENT flag to ftrace_ops when attaching a callback to a function. As /proc/sys/kernel/ftrace_enabled when set to zero will disable all attached callbacks in ftrace, this has a detrimental impact on live kernel tracing, as it disables all that it patched. If a ftrace_ops is registered to ftrace with the PERMANENT flag set, it will prevent ftrace_enabled from being disabled, and if ftrace_enabled is already disabled, it will prevent a ftrace_ops with PREMANENT flag set from being registered. - New register_ftrace_direct(). As eBPF would like to register its own trampolines to be called by the ftrace nop locations directly, without going through the ftrace trampoline, this function has been added. This allows for eBPF trampolines to live along side of ftrace, perf, kprobe and live patching. It also utilizes the ftrace enabled_functions file that keeps track of functions that have been modified in the kernel, to allow for security auditing. - Allow for kernel internal use of ftrace instances. Subsystems in the kernel can now create and destroy their own tracing instances which allows them to have their own tracing buffer, and be able to record events without worrying about other users from writing over their data. - New seq_buf_hex_dump() that lets users use the hex_dump() in their seq_buf usage. - Notifications now added to tracing_max_latency to allow user space to know when a new max latency is hit by one of the latency tracers. - Wider spread use of generic compare operations for use of bsearch and friends. - More synthetic event fields may be defined (32 up from 16) - Use of xarray for architectures with sparse system calls, for the system call trace events. This along with small clean ups and fixes" * tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (51 commits) tracing: Enable syscall optimization for MIPS tracing: Use xarray for syscall trace events tracing: Sample module to demonstrate kernel access to Ftrace instances. tracing: Adding new functions for kernel access to Ftrace instances tracing: Fix Kconfig indentation ring-buffer: Fix typos in function ring_buffer_producer ftrace: Use BIT() macro ftrace: Return ENOTSUPP when DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not configured ftrace: Rename ftrace_graph_stub to ftrace_stub_graph ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimization ftrace: Add helper find_direct_entry() to consolidate code ftrace: Add another check for match in register_ftrace_direct() ftrace: Fix accounting bug with direct->count in register_ftrace_direct() ftrace/selftests: Fix spelling mistake "wakeing" -> "waking" tracing: Increase SYNTH_FIELDS_MAX for synthetic_events ftrace/samples: Add a sample module that implements modify_ftrace_direct() ftrace: Add modify_ftrace_direct() tracing: Add missing "inline" in stub function of latency_fsnotify() tracing: Remove stray tab in TRACE_EVAL_MAP_FILE's help text tracing: Use seq_buf_hex_dump() to dump buffers ...
2019-11-27Merge tag 'microblaze-v5.5-rc1' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds6-8/+6
Pull Microblaze updates from Michal Simek: - extend DTB space - defconfig update - clean up rescheduling logic - enable SPARSE_IRQ * tag 'microblaze-v5.5-rc1' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Increase max dtb size to 64K from 32K microblaze: Enable SPARSE_IRQ microblaze: defconfig: Enable devtmps and tmpfs microblaze: entry: Remove unneeded need_resched() loop
2019-11-27Merge tag 'riscv/for-v5.5-rc1' of ↵Linus Torvalds69-450/+1067
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: "New features: - SECCOMP support - nommu support - SBI-less system support - M-Mode support - TLB flush optimizations Other improvements: - Pass the complete RISC-V ISA string supported by the CPU cores to userspace, rather than redacting parts of it in the kernel - Add platform DMA IP block data to the HiFive Unleashed board DT file - Add Makefile support for BZ2, LZ4, LZMA, LZO kernel image compression formats, in line with other architectures Cleanups: - Remove unnecessary PTE_PARENT_SIZE macro - Standardize include guard naming across arch/riscv" * tag 'riscv/for-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (22 commits) riscv: provide a flat image loader riscv: add nommu support riscv: clear the instruction cache and all registers when booting riscv: read the hart ID from mhartid on boot riscv: provide native clint access for M-mode riscv: dts: add support for PDMA device of HiFive Unleashed Rev A00 riscv: add support for MMIO access to the timer registers riscv: implement remote sfence.i using IPIs riscv: cleanup the default power off implementation riscv: poison SBI calls for M-mode riscv: don't allow selecting SBI based drivers for M-mode RISC-V: Add multiple compression image format. riscv: clean up the macro format in each header file riscv: Use PMD_SIZE to replace PTE_PARENT_SIZE riscv: abstract out CSR names for supervisor vs machine mode riscv: separate MMIO functions into their own header file riscv: enter WFI in default_power_off() if SBI does not shutdown RISC-V: Issue a tlb page flush if possible RISC-V: Issue a local tlbflush if possible. RISC-V: Do not invoke SBI call if cpumask is empty ...
2019-11-27Merge tag 'powerpc-spectre-rsb' of powerpc-CVE-2019-18660.bundleLinus Torvalds5-4/+95
Pull powerpc Spectre-RSB fixes from Michael Ellerman: "We failed to activate the mitigation for Spectre-RSB (Return Stack Buffer, aka. ret2spec) on context switch, on CPUs prior to Power9 DD2.3. That allows a process to poison the RSB (called Link Stack on Power CPUs) and possibly misdirect speculative execution of another process. If the victim process can be induced to execute a leak gadget then it may be possible to extract information from the victim via a side channel. The fix is to correctly activate the link stack flush mitigation on all CPUs that have any mitigation of Spectre v2 in userspace enabled. There's a second commit which adds a link stack flush in the KVM guest exit path. A leak via that path has not been demonstrated, but we believe it's at least theoretically possible. This is the fix for CVE-2019-18660" * tag 'powerpc-spectre-rsb' of /home/torvalds/Downloads/powerpc-CVE-2019-18660.bundle: KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel powerpc/book3s64: Fix link stack flush on context switch
2019-11-27Merge tag 'driver-core-5.5-rc1' of ↵Linus Torvalds37-90/+484
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the "big" set of driver core patches for 5.5-rc1 There's a few minor cleanups and fixes in here, but the majority of the patches in here fall into two buckets: - debugfs api cleanups and fixes - driver core device link support for boot dependancy issues The debugfs api cleanups are working to slowly refactor the debugfs apis so that it is even harder to use incorrectly. That work has been happening for the past few kernel releases and will continue over time, it's a long-term project/goal The driver core device link support missed 5.4 by just a bit, so it's been sitting and baking for many months now. It's from Saravana Kannan to help resolve the problems that DT-based systems have at boot time with dependancy graphs and kernel modules. Turns out that no one has actually tried to build a generic arm64 kernel with loads of modules and have it "just work" for a variety of platforms (like a distro kernel). The big problem turned out to be a lack of dependency information between different areas of DT entries, and the work here resolves that problem and now allows devices to boot properly, and quicker than a monolith kernel. All of these patches have been in linux-next for a long time with no reported issues" * tag 'driver-core-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (68 commits) tracing: Remove unnecessary DEBUG_FS dependency of: property: Add device link support for interrupt-parent, dmas and -gpio(s) debugfs: Fix !DEBUG_FS debugfs_create_automount of: property: Add device link support for "iommu-map" of: property: Fix the semantics of of_is_ancestor_of() i2c: of: Populate fwnode in of_i2c_get_board_info() drivers: base: Fix Kconfig indentation firmware_loader: Fix labels with comma for builtin firmware driver core: Allow device link operations inside sync_state() driver core: platform: Declare ret variable only once cpu-topology: declare parse_acpi_topology in <linux/arch_topology.h> crypto: hisilicon: no need to check return value of debugfs_create functions driver core: platform: use the correct callback type for bus_find_device firmware_class: make firmware caching configurable driver core: Clarify documentation for fwnode_operations.add_links() mailbox: tegra: Fix superfluous IRQ error message net: caif: Fix debugfs on 64-bit platforms mac80211: Use debugfs_create_xul() helper media: c8sectpfe: no need to check return value of debugfs_create functions of: property: Add device link support for iommus, mboxes and io-channels ...
2019-11-27Merge tag 'staging-5.5-rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging / iio updates from Greg KH: "Here is the big staging and iio set of patches for the 5.5-rc1 release. It's the usual huge collection of cleanup patches all over the drivers/staging/ area, along with a new staging driver, and a bunch of new IIO drivers as well. Full details are in the shortlog, but all of these have been in linux-next for a long time with no reported issues" * tag 'staging-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (548 commits) staging: vchiq: Have vchiq_dump_* functions return an error code staging: vchiq: Refactor indentation in vchiq_dump_* functions staging: fwserial: Fix Kconfig indentation (seven spaces) staging: vchiq_dump: Replace min with min_t staging: vchiq: Fix block comment format in vchiq_dump() staging: octeon: indent with tabs instead of spaces staging: comedi: usbduxfast: usbduxfast_ai_cmdtest rounding error staging: most: core: remove sysfs attr remove_link staging: vc04: Fix Kconfig indentation staging: pi433: Fix Kconfig indentation staging: nvec: Fix Kconfig indentation staging: most: Fix Kconfig indentation staging: fwserial: Fix Kconfig indentation staging: fbtft: Fix Kconfig indentation fbtft: Drop OF dependency fbtft: Make use of device property API fbtft: Drop useless #ifdef CONFIG_OF and dead code fbtft: Describe function parameters in kernel-doc fbtft: Make sure string is NULL terminated staging: rtl8723bs: remove set but not used variable 'change', 'pos' ...
2019-11-27Merge tag 'mmc-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds6-314/+34
Pull MMC updates from Ulf Hansson: "These are the updates for MMC and MEMSTICK for v5.5. Note that this also contains quite some additional changes reaching beyond both the MMC and MEMSTICK subsystems. This is primarily because of fixing an old regression for a WiFi driver based on the SDIO interface on an OMAP openpandora board MMC core: - Add CMD13 polling for MMC IOCTLS with R1B response. - Add common DT properties for clk-phase-delays for various speed modes. - Fix size overflow for mmc gp-partitions. - Re-work HW reset for SDIO cards, which also includes a re-work for Marvell's WiFi mwifiex SDIO func driver. MMC host: - jz4740: Add support for X1000 and JZ4760. - jz4740: Add support for 8-bit bus and for low power mode. - mmci: Add support for HW busy timeout for the stm32_sdmmc variant. - owl-mmc: Add driver for Actions Semi Owl SoCs SD/MMC controller. - renesas_sdhi: Add support for r8a774b1. - sdhci_am654: Add support for Command Queuing Engine for J721E. - sdhci-milbeaut: Add driver for the Milbeaut SD controller. - sdhci-of-arasan: Add support for ZynqMP tap-delays. - sdhci-of-arasan: Add support for clk-phase-delays for SD cards. - sdhci-of-arasan: Add support for Intel LGM SDXC. - sdhci-of-aspeed: Allow inversion of the internal card detect signal. - sdhci-of-esdhc: Fixup workaround for erratum A-008171 for tunings. - sdhci-of-at91: Improve support for calibration. - sdhci-pci: Add support for Intel JSL. - sdhci-pci: Add quirk for AMD SDHC Device 0x7906. - tmio: Enable support for erase/discard/trim requests. MMC/OMAP/pandora/wl1251: The TI wl1251 WiFi driver for SDIO on the OMAP openpandora board has been broken since v4.7. To fix the problems, changes have been made cross subsystems, but also to OMAP2 machine code and to openpandora DTS files, as summarized below. Relevant changes have been tagged for stable. - mmc/wl1251: Re-introduce lost SDIO quirks and vendor-id for wl1251 - omap/omap_hsmmc: Remove redundant platform config for openpandora - omap_hsmmc: Initialize non-std SDIO card for wl1251 for pandora - omap/dts/pandora: Specify wl1251 through a child node of mmc3 - wl1251: Add devicetree support for TI wl1251 SDIO" * tag 'mmc-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (73 commits) dt-bindings: mmc: Correct the type of the clk phase properties Revert "mmc: tmio: remove workaround for NON_REMOVABLE" memstick: Fix Kconfig indentation mmc: sdhci-of-arasan: Add support for ZynqMP Platform Tap Delays Setup dt-bindings: mmc: arasan: Document 'xlnx,zynqmp-8.9a' controller firmware: xilinx: Add SDIO Tap Delay nodes mmc: sdhci-of-arasan: Add support to set clock phase delays for SD dt-bindings: mmc: Add optional generic properties for mmc mmc: sdhci-of-arasan: Add sampling clock for a phy to use dt-bindings: mmc: arasan: Update Documentation for the input clock mmc: sdhci-of-arasan: Separate out clk related data to another structure mmc: sdhci: Fix grammar in warning message mmc: sdhci-of-aspeed: add inversion signal presence mmc: sdhci-of-aspeed: enable CONFIG_MMC_SDHCI_IO_ACCESSORS mmc: sdhci_am654: Add Support for Command Queuing Engine to J721E mmc: core: Fix size overflow for mmc partitions mmc: tmio: Add MMC_CAP_ERASE to allow erase/discard/trim requests net: wireless: ti: remove local VENDOR_ID and DEVICE_ID definitions net: wireless: ti: wl1251 use new SDIO_VENDOR_ID_TI_WL1251 definition mmc: core: fix wl1251 sdio quirks ...
2019-11-27Merge tag 'pinctrl-v5.5-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control updates from Linus Walleij: "This is the bulk of pin control changes for v5.5. It is pretty much business as usual, the most interesting thing I think is the pin controller for a new Intel chip called Lightning Mountain, which is according to news reports some kind of embedded network processor and what is surprising about it is that Intel have decided to use device tree to describe the system rather than ACPI that they have traditionally favored. Core changes: - Avoid taking direct references to device tree-supplied device names: these may changed at runtime under certain circumstances to kstrdup them. GPIO related: - Work is ongoing to move to passing the irqchip along as a templated struct gpio_irq_chip when adding a standard gpiolib-based irqchip to a GPIO controller, a few patches in this cycle switches a few pin control drivers over to using this method. New hardware support: - Intel Lightning Mountain SoC pin controller and GPIO support, a first Intel platform to use device tree rather than ACPI to configure the system. News reports says that this SoC is a network processor. - Qualcomm MSM8976 and MSM8956 - Qualcomm PMIC GPIO now also supports PM6150 and PM6150L - Qualcomm SPMI MPP and SPMI GPIO for PM8950 and PMI8950 - Rockchip RK3308 - Renesas R8A77961 - Allwinner Meson-A1 Driver improvements: - get_multiple and set_multiple support for the AT91-PIO4 driver. - Convert Qualcomm SSBI GPIO to use the hierarchical IRQ helpers in the GPIOlib irqchip" * tag 'pinctrl-v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (93 commits) pinctrl: ingenic: Add OTG VBUS pin for the JZ4770 pinctrl: ingenic: Handle PIN_CONFIG_OUTPUT config pinctrl: Fix Kconfig indentation pinctrl: lewisburg: Update pin list according to v1.1v6 MAINTAINERS: Replace my email by one @kernel.org pinctrl: armada-37xx: Fix irq mask access in armada_37xx_irq_set_type() dt-bindings: pinctrl: intel: Add for new SoC pinctrl: Add pinmux & GPIO controller driver for a new SoC pinctrl: rza1: remove unnecessary static inline function pinctrl: meson: add pinctrl driver support for Meson-A1 SoC pinctrl: meson: add a new callback for SoCs fixup pinctrl: nomadik: db8500: Add mc0_a_2 pin group without direction control dt-bindings: pinctrl: Convert generic pin mux and config properties to schema pinctrl: cherryview: Missed type change to unsigned int pinctrl: intel: Missed type change to unsigned int pinctrl: use devm_platform_ioremap_resource() to simplify code pinctrl: just return if no valid maps dt-bindings: pinctrl: qcom-pmic-mpp: Add support for PM/PMI8950 pinctrl: qcom: spmi-mpp: Add PM/PMI8950 compatible strings dt-bindings: pinctrl: qcom-pmic-gpio: Add support for PM/PMI8950 ...
2019-11-27Merge tag 'for-v5.5' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply and reset updates from Sebastian Reichel: - test_power: add support for current and charge_counter - cpcap-charger: improve charge calculation and limit default charge voltage - ab8500: convert to IIO - misc small fixes all over drivers * tag 'for-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (29 commits) power: supply: bd70528: Add MODULE_ALIAS to allow module auto loading power: supply: ab8500_charger: Fix inconsistent IS_ERR and PTR_ERR power: supply: cpcap-charger: cpcap_charger_voltage_to_regval() can be static power: supply: cpcap-battery: Add basic coulomb counter calibrate support power: supply: cpcap-battery: Read and save integrator register CCI power: supply: cpcap-battery: Simplify short term power average calculation power: supply: cpcap-battery: Simplify coulomb counter calculation with div_s64 power: supply: cpcap-battery: Move coulomb counter units per lsb to ddata power: supply: cpcap-charger: Allow changing constant charge voltage power: supply: cpcap-battery: Fix handling of lowered charger voltage power: supply: cpcap-charger: Improve battery detection power: supply: cpcap-battery: Check voltage before orderly_poweroff power: supply: cpcap-charger: Limit voltage to 4.2V for battery power: supply: ab8500: Handle invalid IRQ from platform_get_irq_byname() power: supply: ab8500_fg: Do not free non-requested IRQs in probe's error path power: supply: ab8500: Cleanup probe in reverse order power: reset: at91: fix __le32 cast in reset code power: supply: abx500_chargalg: Fix code indentation mfd: Switch the AB8500 GPADC to IIO iio: adc: New driver for the AB8500 GPADC ...
2019-11-27KVM x86: Move kvm cpuid support out of svmPeter Gonda2-7/+5
Memory encryption support does not have module parameter dependencies and can be moved into the general x86 cpuid __do_cpuid_ent function. This changes maintains current behavior of passing through all of CPUID.8000001F. Suggested-by: Jim Mattson <jmattson@google.com> Signed-off-by: Peter Gonda <pgonda@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-27x86/entry/32: Remove unused 'restore_all_notrace' local labelBorislav Petkov1-1/+0
Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-27perf/x86: Implement immediate enforcement of /sys/devices/cpu/rdpmc value of 0Anthony Steinhauser2-7/+15
When you successfully write 0 to /sys/devices/cpu/rdpmc, the RDPMC instruction should be disabled unconditionally and immediately (after you close the SYSFS file) by the documentation. Instead, in the current implementation the PMU must be reloaded which happens only eventually some time in the future. Only after that the RDPMC instruction becomes disabled (on ring 3) on the respective core. This change makes the treatment of the 0 value as blocking and as unconditional as the current treatment of the 2 value, only the CR4.PCE bit is naturally set to false instead of true. Signed-off-by: Anthony Steinhauser <asteinhauser@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: https://lkml.kernel.org/r/20191125054838.137615-1-asteinhauser@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-27powerpc: Define arch_is_kernel_initmem_freed() for lockdepMichael Ellerman1-0/+14
Under certain circumstances, we hit a warning in lockdep_register_key: if (WARN_ON_ONCE(static_obj(key))) return; This occurs when the key falls into initmem that has since been freed and can now be reused. This has been observed on boot, and under memory pressure. Define arch_is_kernel_initmem_freed(), which allows lockdep to correctly identify this memory as dynamic. This fixes a bug picked up by the powerpc64 syzkaller instance where we hit the WARN via alloc_netdev_mqs. Reported-by: Qian Cai <cai@lca.pw> Reported-by: ppc syzbot c/o Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Daniel Axtens <dja@axtens.net> Link: https://lore.kernel.org/r/87lfs4f7d6.fsf@dja-thinkpad.axtens.net
2019-11-27crypto: arch - conditionalize crypto api in arch glue for lib codeJason A. Donenfeld11-32/+53
For glue code that's used by Zinc, the actual Crypto API functions might not necessarily exist, and don't need to exist either. Before this patch, there are valid build configurations that lead to a unbuildable kernel. This fixes it to conditionalize those symbols on the existence of the proper config entry. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-26Merge tag 'media/v5.5-1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - uAPI documentation for stateless decoders - Added a new CEC ioctl together with its documentation - Improved IPU3 documentation - New i2c drivers: hi556 and imx290 - Added support on Vivid driver for meta streams - Added de-interlace support for sunxi subdriver - Added a few new remote controler keymaps - Added H.265 support for Sunxi Cedrus driver - Another round of random driver cleanups, fixes and improvements * tag 'media/v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (361 commits) media: Revert "media: mtk-vcodec: Remove extra area allocation in an input buffer on encoding" media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly media: hantro: Remove now unused H264 pic_size media: hantro: Use output buffer width and height for H264 decoding media: hantro: Reduce H264 extra space for motion vectors media: hantro: Fix H264 motion vector buffer offset media: ti-vpe: vpe: fix compatible to match bindings media: dt-bindings: media: ti-vpe: Document VPE driver media: zr364xx: remove redundant assigmnent to idx, clean up code media: Documentation: media: *_DEFAULT targets for subdevs media: hantro: Fix s_fmt for dynamic resolution changes media: i2c: Use the correct style for SPDX License Identifier media: siano: Use the correct style for SPDX License Identifier media: vicodec: media_device_cleanup was called too early media: vim2m: media_device_cleanup was called too early media: cedrus: Increase maximum supported size media: cedrus: Fix H264 4k support media: cedrus: Properly signal size in mode register media: v4l2-ctrl: Lock main_hdl on operations of requests_queued. media: si470x-i2c: add missed operations in remove ...
2019-11-26Merge tag 'acpi-5.5-rc1' of ↵Linus Torvalds9-25/+141
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to upstream revision 20191018, add support for EFI specific purpose memory, update the ACPI EC driver to make it work on systems with hardware-reduced ACPI, improve ACPI-based device enumeration for some platforms, rework the lid blacklist handling in the button driver and add more lid quirks to it, unify ACPI _HID/_UID matching, fix assorted issues and clean up the code and documentation. Specifics: - Update the ACPICA code in the kernel to upstream revision 20191018 including: * Fixes for Clang warnings (Bob Moore) * Fix for possible overflow in get_tick_count() (Bob Moore) * Introduction of acpi_unload_table() (Bob Moore) * Debugger and utilities updates (Erik Schmauss) * Fix for unloading tables loaded via configfs (Nikolaus Voss) - Add support for EFI specific purpose memory to optionally allow either application-exclusive or core-kernel-mm managed access to differentiated memory (Dan Williams) - Fix and clean up processing of the HMAT table (Brice Goglin, Qian Cai, Tao Xu) - Update the ACPI EC driver to make it work on systems with hardware-reduced ACPI (Daniel Drake) - Always build in support for the Generic Event Device (GED) to allow one kernel binary to work both on systems with full hardware ACPI and hardware-reduced ACPI (Arjan van de Ven) - Fix the table unload mechanism to unregister platform devices created when the given table was loaded (Andy Shevchenko) - Rework the lid blacklist handling in the button driver and add more lid quirks to it (Hans de Goede) - Improve ACPI-based device enumeration for some platforms based on Intel BayTrail SoCs (Hans de Goede) - Add an OpRegion driver for the Cherry Trail Crystal Cove PMIC and prevent handlers from being registered for unhandled PMIC OpRegions (Hans de Goede) - Unify ACPI _HID/_UID matching (Andy Shevchenko) - Clean up documentation and comments (Cao jin, James Pack, Kacper Piwiński)" * tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits) ACPI: OSI: Shoot duplicate word ACPI: HMAT: use %u instead of %d to print u32 values ACPI: NUMA: HMAT: fix a section mismatch ACPI: HMAT: don't mix pxm and nid when setting memory target processor_pxm ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device ACPI: NUMA: HMAT: Register HMAT at device_initcall level device-dax: Add a driver for "hmem" devices dax: Fix alloc_dax_region() compile warning lib: Uplevel the pmem "region" ida to a global allocator x86/efi: Add efi_fake_mem support for EFI_MEMORY_SP arm/efi: EFI soft reservation to memblock x86/efi: EFI soft reservation to E820 enumeration efi: Common enable/disable infrastructure for EFI soft reservation x86/efi: Push EFI_MEMMAP check into leaf routines efi: Enumerate EFI_MEMORY_SP ACPI: NUMA: Establish a new drivers/acpi/numa/ directory ACPICA: Update version to 20191018 ACPICA: debugger: remove leading whitespaces when converting a string to a buffer ACPICA: acpiexec: initialize all simple types and field units from user input ACPICA: debugger: add field unit support for acpi_db_get_next_token ...
2019-11-26Merge tag 'pm-5.5-rc1' of ↵Linus Torvalds43-64/+185
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These include cpuidle changes to use nanoseconds (instead of microseconds) as the unit of time and to simplify checks for disabled idle states in the idle loop, some cpuidle fixes and governor updates, assorted cpufreq updates (driver updates mostly and a few core fixes and cleanups), devfreq updates (dominated by the tegra30 driver changes), new CPU IDs for the RAPL power capping driver, relatively minor updates of the generic power domains (genpd) and operation performance points (OPP) frameworks, and assorted fixes and cleanups. There are also two maintainer information updates: Chanwoo Choi will be maintaining the devfreq subsystem going forward and Todd Brandt is going to maintain the pm-graph utility (created by him). Specifics: - Use nanoseconds (instead of microseconds) as the unit of time in the cpuidle core and simplify checks for disabled idle states in the idle loop (Rafael Wysocki) - Fix and clean up the teo cpuidle governor (Rafael Wysocki) - Fix the cpuidle registration error code path (Zhenzhong Duan) - Avoid excessive vmexits in the ACPI cpuidle driver (Yin Fengwei) - Extend the idle injection infrastructure to be able to measure the requested duration in nanoseconds and to allow an exit latency limit for idle states to be specified (Daniel Lezcano) - Fix cpufreq driver registration and clarify a comment in the cpufreq core (Viresh Kumar) - Add NULL checks to the show() and store() methods of sysfs attributes exposed by cpufreq (Kai Shen) - Update cpufreq drivers: * Fix for a plain int as pointer warning from sparse in intel_pstate (Jamal Shareef) * Fix for a hardcoded number of CPUs and stack bloat in the powernv driver (John Hubbard) * Updates to the ti-cpufreq driver and DT files to support new platforms and migrate bindings from opp-v1 to opp-v2 (Adam Ford, H. Nikolaus Schaller) * Merging of the arm_big_little and vexpress-spc drivers and related cleanup (Sudeep Holla) * Fix for imx's default speed grade value (Anson Huang) * Minor cleanup of the s3c64xx driver (Nathan Chancellor) * CPU speed bin detection fix for sun50i (Ondrej Jirman) - Appoint Chanwoo Choi as the new devfreq maintainer. - Update the devfreq core: * Check NULL governor in available_governors_show sysfs to prevent showing wrong governor information and fix a race condition between devfreq_update_status() and trans_stat_show() (Leonard Crestez) * Add new 'interrupt-driven' flag for devfreq governors to allow interrupt-driven governors to prevent the devfreq core from polling devices for status (Dmitry Osipenko) * Improve an error message in devfreq_add_device() (Matthias Kaehlcke) - Update devfreq drivers: * tegra30 driver fixes and cleanups (Dmitry Osipenko) * Removal of unused property from dt-binding documentation for the exynos-bus driver (Kamil Konieczny) * exynos-ppmu cleanup and DT bindings update (Lukasz Luba, Marek Szyprowski) - Add new CPU IDs for CometLake Mobile and Desktop to the Intel RAPL power capping driver (Zhang Rui) - Allow device initialization in the generic power domains (genpd) framework to be more straightforward and clean it up (Ulf Hansson) - Add support for adjusting OPP voltages at run time to the OPP framework (Stephen Boyd) - Avoid freeing memory that has never been allocated in the hibernation core (Andy Whitcroft) - Clean up function headers in a header file and coding style in the wakeup IRQs handling code (Ulf Hansson, Xiaofei Tan) - Clean up the SmartReflex adaptive voltage scaling (AVS) driver for ARM (Ben Dooks, Geert Uytterhoeven) - Wrap power management documentation to fit in 80 columns (Bjorn Helgaas) - Add pm-graph utility entry to MAINTAINERS (Todd Brandt) - Update the cpupower utility: * Fix the handling of set and info subcommands (Abhishek Goel) * Fix build warnings (Nathan Chancellor) * Improve mperf_monitor handling (Janakarajan Natarajan)" * tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits) PM: Wrap documentation to fit in 80 columns cpuidle: Pass exit latency limit to cpuidle_use_deepest_state() cpuidle: Allow idle injection to apply exit latency limit cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks cpuidle: teo: Avoid code duplication in conditionals cpufreq: Register drivers only after CPU devices have been registered cpuidle: teo: Avoid using "early hits" incorrectly cpuidle: teo: Exclude cpuidle overhead from computations PM / Domains: Convert to dev_to_genpd_safe() in genpd_syscore_switch() mmc: tmio: Avoid boilerplate code in ->runtime_suspend() PM / Domains: Implement the ->start() callback for genpd PM / Domains: Introduce dev_pm_domain_start() ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition PM / wakeirq: remove unnecessary parentheses power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call cpuidle: Use nanoseconds as the unit of time PM / OPP: Support adjusting OPP voltages at runtime PM / core: Clean up some function headers in power.h cpufreq: Add NULL checks to show() and store() methods of cpufreq cpufreq: intel_pstate: Fix plain int as pointer warning from sparse ...
2019-11-26Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 merge fix from Ingo Molnar: "I missed one other semantic conflict that can result in build failures on certain stripped down x86 32-bit configs, for example 32-bit 'allnoconfig' where CONFIG_X86_IOPL_IOPERM gets turned off" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/iopl: Make 'struct tss_struct' constant size again
2019-11-26Merge branch 'locking-core-for-linus' of ↵Linus Torvalds8-206/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - A comprehensive rewrite of the robust/PI futex code's exit handling to fix various exit races. (Thomas Gleixner et al) - Rework the generic REFCOUNT_FULL implementation using atomic_fetch_* operations so that the performance impact of the cmpxchg() loops is mitigated for common refcount operations. With these performance improvements the generic implementation of refcount_t should be good enough for everybody - and this got confirmed by performance testing, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely, leaving the generic implementation enabled unconditionally. (Will Deacon) - Other misc changes, fixes, cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) lkdtm: Remove references to CONFIG_REFCOUNT_FULL locking/refcount: Remove unused 'refcount_error_report()' function locking/refcount: Consolidate implementations of refcount_t locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions locking/refcount: Move saturation warnings out of line locking/refcount: Improve performance of generic REFCOUNT_FULL code locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header locking/refcount: Remove unused refcount_*_checked() variants locking/refcount: Ensure integer operands are treated as signed locking/refcount: Define constants for saturation and max refcount values futex: Prevent exit livelock futex: Provide distinct return value when owner is exiting futex: Add mutex around futex exit futex: Provide state handling for exec() as well futex: Sanitize exit state handling futex: Mark the begin of futex exit explicitly futex: Set task::futex_state to DEAD right after handling futex exit futex: Split futex_mm_release() for exit/exec exit/exec: Seperate mm_release() futex: Replace PF_EXITPIDONE with a state ...
2019-11-26Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes in this cycle were: - Dynamic tick (nohz) updates, perhaps most notably changes to force the tick on when needed due to lengthy in-kernel execution on CPUs on which RCU is waiting. - Linux-kernel memory consistency model updates. - Replace rcu_swap_protected() with rcu_prepace_pointer(). - Torture-test updates. - Documentation updates. - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer() net/sched: Replace rcu_swap_protected() with rcu_replace_pointer() net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer() net/core: Replace rcu_swap_protected() with rcu_replace_pointer() bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer() fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer() drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer() drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer() x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer() rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer() rcu: Suppress levelspread uninitialized messages rcu: Fix uninitialized variable in nocb_gp_wait() rcu: Update descriptions for rcu_future_grace_period tracepoint rcu: Update descriptions for rcu_nocb_wake tracepoint rcu: Remove obsolete descriptions for rcu_barrier tracepoint rcu: Ensure that ->rcu_urgent_qs is set before resched IPI workqueue: Convert for_each_wq to use built-in list check rcu: Several rcu_segcblist functions can be static rcu: Remove unused function hlist_bl_del_init_rcu() Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch() ...
2019-11-26Merge branch 'sched-core-for-linus' of ↵Linus Torvalds4-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest changes in this cycle were: - Make kcpustat vtime aware (Frederic Weisbecker) - Rework the CFS load_balance() logic (Vincent Guittot) - Misc cleanups, smaller enhancements, fixes. The load-balancing rework is the most intrusive change: it replaces the old heuristics that have become less meaningful after the introduction of the PELT metrics, with a grounds-up load-balancing algorithm. As such it's not really an iterative series, but replaces the old load-balancing logic with the new one. We hope there are no performance regressions left - but statistically it's highly probable that there *is* going to be some workload that is hurting from these chnages. If so then we'd prefer to have a look at that workload and fix its scheduling, instead of reverting the changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) rackmeter: Use vtime aware kcpustat accessor leds: Use all-in-one vtime aware kcpustat accessor cpufreq: Use vtime aware kcpustat accessors for user time procfs: Use all-in-one vtime aware kcpustat accessor sched/vtime: Bring up complete kcpustat accessor sched/cputime: Support other fields on kcpustat_field() sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util() sched/fair: Add comments for group_type and balancing at SD_NUMA level sched/fair: Fix rework of find_idlest_group() sched/uclamp: Fix overzealous type replacement sched/Kconfig: Fix spelling mistake in user-visible help text sched/core: Further clarify sched_class::set_next_task() sched/fair: Use mul_u32_u32() sched/core: Simplify sched_class::pick_next_task() sched/core: Optimize pick_next_task() sched/core: Make pick_next_task_idle() more consistent sched/fair: Better document newidle_balance() leds: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM cpufreq: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM procfs: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM ...
2019-11-26Merge branch 'perf-core-for-linus' of ↵Linus Torvalds18-83/+329
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main kernel side changes in this cycle were: - Various Intel-PT updates and optimizations (Alexander Shishkin) - Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu) - Add support for LSM and SELinux checks to control access to the perf syscall (Joel Fernandes) - Misc other changes, optimizations, fixes and cleanups - see the shortlog for details. There were numerous tooling changes as well - 254 non-merge commits. Here are the main changes - too many to list in detail: - Enhancements to core tooling infrastructure, perf.data, libperf, libtraceevent, event parsing, vendor events, Intel PT, callchains, BPF support and instruction decoding. - There were updates to the following tools: perf annotate perf diff perf inject perf kvm perf list perf maps perf parse perf probe perf record perf report perf script perf stat perf test perf trace - And a lot of other changes: please see the shortlog and Git log for more details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits) perf parse: Fix potential memory leak when handling tracepoint errors perf probe: Fix spelling mistake "addrees" -> "address" libtraceevent: Fix memory leakage in copy_filter_type libtraceevent: Fix header installation perf intel-bts: Does not support AUX area sampling perf intel-pt: Add support for decoding AUX area samples perf intel-pt: Add support for recording AUX area samples perf pmu: When using default config, record which bits of config were changed by the user perf auxtrace: Add support for queuing AUX area samples perf session: Add facility to peek at all events perf auxtrace: Add support for dumping AUX area samples perf inject: Cut AUX area samples perf record: Add aux-sample-size config term perf record: Add support for AUX area sampling perf auxtrace: Add support for AUX area sample recording perf auxtrace: Move perf_evsel__find_pmu() perf record: Add a function to test for kernel support for AUX area sampling perf tools: Add kernel AUX area sampling definitions perf/core: Make the mlock accounting simple again perf report: Jump to symbol source view from total cycles view ...
2019-11-26Merge branch 'efi-core-for-linus' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this cycle were: - Wire up the EFI RNG code for x86. This enables an additional source of entropy during early boot. - Enable the TPM event log code on ARM platforms. - Update Ard's email address" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: libstub/tpm: enable tpm eventlog function for ARM platforms x86: efi/random: Invoke EFI_RNG_PROTOCOL to seed the UEFI RNG table efi/random: use arch-independent efi_call_proto() MAINTAINERS: update Ard's email address to @kernel.org
2019-11-26x86/ptrace: Document FSBASE and GSBASE ABI odditiesAndy Lutomirski1-0/+17
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/ptrace: Remove set_segment_reg() implementations for currentAndy Lutomirski1-12/+7
seg_segment_reg() should be unreachable with task == current. Rather than confusingly trying to make it work, just explicitly disable this case. (regset->get is used for current in the coredump code, but the ->set interface is only used for ptrace, and you can't ptrace yourself.) Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/traps: die() instead of panicking on a double faultAndy Lutomirski1-1/+1
A double fault has a decent chance of being recoverable by killing the offending thread. Use die() so that we at least try to recover. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/doublefault/32: Rewrite the x86_32 #DF handler and unify with 64-bitAndy Lutomirski4-33/+138
The old x86_32 doublefault_fn() was old and crufty, and it did not even try to recover. do_double_fault() is much nicer. Rewrite the 32-bit double fault code to sanitize CPU state and call do_double_fault(). This is mostly an exercise i386 archaeology. With this patch applied, 32-bit double faults get a real stack trace, just like 64-bit double faults. [ mingo: merged the patch to a later kernel base. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/doublefault/32: Move #DF stack and TSS to cpu_entry_areaAndy Lutomirski8-34/+113
There are three problems with the current layout of the doublefault stack and TSS. First, the TSS is only cacheline-aligned, which is not enough -- if the hardware portion of the TSS (struct x86_hw_tss) crosses a page boundary, horrible things happen [0]. Second, the stack and TSS are global, so simultaneous double faults on different CPUs will cause massive corruption. Third, the whole mechanism won't work if user CR3 is loaded, resulting in a triple fault [1]. Let the doublefault stack and TSS share a page (which prevents the TSS from spanning a page boundary), make it percpu, and move it into cpu_entry_area. Teach the stack dump code about the doublefault stack. [0] Real hardware will read past the end of the page onto the next *physical* page if a task switch happens. Virtual machines may have any number of bugs, and I would consider it reasonable for a VM to summarily kill the guest if it tries to task-switch to a page-spanning TSS. [1] Real hardware triple faults. At least some VMs seem to hang. I'm not sure what's going on. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/doublefault/32: Rename doublefault.c to doublefault_32.cAndy Lutomirski2-5/+3
doublefault.c now only contains 32-bit code. Rename it to doublefault_32.c. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/traps: Disentangle the 32-bit and 64-bit doublefault codeAndy Lutomirski4-22/+4
The 64-bit doublefault handler is much nicer than the 32-bit one. As a first step toward unifying them, make the 64-bit handler self-contained. This should have no effect no functional effect except in the odd case of x86_64 with CONFIG_DOUBLEFAULT=n in which case it will change the logging a bit. This also gets rid of CONFIG_DOUBLEFAULT configurability on 64-bit kernels. It didn't do anything useful -- CONFIG_DOUBLEFAULT=n didn't actually disable doublefault handling on x86_64. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/mm/32: Sync only to VMALLOC_END in vmalloc_sync_all()Joerg Roedel1-1/+1
The job of vmalloc_sync_all() is to help the lazy freeing of vmalloc() ranges: before such vmap ranges are reused we make sure that they are unmapped from every task's page tables. This is really easy on pagetable setups where the kernel page tables are shared between all tasks - this is the case on 32-bit kernels with SHARED_KERNEL_PMD = 1. But on !SHARED_KERNEL_PMD 32-bit kernels this involves iterating over the pgd_list and clearing all pmd entries in the pgds that are cleared in the init_mm.pgd, which is the reference pagetable that the vmalloc() code uses. In that context the current practice of vmalloc_sync_all() iterating until FIX_ADDR_TOP is buggy: for (address = VMALLOC_START & PMD_MASK; address >= TASK_SIZE_MAX && address < FIXADDR_TOP; address += PMD_SIZE) { struct page *page; Because iterating up to FIXADDR_TOP will involve a lot of non-vmalloc address ranges: VMALLOC -> PKMAP -> LDT -> CPU_ENTRY_AREA -> FIX_ADDR This is mostly harmless for the FIX_ADDR and CPU_ENTRY_AREA ranges that don't clear their pmds, but it's lethal for the LDT range, which relies on having different mappings in different processes, and 'synchronizing' them in the vmalloc sense corrupts those pagetable entries (clearing them). This got particularly prominent with PTI, which turns SHARED_KERNEL_PMD off and makes this the dominant mapping mode on 32-bit. To make LDT working again vmalloc_sync_all() must only iterate over the volatile parts of the kernel address range that are identical between all processes. So the correct check in vmalloc_sync_all() is "address < VMALLOC_END" to make sure the VMALLOC areas are synchronized and the LDT mapping is not falsely overwritten. The CPU_ENTRY_AREA and the FIXMAP area are no longer synced either, but this is not really a proplem since their PMDs get established during bootup and never change. This change fixes the ldt_gdt selftest in my setup. [ mingo: Fixed up the changelog to explain the logic and modified the copying to only happen up until VMALLOC_END. ] Reported-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Fixes: 7757d607c6b3: ("x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32") Link: https://lkml.kernel.org/r/20191126111119.GA110513@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26x86/iopl: Make 'struct tss_struct' constant size againIngo Molnar1-2/+0
After the following commit: 05b042a19443: ("x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise") 'struct cpu_entry_area' has to be Kconfig invariant, so that we always have a matching CPU_ENTRY_AREA_PAGES size. This commit added a CONFIG_X86_IOPL_IOPERM dependency to tss_struct: 111e7b15cf10: ("x86/ioperm: Extend IOPL config to control ioperm() as well") Which, if CONFIG_X86_IOPL_IOPERM is turned off, reduces the size of cpu_entry_area by two pages, triggering the assert: ./include/linux/compiler.h:391:38: error: call to ‘__compiletime_assert_202’ declared with attribute error: BUILD_BUG_ON failed: (CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE Simplify the Kconfig dependencies and make cpu_entry_area constant size on 32-bit kernels again. Fixes: 05b042a19443: ("x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise") Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26libfdt: define INT32_MAX and UINT32_MAX in libfdt_env.hMasahiro Yamada2-1/+5
The DTC v1.5.1 added references to (U)INT32_MAX. This is no problem for user-space programs since <stdint.h> defines (U)INT32_MAX along with (u)int32_t. For the kernel space, libfdt_env.h needs to be adjusted before we pull in the changes. In the kernel, we usually use s/u32 instead of (u)int32_t for the fixed-width types. Accordingly, we already have S/U32_MAX for their max values. So, we should not add (U)INT32_MAX to <linux/limits.h> any more. Instead, add them to the in-kernel libfdt_env.h to compile the latest libfdt. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Rob Herring <robh@kernel.org>
2019-11-26xtensa: fix syscall_set_return_valueMax Filippov1-1/+1
syscall return value is in the register a2, not a0. Cc: stable@vger.kernel.org # v5.0+ Fixes: 9f24f3c1067c ("xtensa: implement tracehook functions and enable HAVE_ARCH_TRACEHOOK") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: drop unneeded headers from coprocessor.SMax Filippov1-9/+1
A bunch of irrelevant headers is included into coprocessor.S. Remove them and add necessary asm/regs.h. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: entry: Remove unneeded need_resched() loopValentin Schneider1-1/+1
Since the enabling and disabling of IRQs within preempt_schedule_irq() is contained in a need_resched() loop, we don't need the outer arch code loop. Acked-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Message-Id: <20190923143620.29334-10-valentin.schneider@arm.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: use MEMBLOCK_ALLOC_ANYWHERE for KASAN shadow mapMax Filippov1-1/+3
KASAN shadow map doesn't need to be accessible through the linear kernel mapping, allocate its pages with MEMBLOCK_ALLOC_ANYWHERE so that high memory can be used. This frees up to ~100MB of low memory on xtensa configurations with KASAN and high memory. Cc: stable@vger.kernel.org # v5.1+ Fixes: f240ec09bb8a ("memblock: replace memblock_alloc_base(ANYWHERE) with memblock_phys_alloc") Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: fix TLB sanity checkerMax Filippov1-2/+2
Virtual and translated addresses retrieved by the xtensa TLB sanity checker must be consistent, i.e. correspond to the same state of the checked TLB entry. KASAN shadow memory is mapped dynamically using auto-refill TLB entries and thus may change TLB state between the virtual and translated address retrieval, resulting in false TLB insanity report. Move read_xtlb_translation close to read_xtlb_virtual to make sure that read values are consistent. Cc: stable@vger.kernel.org Fixes: a99e07ee5e88 ("xtensa: check TLB sanity on return to userspace") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: get rid of __ARCH_USE_5LEVEL_HACKMike Rapoport6-10/+24
xtensa has 2-level page tables and already uses pgtable-nopmd for page table folding. Add walks of p4d level where appropriate and drop usage of __ARCH_USE_5LEVEL_HACK. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Message-Id: <1572964400-16542-3-git-send-email-rppt@kernel.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> [fix up arch/xtensa/include/asm/fixmap.h and arch/xtensa/mm/tlb.c]
2019-11-26xtensa: mm: fix PMD folding implementationMike Rapoport5-9/+19
There was a definition of pmd_offset() in arch/xtensa/include/asm/pgtable.h that shadowed the generic implementation defined in include/asm-generic/pgtable-nopmd.h. As the result, xtensa had shortcuts in page table traversal in several places instead of doing level unfolding. Remove local override for pmd_offset() and add page table unfolding where necessary. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Message-Id: <1572964400-16542-2-git-send-email-rppt@kernel.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: make stack dump size configurableMax Filippov2-1/+8
Introduce Kconfig symbol PRINT_STACK_DEPTH and use it to initialize kstack_depth_to_print. Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: improve stack dumpingMax Filippov1-16/+11
Calculate printable stack size and use print_hex_dump instead of opencoding it. Drop extra newline output in show_trace as its output format does not depend on CONFIG_KALLSYMS. Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: use "m" constraint instead of "r" in futex.h assemblyMax Filippov1-5/+5
Use "m" constraint instead of "r" for the address, as "m" allows compiler to access adjacent locations using base + offset, while "r" requires updating the base register every time. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-11-26xtensa: use "m" constraint instead of "a" in cmpxchg.h assemblyMax Filippov1-15/+16
Use "m" constraint instead of "r" for the address, as "m" allows compiler to access adjacent locations using base + offset, while "r" requires updating the base register every time. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>