Age | Commit message (Collapse) | Author | Files | Lines |
|
For at least the mvc and clc instructions llvm's integrated assembler can
generate incorrect code. In particular this happens with decompressor boot
code. The reason seems to be that relocations for the second displacement
of each instruction are at incorrect locations (-/+: gas vs llvm IAS):
mvc __LC_IO_NEW_PSW(16),.Lnewpsw
results in
4: d2 0f 01 f0 00 00 mvc 496(16,%r0),0
- 8: R_390_12 .head.text+0x10
+ 6: R_390_12 .head.text+0x10
and
clc 0(3,%r4),.L_hdr
results in
258: d5 02 40 00 00 00 clc 0(3,%r4),0
- 25c: R_390_12 .head.text+0x324
+ 25a: R_390_12 .head.text+0x324
Workaround this by writing the code in a different way.
Tested-by: Nathan Chancellor <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Link: https://github.com/llvm/llvm-project/issues/55411
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
llvm's integrated assembler cannot handle immediate values which are
calculated with two local labels:
arch/s390/purgatory/head.S:139:11: error: invalid operand for instruction
aghi %r8,-(.base_crash-purgatory_start)
Workaround this by partially rewriting the code.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
llvm's integrated assembler cannot handle immediate values which are
calculated with two local labels:
<instantiation>:3:13: error: invalid operand for instruction
clgfi %r14,.Lsie_done - .Lsie_gmap
Workaround this by adding clang specific code which reads the specific
value from memory. Since this code is within the hot paths of the kernel
and adds an additional memory reference, keep the original code, and add
ifdef'ed code.
Acked-by: Alexander Gordeev <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
clang fails to handle ".if" statements in inline assembly which are heavily
used in the alternatives code.
To work around this remove this code, and enforce that users of
alternatives must specify original and alternative instruction sequences
which have identical sizes. Add a compile time check with two ".org"
statements similar to arm64.
In result not only clang can handle this, but also quite a lot of code can
be removed.
Acked-by: Vasily Gorbik <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/1356
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Explicitly provide identical sized original/alternative instruction
sequences. This way there is no need for the s390 specific alternatives
infrastructure to generate padding sequences.
The code which generates such sequences will be removed with a follow on
patch.
Acked-by: Vasily Gorbik <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Until now, only the temperature sensors where exported thru
the thermal subsystem. Export the fans as "dell-smm-fan[1-3]" too
to make them available as cooling devices.
Also update Documentation and fix a minor issue with the alphabetic
ordering of the includes.
Signed-off-by: Armin Wolf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Basing on information and testing provided by users [1] add support for
another board, ASUS ProArt X570 Creator WiFi.
[1] https://github.com/zeule/asus-ec-sensors/issues/17
Signed-off-by: Eugene Shalygin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Instead of open-coding the bad characters replacement in the hwmon name,
use the new devm_hwmon_sanitize_name().
Signed-off-by: Michael Walle <[email protected]>
Acked-by: Xu Yilun <[email protected]>
Reviewed-by: Tom Rix <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
More and more drivers will check for bad characters in the hwmon name
and all are using the same code snippet. Consolidate that code by adding
a new hwmon_sanitize_name() function.
Signed-off-by: Michael Walle <[email protected]>
Reviewed-by: Tom Rix <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Extend aquacomputer_d5next driver to expose hardware temperature sensors
and fans of the Aquacomputer Octo fan controller, which communicates
through a proprietary USB HID protocol.
Four temperature sensors and eight PWM controllable fans are available.
Additionally, serial number, firmware version and power-on count are
exposed through debugfs.
This driver has been tested on x86_64.
Signed-off-by: Aleksa Savic <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[groeck: Add missing "select CRC16"]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Use devm_delayed_work_autocancel() instead of hand writing it. This is
less verbose and saves a few lines of code.
devm_delayed_work_autocancel() uses devm_add_action() instead of
devm_add_action_or_reset(). This is fine, because if the underlying memory
allocation fails, no work has been scheduled yet. So there is nothing to
undo.
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Iwona Winiarska <[email protected]>
Link: https://lore.kernel.org/r/fd277a708ede3882d7df6831f02d2e3c0cb813b8.1644781718.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Guenter Roeck <[email protected]>
|
|
WS X570-ACE has a T_Sensor header on board according to manual[1].
I'm using a 10kΩ B=3435K thermsistor attached to the header of WS X570-ACE.
EC byte at 0x3d matches readings from BIOS sensor page and environment temperature.
[1]https://www.asus.com/Motherboards-Components/Motherboards/All-series/Pro-WS-X570-ACE/HelpDesk_Manual/
Signed-off-by: Wei Shuyu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Add a thermal zone interface to the devices added
under jc42 driver. This way, thermal zones described
in device tree can make use of the of nodes of these
devices.
Cc: Guenter Roeck <[email protected]> (maintainer:JC42.4 TEMPERATURE SENSOR DRIVER)
Cc: Jean Delvare <[email protected]> (maintainer:HARDWARE MONITORING)
Cc: [email protected] (open list:JC42.4 TEMPERATURE SENSOR DRIVER)
Cc: [email protected] (open list)
Signed-off-by: Eduardo Valentin <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
spin_lock_irq/spin_unlock_irq contains preempt_disable/enable().
Which can serve as RCU read-side critical region, so remove
rcu_read_lock/unlock().
Signed-off-by: Fanjun Kong <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
With the removal of seq_get_buf in blkcg_print_one_stat, we
cannot make adding the newline conditional on there being
relevant stats because the name was already written out
unconditionally.
Otherwise we may end up with multiple device names in one
line which is confusing and doesn't follow the nested-keyed
file format.
Signed-off-by: Wolfgang Bumiller <[email protected]>
Fixes: 252c651a4c85 ("blk-cgroup: stop using seq_get_buf")
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The typedefs u32 and u64 are not available in userspace. Thus user get
an error he try to use DMA_BUF_SET_NAME_A or DMA_BUF_SET_NAME_B:
$ gcc -Wall -c -MMD -c -o ioctls_list.o ioctls_list.c
In file included from /usr/include/x86_64-linux-gnu/asm/ioctl.h:1,
from /usr/include/linux/ioctl.h:5,
from /usr/include/asm-generic/ioctls.h:5,
from ioctls_list.c:11:
ioctls_list.c:463:29: error: ‘u32’ undeclared here (not in a function)
463 | { "DMA_BUF_SET_NAME_A", DMA_BUF_SET_NAME_A, -1, -1 }, // linux/dma-buf.h
| ^~~~~~~~~~~~~~~~~~
ioctls_list.c:464:29: error: ‘u64’ undeclared here (not in a function)
464 | { "DMA_BUF_SET_NAME_B", DMA_BUF_SET_NAME_B, -1, -1 }, // linux/dma-buf.h
| ^~~~~~~~~~~~~~~~~~
The issue was initially reported here[1].
[1]: https://github.com/jerome-pouiller/ioctl/pull/14
Signed-off-by: Jérôme Pouiller <[email protected]>
Reviewed-by: Christian König <[email protected]>
Fixes: a5bff92eaac4 ("dma-buf: Fix SET_NAME ioctl uapi")
CC: [email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Christian König <[email protected]>
|
|
of_find_node_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
Fixes: 0fbeae70ee7c ("regulator: add SCMI driver")
Signed-off-by: Miaoqian Lin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
Variable info is being assigned the same value twice, remove the
redundant assignment. Also assign variable v in the declaration.
Cleans up clang scan warning:
warning: Value stored to 'info' during its initialization is never read [deadcode.DeadStores]
No code changed:
# arch/x86/kernel/sev.o:
text data bss dec hex filename
19878 4487 4112 28477 6f3d sev.o.before
19878 4487 4112 28477 6f3d sev.o.after
md5:
bfbaa515af818615fd01fea91e7eba1b sev.o.before.asm
bfbaa515af818615fd01fea91e7eba1b sev.o.after.asm
[ bp: Running the before/after check on sev.c because sev-shared.c
gets included into it. ]
Fixes: 597cfe48212a ("x86/boot/compressed/64: Setup a GHCB-based VC Exception handler")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
When call0 userspace ABI support by probing is enabled instructions that
cause illegal instruction exception when PS.WOE is clear are retried
with PS.WOE set before calling c-level exception handler. Record user pc
at which PS.WOE was set in the fast exception handler and clear PS.WOE
in the c-level exception handler if we get there from the same address.
Signed-off-by: Max Filippov <[email protected]>
|
|
On xtensa cores wihout hardware division option division support
functions from libgcc react to division by 0 attempt by executing
illegal instruction followed by the characters 'DIV0'. Recognize this
pattern in illegal instruction exception handler and convert it to
division by 0.
Signed-off-by: Max Filippov <[email protected]>
|
|
In vmxnet3_rq_create(), when dma_alloc_coherent() fails,
vmxnet3_rq_destroy() is called. It sets rq->rx_ring[i].base to NULL. Then
vmxnet3_rq_create() returns an error to its callers mxnet3_rq_create_all()
-> vmxnet3_change_mtu(). Then vmxnet3_change_mtu() calls
vmxnet3_force_close() -> dev_close() in error handling code. And the driver
calls vmxnet3_close() -> vmxnet3_quiesce_dev() -> vmxnet3_rq_cleanup_all()
-> vmxnet3_rq_cleanup(). In vmxnet3_rq_cleanup(),
rq->rx_ring[ring_idx].base is accessed, but this variable is NULL, causing
a NULL pointer dereference.
To fix this possible bug, an if statement is added to check whether
rq->rx_ring[0].base is NULL in vmxnet3_rq_cleanup() and exit early if so.
The error log in our fault-injection testing is shown as follows:
[ 65.220135] BUG: kernel NULL pointer dereference, address: 0000000000000008
...
[ 65.222633] RIP: 0010:vmxnet3_rq_cleanup_all+0x396/0x4e0 [vmxnet3]
...
[ 65.227977] Call Trace:
...
[ 65.228262] vmxnet3_quiesce_dev+0x80f/0x8a0 [vmxnet3]
[ 65.228580] vmxnet3_close+0x2c4/0x3f0 [vmxnet3]
[ 65.228866] __dev_close_many+0x288/0x350
[ 65.229607] dev_close_many+0xa4/0x480
[ 65.231124] dev_close+0x138/0x230
[ 65.231933] vmxnet3_force_close+0x1f0/0x240 [vmxnet3]
[ 65.232248] vmxnet3_change_mtu+0x75d/0x920 [vmxnet3]
...
Fixes: d1a890fa37f27 ("net: VMware virtual Ethernet NIC driver: vmxnet3")
Reported-by: TOTE Robot <[email protected]>
Signed-off-by: Zixuan Fu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
In vmxnet3_rq_alloc_rx_buf(), when dma_map_single() fails, rbi->skb is
freed immediately. Similarly, in another branch, when dma_map_page() fails,
rbi->page is also freed. In the two cases, vmxnet3_rq_alloc_rx_buf()
returns an error to its callers vmxnet3_rq_init() -> vmxnet3_rq_init_all()
-> vmxnet3_activate_dev(). Then vmxnet3_activate_dev() calls
vmxnet3_rq_cleanup_all() in error handling code, and rbi->skb or rbi->page
are freed again in vmxnet3_rq_cleanup_all(), causing use-after-free bugs.
To fix these possible bugs, rbi->skb and rbi->page should be cleared after
they are freed.
The error log in our fault-injection testing is shown as follows:
[ 14.319016] BUG: KASAN: use-after-free in consume_skb+0x2f/0x150
...
[ 14.321586] Call Trace:
...
[ 14.325357] consume_skb+0x2f/0x150
[ 14.325671] vmxnet3_rq_cleanup_all+0x33a/0x4e0 [vmxnet3]
[ 14.326150] vmxnet3_activate_dev+0xb9d/0x2ca0 [vmxnet3]
[ 14.326616] vmxnet3_open+0x387/0x470 [vmxnet3]
...
[ 14.361675] Allocated by task 351:
...
[ 14.362688] __netdev_alloc_skb+0x1b3/0x6f0
[ 14.362960] vmxnet3_rq_alloc_rx_buf+0x1b0/0x8d0 [vmxnet3]
[ 14.363317] vmxnet3_activate_dev+0x3e3/0x2ca0 [vmxnet3]
[ 14.363661] vmxnet3_open+0x387/0x470 [vmxnet3]
...
[ 14.367309]
[ 14.367412] Freed by task 351:
...
[ 14.368932] __dev_kfree_skb_any+0xd2/0xe0
[ 14.369193] vmxnet3_rq_alloc_rx_buf+0x71e/0x8d0 [vmxnet3]
[ 14.369544] vmxnet3_activate_dev+0x3e3/0x2ca0 [vmxnet3]
[ 14.369883] vmxnet3_open+0x387/0x470 [vmxnet3]
[ 14.370174] __dev_open+0x28a/0x420
[ 14.370399] __dev_change_flags+0x192/0x590
[ 14.370667] dev_change_flags+0x7a/0x180
[ 14.370919] do_setlink+0xb28/0x3570
[ 14.371150] rtnl_newlink+0x1160/0x1740
[ 14.371399] rtnetlink_rcv_msg+0x5bf/0xa50
[ 14.371661] netlink_rcv_skb+0x1cd/0x3e0
[ 14.371913] netlink_unicast+0x5dc/0x840
[ 14.372169] netlink_sendmsg+0x856/0xc40
[ 14.372420] ____sys_sendmsg+0x8a7/0x8d0
[ 14.372673] __sys_sendmsg+0x1c2/0x270
[ 14.372914] do_syscall_64+0x41/0x90
[ 14.373145] entry_SYSCALL_64_after_hwframe+0x44/0xae
...
Fixes: 5738a09d58d5a ("vmxnet3: fix checks for dma mapping errors")
Reported-by: TOTE Robot <[email protected]>
Signed-off-by: Zixuan Fu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
* irq/gic-v3-nmi-fixes-5.19:
: .
: GICv3 pseudo-NMI fixes from Mark Rutland:
:
: "These patches fix a couple of issues with the way GICv3 pseudo-NMIs are
: handled:
:
: * The first patch adds a barrier we missed from NMI handling due to an
: oversight.
:
: * The second patch refactors some logic around reads from ICC_IAR1_EL1
: and adds commentary to explain what's going on.
:
: * The third patch descends into madness, reworking gic_handle_irq() to
: consistently manage ICC_PMR_EL1 + DAIF and avoid cases where these can
: be left in an inconsistent state while softirqs are processed."
: .
irqchip/gic-v3: Fix priority mask handling
irqchip/gic-v3: Refactor ISB + EOIR at ack time
irqchip/gic-v3: Ensure pseudo-NMIs have an ISB between ack and handling
Signed-off-by: Marc Zyngier <[email protected]>
|
|
* irq/misc-5.19:
: .
: Misc fixes and minor improvements:
:
: - GIC: Improve warning when the firmware tables are inconsistent
:
: - csky: Use true/false as boolean litterals
:
: - imx-irqsteer: Add runtime PM support
:
: - armada-370-xp: Enable CPU affinity for MSIs, avoid messing with
: PMU interrupts on some variants
:
: - aspeed: Fix handling of irq_of_parse_and_map() errors
:
: - sun6i: Fix sparse warnings
:
: - xtensa-mx: Fix initial IRQ affinity in non-SMP setup
:
: - exiu: Fix acknowledgment of edge-triggered interrupts
:
: - sunxi: Generalise configuration for further reuse
: .
irqchip: Add Kconfig symbols for sunxi drivers
irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x
irqchip/gic: Improved warning about incorrect type
irqchip/csky: Return true/false (not 1/0) from bool functions
irqchip/imx-irqsteer: Add runtime PM support
irqchip/imx-irqsteer: Constify irq_chip struct
irqchip/armada-370-xp: Enable MSI affinity configuration
irqchip/aspeed-scu-ic: Fix irq_of_parse_and_map() return value
irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value
irqchip/sun6i-r: Use NULL for chip_data
irqchip/xtensa-mx: Fix initial IRQ affinity in non-SMP setup
irqchip/exiu: Fix acknowledgment of edge triggered interrupts
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Not all of these drivers are needed on every ARCH_SUNXI platform. In
particular, the ARCH_SUNXI symbol will be reused for the Allwinner D1,
a RISC-V SoC which contains none of these irqchips.
Introduce Kconfig symbols so we can select only the drivers actually
used by a particular set of platforms. This also lets us move the
irqchip driver dependencies to a more appropriate location.
Signed-off-by: Samuel Holland <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The global blackhole_netdev has replaced pernet loopback_dev to become the
one given to the object that holds an netdev when ifdown in many places of
ipv4 and ipv6 since commit 8d7017fd621d ("blackhole_netdev: use
blackhole_netdev to invalidate dst entries").
Especially after commit faab39f63c1f ("net: allow out-of-order netdev
unregistration"), it's no longer safe to use loopback_dev that may be
freed before other netdev.
This patch is to set dst dev to blackhole_netdev instead of loopback_dev
in ifdown.
v1->v2:
- add Fixes tag as Eric suggested.
Fixes: faab39f63c1f ("net: allow out-of-order netdev unregistration")
Signed-off-by: Xin Long <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/e8c87482998ca6fcdab214f5a9d582899ec0c648.1652665047.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <[email protected]>
|
|
if devm_clk_get_optional() fails, we still need to go through the error
handling path.
Add the missing goto.
Fixes: 6328a126896ea ("net: systemport: Manage Wake-on-LAN clock")
Signed-off-by: Christophe JAILLET <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/99d70634a81c229885ae9e4ee69b2035749f7edc.1652634040.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The temporary mappings of the low-level kexec and hibernate helpers are
created with both writable and executable attributes, which is not
necessary here, and generally best avoided. So use read-only, executable
attributes instead.
Signed-off-by: Ard Biesheuvel <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
|
|
There are a few code sections that are emitted into the kernel's
executable .text segment simply because they contain code, but are
actually never executed via this mapping, so they can happily live in a
region that gets mapped without executable permissions, reducing the
risk of being gadgetized.
Note that the kexec and hibernate region contents are always copied into
a fresh page, and so there is no need to align them as long as the
overall size of each is below 4 KiB.
Signed-off-by: Ard Biesheuvel <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
|
|
The following two scenarios were failing for lan966x.
1. If the port had the address X and then trying to assign the same
address, then the HW was just removing this address because first it
tries to learn new address and then delete the old one. As they are
the same the HW remove it.
2. If the port eth0 was assigned the same address as one of the other
ports eth1 then when assigning back the address to eth0 then the HW
was deleting the address of eth1.
The case 1. is fixed by checking if the port has already the same
address while case 2. is fixed by checking if the address is used by any
other port.
Fixes: e18aba8941b40b ("net: lan966x: add mactable support")
Signed-off-by: Horatiu Vultur <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
This reverts commit 1738890a3165ccd0da98ebd3e2d5f9b230d5afa8.
Commit 1738890a3165 ("clk: sunxi-ng: sun6i-rtc: Add support for H6")
breaks HDMI output on Tanix TX6 mini board. Exact reason isn't known,
but because that commit doesn't actually improve anything, let's just
revert it.
Cc: [email protected]
Signed-off-by: Jernej Skrabec <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
|
|
register_nmi_handler() has no sanity check whether a handler has been
registered already. Such an unintended double-add leads to list corruption
and hard to diagnose problems during the next NMI handling.
Init the list head in the static NMI action struct and check it for being
empty in register_nmi_handler().
[ bp: Fixups. ]
Reported-by: Sean Christopherson <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
|
|
The commit 09e3b18ca5de ("clk: bcm2835: Remove unused variable")
accidentially breaks the behavior of bcm2835_clock_choose_div() and
booting of Raspberry Pi. The removed do_div macro call had side effects,
so we need to restore it.
Fixes: 09e3b18ca5de ("clk: bcm2835: Remove unused variable")
Signed-off-by: Stefan Wahren <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Maxime Ripard <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
|
|
Instead of having one big enum add one for each register or field.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
|
|
Use the %pg format specifier to save on stack consuption and code size.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The is_kmap_addr() and the is_vmalloc_addr() in the check_heap_object()
will not work, because the virt_addr_valid() will exclude the kmap and
vmalloc regions. So let's move the virt_addr_valid() below
the is_vmalloc_addr().
Signed-off-by: Yuanzheng Song <[email protected]>
Fixes: 4e140f59d285 ("mm/usercopy: Check kmap addresses properly")
Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns")
Cc: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
With all randstruct exceptions removed, remove all the exception
handling code. Any future warnings are likely to be shared between
this plugin and Clang randstruct, and will need to be addressed in a
more wholistic fashion.
Cc: Christoph Hellwig <[email protected]>
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
|
|
While preparing for Clang randstruct support (which duplicated many of
the warnings the randstruct GCC plugin warned about), one strange one
remained only for the randstruct GCC plugin. Eliminating this rids
the plugin of the last exception.
It seems the plugin is happy to dereference individual members of
a cross-struct cast, but it is upset about casting to a whole object
pointer. This only manifests in one place in the kernel, so just replace
the variable with individual member accesses. There is no change in
executable instruction output.
Drop the last exception from the randstruct GCC plugin.
Cc: "David S. Miller" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Al Viro <[email protected]>
Cc: [email protected]
Cc: [email protected]
Acked-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Kees Cook <[email protected]>
|
|
Clang randstruct gets upset when it sees struct addresspace (which is
randomized) being assigned to a struct page (which is not randomized):
drivers/net/ethernet/sun/niu.c:3385:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct page *'
*link = (struct page *) page->mapping;
^
It looks like niu.c is looking for an in-line place to chain its allocated
pages together and is overloading the "mapping" member, as it is unused.
This is very non-standard, and is expected to be cleaned up in the
future[1], but there is no "correct" way to handle it today.
No meaningful machine code changes result after this change, and source
readability is improved.
Drop the randstruct exception now that there is no "confusing" cross-type
assignment.
[1] https://lore.kernel.org/lkml/[email protected]/
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Du Cheng <[email protected]>
Cc: Christophe JAILLET <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Acked-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Kees Cook <[email protected]>
|
|
The randstruct GCC plugin gets upset when it sees struct path (which is
randomized) being assigned from a "void *" (which it cannot type-check).
There's no need for these casts, as the entire internal payload use is
following a normal struct layout. Convert the enum-based void * offset
dereferencing to the new big_key_payload struct. No meaningful machine
code changes result after this change, and source readability is improved.
Drop the randstruct exception now that there is no "confusing" cross-type
assignment.
Cc: David Howells <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jarkko Sakkinen <[email protected]>
Cc: James Morris <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
|
|
Now we use huge_ptep_get() to get the pte value of a hugetlb page,
however it will only return one specific pte value for the CONT-PTE
or CONT-PMD size hugetlb on ARM64 system, which can contain several
continuous pte or pmd entries with same page table attributes. And it
will not take into account the subpages' dirty or young bits of a
CONT-PTE/PMD size hugetlb page.
So the huge_ptep_get() is inconsistent with huge_ptep_get_and_clear(),
which already takes account the dirty or young bits for any subpages
in this CONT-PTE/PMD size hugetlb [1]. Meanwhile we can miss dirty or
young flags statistics for hugetlb pages with current huge_ptep_get(),
such as the gather_hugetlb_stats() function, and CONT-PTE/PMD hugetlb
monitoring with DAMON.
Thus define an ARM64 specific huge_ptep_get() implementation as well as
enabling __HAVE_ARCH_HUGE_PTEP_GET, that will take into account any
subpages' dirty or young bits for CONT-PTE/PMD size hugetlb page, for
those functions that want to check the dirty and young flags of a hugetlb
page.
[1] https://lore.kernel.org/linux-mm/[email protected]/
Suggested-by: Muchun Song <[email protected]>
Signed-off-by: Baolin Wang <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/624109a80ac4bbdf1e462dfa0b49e9f7c31a7c0d.1652496622.git.baolin.wang@linux.alibaba.com
Signed-off-by: Catalin Marinas <[email protected]>
|
|
The original huge_ptep_get() on ARM64 is just a wrapper of ptep_get(),
which will not take into account any contig-PTEs dirty and access bits.
Meanwhile we will implement a new ARM64-specific huge_ptep_get()
interface in following patch, which will take into account any contig-PTEs
dirty and access bits. To keep the same efficient logic to get the pte
value, change to use ptep_get() as a preparation.
Signed-off-by: Baolin Wang <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/5113ed6e103f995e1d0f0c9fda0373b761bbcad2.1652496622.git.baolin.wang@linux.alibaba.com
Signed-off-by: Catalin Marinas <[email protected]>
|
|
A PCMD (Paging Crypto MetaData) page contains the PCMD
structures of enclave pages that have been encrypted and
moved to the shmem backing store. When all enclave pages
sharing a PCMD page are loaded in the enclave, there is no
need for the PCMD page and it can be truncated from the
backing store.
A few issues appeared around the truncation of PCMD pages. The
known issues have been addressed but the PCMD handling code could
be made more robust by loudly complaining if any new issue appears
in this area.
Add a check that will complain with a warning if the PCMD page is not
actually empty after it has been truncated. There should never be data
in the PCMD page at this point since it is was just checked to be empty
and truncated with enclave mutex held and is updated with the
enclave mutex held.
Suggested-by: Dave Hansen <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Tested-by: Haitao Huang <[email protected]>
Link: https://lkml.kernel.org/r/6495120fed43fafc1496d09dd23df922b9a32709.1652389823.git.reinette.chatre@intel.com
|
|
Haitao reported encountering a WARN triggered by the ENCLS[ELDU]
instruction faulting with a #GP.
The WARN is encountered when the reclaimer evicts a range of
pages from the enclave when the same pages are faulted back right away.
Consider two enclave pages (ENCLAVE_A and ENCLAVE_B)
sharing a PCMD page (PCMD_AB). ENCLAVE_A is in the
enclave memory and ENCLAVE_B is in the backing store. PCMD_AB contains
just one entry, that of ENCLAVE_B.
Scenario proceeds where ENCLAVE_A is being evicted from the enclave
while ENCLAVE_B is faulted in.
sgx_reclaim_pages() {
...
/*
* Reclaim ENCLAVE_A
*/
mutex_lock(&encl->lock);
/*
* Get a reference to ENCLAVE_A's
* shmem page where enclave page
* encrypted data will be stored
* as well as a reference to the
* enclave page's PCMD data page,
* PCMD_AB.
* Release mutex before writing
* any data to the shmem pages.
*/
sgx_encl_get_backing(...);
encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
mutex_unlock(&encl->lock);
/*
* Fault ENCLAVE_B
*/
sgx_vma_fault() {
mutex_lock(&encl->lock);
/*
* Get reference to
* ENCLAVE_B's shmem page
* as well as PCMD_AB.
*/
sgx_encl_get_backing(...)
/*
* Load page back into
* enclave via ELDU.
*/
/*
* Release reference to
* ENCLAVE_B' shmem page and
* PCMD_AB.
*/
sgx_encl_put_backing(...);
/*
* PCMD_AB is found empty so
* it and ENCLAVE_B's shmem page
* are truncated.
*/
/* Truncate ENCLAVE_B backing page */
sgx_encl_truncate_backing_page();
/* Truncate PCMD_AB */
sgx_encl_truncate_backing_page();
mutex_unlock(&encl->lock);
...
}
mutex_lock(&encl->lock);
encl_page->desc &=
~SGX_ENCL_PAGE_BEING_RECLAIMED;
/*
* Write encrypted contents of
* ENCLAVE_A to ENCLAVE_A shmem
* page and its PCMD data to
* PCMD_AB.
*/
sgx_encl_put_backing(...)
/*
* Reference to PCMD_AB is
* dropped and it is truncated.
* ENCLAVE_A's PCMD data is lost.
*/
mutex_unlock(&encl->lock);
}
What happens next depends on whether it is ENCLAVE_A being faulted
in or ENCLAVE_B being evicted - but both end up with ENCLS[ELDU] faulting
with a #GP.
If ENCLAVE_A is faulted then at the time sgx_encl_get_backing() is called
a new PCMD page is allocated and providing the empty PCMD data for
ENCLAVE_A would cause ENCLS[ELDU] to #GP
If ENCLAVE_B is evicted first then a new PCMD_AB would be allocated by the
reclaimer but later when ENCLAVE_A is faulted the ENCLS[ELDU] instruction
would #GP during its checks of the PCMD value and the WARN would be
encountered.
Noting that the reclaimer sets SGX_ENCL_PAGE_BEING_RECLAIMED at the time
it obtains a reference to the backing store pages of an enclave page it
is in the process of reclaiming, fix the race by only truncating the PCMD
page after ensuring that no page sharing the PCMD page is in the process
of being reclaimed.
Cc: [email protected]
Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
Reported-by: Haitao Huang <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Tested-by: Haitao Huang <[email protected]>
Link: https://lkml.kernel.org/r/ed20a5db516aa813873268e125680041ae11dfcf.1652389823.git.reinette.chatre@intel.com
|
|
Haitao reported encountering a WARN triggered by the ENCLS[ELDU]
instruction faulting with a #GP.
The WARN is encountered when the reclaimer evicts a range of
pages from the enclave when the same pages are faulted back
right away.
The SGX backing storage is accessed on two paths: when there
are insufficient free pages in the EPC the reclaimer works
to move enclave pages to the backing storage and as enclaves
access pages that have been moved to the backing storage
they are retrieved from there as part of page fault handling.
An oversubscribed SGX system will often run the reclaimer and
page fault handler concurrently and needs to ensure that the
backing store is accessed safely between the reclaimer and
the page fault handler. This is not the case because the
reclaimer accesses the backing store without the enclave mutex
while the page fault handler accesses the backing store with
the enclave mutex.
Consider the scenario where a page is faulted while a page sharing
a PCMD page with the faulted page is being reclaimed. The
consequence is a race between the reclaimer and page fault
handler, the reclaimer attempting to access a PCMD at the
same time it is truncated by the page fault handler. This
could result in lost PCMD data. Data may still be
lost if the reclaimer wins the race, this is addressed in
the following patch.
The reclaimer accesses pages from the backing storage without
holding the enclave mutex and runs the risk of concurrently
accessing the backing storage with the page fault handler that
does access the backing storage with the enclave mutex held.
In the scenario below a PCMD page is truncated from the backing
store after all its pages have been loaded in to the enclave
at the same time the PCMD page is loaded from the backing store
when one of its pages are reclaimed:
sgx_reclaim_pages() { sgx_vma_fault() {
...
mutex_lock(&encl->lock);
...
__sgx_encl_eldu() {
...
if (pcmd_page_empty) {
/*
* EPC page being reclaimed /*
* shares a PCMD page with an * PCMD page truncated
* enclave page that is being * while requested from
* faulted in. * reclaimer.
*/ */
sgx_encl_get_backing() <----------> sgx_encl_truncate_backing_page()
}
mutex_unlock(&encl->lock);
} }
In this scenario there is a race between the reclaimer and the page fault
handler when the reclaimer attempts to get access to the same PCMD page
that is being truncated. This could result in the reclaimer writing to
the PCMD page that is then truncated, causing the PCMD data to be lost,
or in a new PCMD page being allocated. The lost PCMD data may still occur
after protecting the backing store access with the mutex - this is fixed
in the next patch. By ensuring the backing store is accessed with the mutex
held the enclave page state can be made accurate with the
SGX_ENCL_PAGE_BEING_RECLAIMED flag accurately reflecting that a page
is in the process of being reclaimed.
Consistently protect the reclaimer's backing store access with the
enclave's mutex to ensure that it can safely run concurrently with the
page fault handler.
Cc: [email protected]
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Haitao Huang <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Tested-by: Jarkko Sakkinen <[email protected]>
Tested-by: Haitao Huang <[email protected]>
Link: https://lkml.kernel.org/r/fa2e04c561a8555bfe1f4e7adc37d60efc77387b.1652389823.git.reinette.chatre@intel.com
|
|
Recent commit 08999b2489b4 ("x86/sgx: Free backing memory
after faulting the enclave page") expanded __sgx_encl_eldu()
to clear an enclave page's PCMD (Paging Crypto MetaData)
from the PCMD page in the backing store after the enclave
page is restored to the enclave.
Since the PCMD page in the backing store is modified the page
should be marked as dirty to ensure the modified data is retained.
Cc: [email protected]
Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Tested-by: Haitao Huang <[email protected]>
Link: https://lkml.kernel.org/r/00cd2ac480db01058d112e347b32599c1a806bc4.1652389823.git.reinette.chatre@intel.com
|
|
SGX uses shmem backing storage to store encrypted enclave pages
and their crypto metadata when enclave pages are moved out of
enclave memory. Two shmem backing storage pages are associated with
each enclave page - one backing page to contain the encrypted
enclave page data and one backing page (shared by a few
enclave pages) to contain the crypto metadata used by the
processor to verify the enclave page when it is loaded back into
the enclave.
sgx_encl_put_backing() is used to release references to the
backing storage and, optionally, mark both backing store pages
as dirty.
Managing references and dirty status together in this way results
in both backing store pages marked as dirty, even if only one of
the backing store pages are changed.
Additionally, waiting until the page reference is dropped to set
the page dirty risks a race with the page fault handler that
may load outdated data into the enclave when a page is faulted
right after it is reclaimed.
Consider what happens if the reclaimer writes a page to the backing
store and the page is immediately faulted back, before the reclaimer
is able to set the dirty bit of the page:
sgx_reclaim_pages() { sgx_vma_fault() {
...
sgx_encl_get_backing();
... ...
sgx_reclaimer_write() {
mutex_lock(&encl->lock);
/* Write data to backing store */
mutex_unlock(&encl->lock);
}
mutex_lock(&encl->lock);
__sgx_encl_eldu() {
...
/*
* Enclave backing store
* page not released
* nor marked dirty -
* contents may not be
* up to date.
*/
sgx_encl_get_backing();
...
/*
* Enclave data restored
* from backing store
* and PCMD pages that
* are not up to date.
* ENCLS[ELDU] faults
* because of MAC or PCMD
* checking failure.
*/
sgx_encl_put_backing();
}
...
/* set page dirty */
sgx_encl_put_backing();
...
mutex_unlock(&encl->lock);
} }
Remove the option to sgx_encl_put_backing() to set the backing
pages as dirty and set the needed pages as dirty right after
receiving important data while enclave mutex is held. This ensures that
the page fault handler can get up to date data from a page and prepares
the code for a following change where only one of the backing pages
need to be marked as dirty.
Cc: [email protected]
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Suggested-by: Dave Hansen <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Tested-by: Haitao Huang <[email protected]>
Link: https://lore.kernel.org/linux-sgx/[email protected]/
Link: https://lkml.kernel.org/r/fa9f98986923f43e72ef4c6702a50b2a0b3c42e3.1652389823.git.reinette.chatre@intel.com
|
|
Fix the following sparse warnings:
CHECK security/integrity/platform_certs/keyring_handler.c
security/integrity/platform_certs/keyring_handler.c:76:16: warning: Using plain integer as NULL pointer
security/integrity/platform_certs/keyring_handler.c:91:16: warning: Using plain integer as NULL pointer
security/integrity/platform_certs/keyring_handler.c:106:16: warning: Using plain integer as NULL pointer
Signed-off-by: Stefan Berger <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
|
|
This reverts commit 1571d67dc190e50c6c56e8f88cdc39f7cc53166e.
This commit broke support for setting interrupt affinity. It looks like
that it is related to the chained IRQ handler. Revert this commit until
issue with setting interrupt affinity is fixed.
Fixes: 1571d67dc190 ("PCI: aardvark: Rewrite IRQ code to chained IRQ handler")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Pali Rohár <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
delta_ns is a s64, but it was being passed ptp_ocp_adjtime_coarse
as an u64. Also, it turns out that timespec64_add_ns() only handles
positive values, so perform the math with set_normalized_timespec().
Fixes: 90f8f4c0e3ce ("ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments")
Suggested-by: Vadim Fedorenko <[email protected]>
Signed-off-by: Jonathan Lemon <[email protected]>
Acked-by: Vadim Fedorenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|