Age | Commit message (Collapse) | Author | Files | Lines |
|
sa1100 provides its own variant of the clk API rather than using the
generic COMMON_CLK API. This generally works, but it causes some link
errors with drivers using the clk_set_rate, clk_get_parent, clk_set_parent
or clk_round_rate functions when a platform lacks those interfaces.
This adds trivial stub implementations for each of them, based on
the behavior of the COMMON_CLK implementation:
- set_rate() and set_parent() report success without doing anything
- round_rate() returns the clk rate
- get_parent() returns NULL.
This adds the minimal bloat and should do the right thing for
the simple clock hardware in this SoC.
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
davinci still has its own clk implementation, but lacks
a clk_get_parent() helper, which can lead to link errors
in randconfig builds.
This adds the usual implementation.
Acked-by: Sekhar Nori <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
In commit 3169663ac5902 "ARM: sa11x0/pxa: convert OS timer registers
to IOMEM", the definition of the OSCR macro was changed to be an
__iomem pointer, but the same register is also used by the XIP
code. This patch does the corresponding change here as well.
On PXA, the IRQ register definitions were removed even earlier, in
commit 5d284e353eb1 ("ARM: pxa: avoid accessing interrupt registers
directly"). This patch unfortunately brings some of that back. An
earlier version of my patch moved the code into an external function,
which could not work for CONFIG_XIP_KERNEL+CONFIG_MTD_XIP, so this
restores something close to the original code.
Link: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241716.html
Acked-by: Robert Jarzmik <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
A change to the platform data definitions caused a warning in the board code:
arch/arm/mach-davinci/board-da850-evm.c:1221:13: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
arch/arm/mach-davinci/board-da850-evm.c:1231:13: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
This is a bit unfortunate, since we generally like structure definitions to
be const, but as this is legacy code, the easiest way out is still to
remove the 'const' annotation here.
Fixes: 4a5f8ae50b66 ("[media] davinci: vpif_capture: get subdevs from DT when available")
Fixes: 231ce279e6e3 ("ARM: davinci: fix const warnings")
Acked-by: Sekhar Nori <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
The ZTE SoC drivers are only useful when building for a ZTE ZX platform.
Fixes: 4c2c2e39713b8cfb ("soc: zte: pm_domains: Prepare for supporting ARMv8 zx2967 family")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Baoyou Xie <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Due to a bugfix in wireless tree and the commit mentioned below a merge
was needed which went haywire. So the submitted change resulted in the
function brcmf_sdiod_sgtable_alloc() being called twice during the probe
thus leaking the memory of the first call.
Cc: [email protected] # 4.6.x
Fixes: 4d7928959832 ("brcmfmac: switch to new platform data")
Reported-by: Stefan Wahren <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Reviewed-by: Hante Meuleman <[email protected]>
Signed-off-by: Arend van Spriel <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
|
|
The commit to rework the headroom check in start_xmit() now calls
pxskb_expand_head() unconditionally if the header is CoW. Unfortunately,
it does so with the delta between the extant headroom and the header
length, which may be negative if there is already sufficient headroom.
pskb_expand_head() does allow for size being 0, in which case it just
copies, so clamp the header delta to zero.
Opening Chrome (and all my tabs) on a PCIE device was enough to reliably
hit this.
Fixes: 270a6c1f65fe ("brcmfmac: rework headroom check in .start_xmit()")
Signed-off-by: Daniel Stone <[email protected]>
Cc: Arend Van Spriel <[email protected]>
Cc: James Hughes <[email protected]>
Cc: Hante Meuleman <[email protected]>
Cc: Pieter-Paul Giesberts <[email protected]>
Cc: Franky Lin <[email protected]>
Tested-by: Hans de Goede <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
|
|
Drop the unused endpoints. They should only be used when there is an
actual remote-endpoint connected.
Cc: Sekhar Nori <[email protected]>
Signed-off-by: Kevin Hilman <[email protected]>
Signed-off-by: Sekhar Nori <[email protected]>
|
|
Drop the unused endpoints. They should only be used when there is
an actual remote-endpoint connected.
Signed-off-by: Kevin Hilman <[email protected]>
Signed-off-by: Sekhar Nori <[email protected]>
|
|
Pull "mvebu fixes for 4.13 (part 1)" from Gregory CLEMENT:
- Fix wrong irq type for gpio expeander on Armada 388 GP
- Use __pa_symbol instead of virt_to_phys in the mv98dx3236 platform
SMP code
* tag 'mvebu-fixes-4.13-1' of git://git.infradead.org/linux-mvebu:
ARM: dts: armada-38x: Fix irq type for pca955
ARM: mvebu: use __pa_symbol in the mv98dx3236 platform SMP code
|
|
Add necessary parent clocks for audss (Audio SubSystem, MAUDIO) clock
controller block.
This allows driver to keep EPLL enabled before accessing any MAUDIO
registers thus fixing silent hang. This silent hang appeared with
commit 6edfa11cb396 ("clk: samsung: Add enable/disable operation for
PLL36XX clocks"), e.g. on Odroid U3 usually with last (but unrelated)
messages:
[ 2.382741] input: gpio_keys as /devices/platform/gpio_keys/input/input0
[ 2.405686] usb 1-3: new high-speed USB device number 3 using exynos-ehci
[ 2.419843] max77686-rtc max77686-rtc: setting system clock to 2017-06-21 17:04:13 UTC (1498064653)
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Second Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:
Correct order of sound clock frequencies for ULCB boards
used by r8a7795 and r8a7796 SoCs.
These sounds clock frequencies are used as the ADG clock (output clocks
for audio module) initial setting and sound codec's initial system clock
which needs the maximum clock frequency. Thus, descending order is
required.
* tag 'renesas-fixes2-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: ulcb: sound clock-frequency needs descending order
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Pull "Few fixes for omaps for issues found recently" from Tony Lindgren:
- Fix disable_irq related shared IRQ warnings for omap3 PRM
- Fix omap4 legacy code regression that accidentally removed code that
we still need for PRM interrupts
- Fix dm8168-evm NAND pins and MMC write protect pin direction
- Fix dra71-evm mdio impedance values
* tag 'omap-for-v4.13/fixes-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: dra71-evm: mdio: Fix impedance values
ARM: dts: dm816x: Correct the state of the write protect pin
ARM: dts: dm816x: Correct NAND support nodes
ARM: OMAP4: Fix legacy code clean-up regression
ARM: OMAP2+: Fix omap3 prm shared irq
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:
Correct order of sound clock frequencies for Salvator boards
used by r8a7795 and r8a7796 SoCs.
These sounds clock frequencies are used as the ADG clock (output clocks
for audio module) initial setting and sound codec's initial system clock
which needs the maximum clock frequency. Thus, descending order is
required.
* tag 'renesas-fixes-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: renesas: salvator-common: sound clock-frequency needs descending order
|
|
Turns out that just writing CURPOS isn't sufficient to move the cursor
on some platforms. My 830 works just fine, but eg. 945 and PNV don't.
On those platforms we need to arm even the CURPOS update with a
CURBASE write.
Even worse, a write to any of the cursor register apart from CURBASE
will cancel an already pending cursor update. So if we have armed a
CURCNTR/CURBASE update, a subsequent CURPOS write prior to vblank
would cancel that armed update. Thus we're left with a cursor that
doesn't appear to move, or even change shape.
Fix the problem by always performing the CURBASE write after a
CURPOS write. Bspec is somewhat unclear which platforms actually
require this CURBASE write and which don't. So to keep it simple
and to make sure we really fix the problem across all supported
devices, let's just perform the CURBASE write unconditionally.
Cc: Paul Menzel <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101790
Fixes: 75343a44c901 ("drm/i915: Drop useless posting reads from cursor commit")
Signed-off-by: Ville Syrjälä <[email protected]>
Tested-by: Paul Menzel <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8753d2bc5e49daad301ce65f5dada57ed924fad6)
Signed-off-by: Daniel Vetter <[email protected]>
|
|
Fix the sizeof(ptr) vs. sizeof(*ptr) typo.
Fixes: 2889caa92321 ("drm/i915: Eliminate lots of iterations over the execobjects array")
Cc: Chris Wilson <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit edd9003f7f9dddd28fdd768e6e7569d996c769cb)
Signed-off-by: Daniel Vetter <[email protected]>
|
|
Commit b1f5bfc27a19 ("sctp: don't dereference ptr before leaving
_sctp_walk_{params, errors}()") tried to fix the issue that it
may overstep the chunk end for _sctp_walk_{params, errors} with
'chunk_end > offset(length) + sizeof(length)'.
But it introduced a side effect: When processing INIT, it verifies
the chunks with 'param.v == chunk_end' after iterating all params
by sctp_walk_params(). With the check 'chunk_end > offset(length)
+ sizeof(length)', it would return when the last param is not yet
accessed. Because the last param usually is fwdtsn supported param
whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'
This is a badly issue even causing sctp couldn't process 4-shakes.
Client would always get abort when connecting to server, due to
the failure of INIT chunk verification on server.
The patch is to use 'chunk_end <= offset(length) + sizeof(length)'
instead of 'chunk_end < offset(length) + sizeof(length)' for both
_sctp_walk_params and _sctp_walk_errors.
Fixes: b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()")
Signed-off-by: Xin Long <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In dccp_feat_init, when ccid_get_builtin_ccids failsto alloc
memory for rx.val, it should free tx.val before returning an
error.
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk
properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue
exists on dccp_ipv4.
This patch is to fix it for dccp_ipv4.
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In dccp_v6_conn_request, after reqsk gets alloced and hashed into
ehash table, reqsk's refcnt is set 3. one is for req->rsk_timer,
one is for hlist, and the other one is for current using.
The problem is when dccp_v6_conn_request returns and finishes using
reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and
reqsk obj never gets freed.
Jianlin found this issue when running dccp_memleak.c in a loop, the
system memory would run out.
dccp_memleak.c:
int s1 = socket(PF_INET6, 6, IPPROTO_IP);
bind(s1, &sa1, 0x20);
listen(s1, 0x9);
int s2 = socket(PF_INET6, 6, IPPROTO_IP);
connect(s2, &sa1, 0x20);
close(s1);
close(s2);
This patch is to put the reqsk before dccp_v6_conn_request returns,
just as what tcp_conn_request does.
Reported-by: Jianlin Shi <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fixes: dad6f37c2602e ("powerpc: subpage_protect: Increase the array size to take care of 64TB")
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Tested-by: Ram Pai <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The rework of the exynos DRM clock handling introduced
warnings for configurations that have CONFIG_PM disabled:
drivers/gpu/drm/exynos/exynos_hdmi.c:736:13: error: 'hdmi_clk_disable_gates' defined but not used [-Werror=unused-function]
static void hdmi_clk_disable_gates(struct hdmi_context *hdata)
^~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/exynos/exynos_hdmi.c:717:12: error: 'hdmi_clk_enable_gates' defined but not used [-Werror=unused-function]
static int hdmi_clk_enable_gates(struct hdmi_context *hdata)
The problem is that the PM functions themselves are inside of
an #ifdef, but some functions they call are not.
This patch removes the #ifdef and instead marks the PM functions
as __maybe_unused, which is a more reliable way to get it right.
Link: https://patchwork.kernel.org/patch/8436281/
Fixes: 9be7e9898444 ("drm/exynos/hdmi: clock code re-factoring")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
If the s5p-cec driver is a module and the drm exynos driver is built-in, then
the CEC core will be a module also, causing the CEC notifier to fail (will be
compiled as empty functions).
To prevent this select CEC_CORE if CEC_NOTIFIER is set to ensure the CEC core
is also built into the kernel.
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
The "Fixes" patch was incorrectly merged, as a result PHY is prematurely
powered off and for example Odroid-U3 cannot disable TV power domain
when HDMI cable is unplugged.
Signed-off-by: Andrzej Hajda <[email protected]>
Reported-by: Marek Szyprowski <[email protected]>
Fixes: 625e63e2 ("drm/exynos/hdmi: fix pipeline disable order")
Tested-by: Marek Szyprowski <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
This patch moves drm_bridge_add call into probe.
It doesn't need to call drm_bridge_add call every time
bind callback is called.
Changelog v2
- moved drm_bridge_remove call into remove callback.
- corrected description.
Suggested-by: Andrzej Hajda <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Hoegeun Kwon <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
Remove the error handling of bridge_node because the bridge_node is
optional.
For example, In case of Exynos SoC, a bridge device such as mDNIe and
MIC could be placed between Display Controller and MIPI DSI device but
the bridge device is optional.
Signed-off-by: Hoegeun Kwon <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
It doesn't need to try to find a bridge if bridge node doesn't exist.
Reviewed-by: Shuah Khan <[email protected]>
Tested-by: Shuah Khan <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
of_device_ids are not supposed to change at runtime. All functions
working with of_device_ids provided by <linux/of.h> work with const
of_device_ids. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
12294 1192 0 13486 34ae drivers/gpu/drm/exynos/exynos_hdmi.o
File size after constify hdmi_match_types.
text data bss dec hex filename
13318 176 0 13494 34b6 drivers/gpu/drm/exynos/exynos_hdmi.o
Signed-off-by: Arvind Yadav <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
File size before:
text data bss dec hex filename
9983 1424 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o
File size after constify:
text data bss dec hex filename
11231 176 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o
Signed-off-by: Arvind Yadav <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
num_ioctls is already assigned when declaring the exynos_drm_driver
structure. No need to duplicate it here.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
The buffer passed to bpf_obj_get_info_by_fd() should be initialized
to zeros. Kernel will enforce that to guarantee we can safely extend
info structures in the future.
Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing
is problematic, however, since some members of the info structures
may need to be initialized by the callers (for instance pointers
to buffers to which kernel is to dump translated and jited images).
Remove the zeroing and fix up the in-tree callers before any kernel
has been released with this code.
As Daniel points out this seems to be the intended operation anyway,
since commit 95b9afd3987f ("bpf: Test for bpf ID") is itself setting
the buffer pointers before calling bpf_obj_get_info_by_fd().
Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Apparently netpoll_setup() assumes that netpoll.dev_name is a pointer
when checking if the device name is set:
if (np->dev_name) {
...
However the field is a character array, therefore the condition always
yields true. Check instead whether the first byte of the array has a
non-zero value.
Signed-off-by: Matthias Kaehlcke <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
into drm-fixes
Three misc amd fixes.
* 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux:
drm/amd/powerplay: fix AVFS voltage offset for Vega10
drm/amdgpu/gfx9: simplify and fix GRBM index selection
drm/amdgpu: Fix blocking in RCU critical section(v2)
|
|
Commit bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's
access cache") changed how the access results are stored after an
access() call. An NFS v4 OPEN might have access bits returned with the
opendata, so we should use the NFS4_ACCESS values when determining the
return value in nfs4_opendata_access().
Fixes: bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's
access cache")
Reported-by: Eryu Guan <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
Tested-by: Takashi Iwai <[email protected]>
|
|
Commit de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring")
moves link status commitment into bond_mii_monitor(), but it still relies
on the return value of bond_miimon_inspect() as the hint. We need to return
non-zero as long as we propose a link status change.
Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring")
Reported-by: Benjamin Gilbert <[email protected]>
Tested-by: Benjamin Gilbert <[email protected]>
Cc: Mahesh Bandewar <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Mahesh Bandewar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Device lock bites again; if a device .remove() callback races a user
calling ioctl(VFIO_GROUP_GET_DEVICE_FD), the unbind request will hold
the device lock, but the user ioctl may have already taken a vfio_device
reference. In the case of a PCI device, the initial open will attempt
to reset the device, which again attempts to get the device lock,
resulting in deadlock. Use the trylock PCI reset interface and return
error on the open path if reset fails due to lock contention.
Link: https://lkml.org/lkml/2017/7/25/381
Reported-by: Wen Congyang <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
|
|
Currently dm_dax_flush() is not being called, even if underlying dax
device supports write cache, because DAXDEV_WRITE_CACHE is not being
propagated up to the DM dax device.
If the underlying dax device supports write cache, set
DAXDEV_WRITE_CACHE on the DM dax device. This will cause dm_dax_flush()
to be called.
Fixes: abebfbe2f7 ("dm: add ->flush() dax operation support")
Signed-off-by: Vivek Goyal <[email protected]>
Acked-by: Dan Williams <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
mempool_alloc() cannot fail for GFP_NOIO allocation, so there is no
point testing for failure.
One place the code tested for failure was passing "0" as the GFP
flags. This is most unusual and is probably meant to be GFP_NOIO,
so that is changed.
Also, allocation from ->extra_pool and ->prealloc_pool are repeated
before releasing the previous allocation. This can deadlock if the code
is servicing a write under high memory pressure. To avoid deadlocks,
change these to use GFP_NOWAIT and leave the error handling in place.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
Use GFP_NOIO for memory allocations in the I/O path. Other memory
allocations in the initialization path can use GFP_KERNEL.
Reported-by: Mikulas Patocka <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
CONFIG_VFIO_SPAPR_EEH
When CONFIG_EEH=y and CONFIG_VFIO_SPAPR_EEH=n, build fails with the
following:
drivers/vfio/pci/vfio_pci.o: In function `.vfio_pci_release':
vfio_pci.c:(.text+0xa98): undefined reference to `.vfio_spapr_pci_eeh_release'
drivers/vfio/pci/vfio_pci.o: In function `.vfio_pci_open':
vfio_pci.c:(.text+0x1420): undefined reference to `.vfio_spapr_pci_eeh_open'
In this case, vfio_pci.c should use the empty definitions of
vfio_spapr_pci_eeh_open and vfio_spapr_pci_eeh_release functions.
This patch fixes it by guarding these function definitions with
CONFIG_VFIO_SPAPR_EEH, the symbol that controls whether vfio_spapr_eeh.c is
built, which is where the non-empty versions of these functions are. We need to
make use of IS_ENABLED() macro because CONFIG_VFIO_SPAPR_EEH is a tristate
option.
This issue was found during a randconfig build. Logs are here:
http://kisskb.ellerman.id.au/kisskb/buildresult/12982362/
Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
|
|
Pull virtio fixes and cleanups from Michael Tsirkin:
"Fixes some minor issues all over the codebase"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-net: fix module unloading
virtio-balloon: coding format cleanup
virtio-balloon: deflate via a page list
virtio_blk: Use sysfs_match_string() helper
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master
Two commits which fix host crashes.
Signed-off-by: Paolo BOnzini <[email protected]>
|
|
Preempt can occur in the preemption timer expiration handler:
CPU0 CPU1
preemption timer vmexit
handle_preemption_timer(vCPU0)
kvm_lapic_expired_hv_timer
hv_timer_is_use == true
sched_out
sched_in
kvm_arch_vcpu_load
kvm_lapic_restart_hv_timer
restart_apic_timer
start_hv_timer
already-expired timer or sw timer triggerd in the window
start_sw_timer
cancel_hv_timer
/* back in kvm_lapic_expired_hv_timer */
cancel_hv_timer
WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops
This can be reproduced if CONFIG_PREEMPT is enabled.
------------[ cut here ]------------
WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16
RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
Call Trace:
handle_preemption_timer+0xe/0x20 [kvm_intel]
vmx_handle_exit+0xb8/0xd70 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
? kvm_arch_vcpu_load+0x47/0x230 [kvm]
? kvm_arch_vcpu_load+0x62/0x230 [kvm]
kvm_vcpu_ioctl+0x340/0x700 [kvm]
? kvm_vcpu_ioctl+0x340/0x700 [kvm]
? __fget+0xfc/0x210
do_vfs_ioctl+0xa4/0x6a0
? __fget+0x11d/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x81/0x220
entry_SYSCALL64_slow_path+0x25/0x25
------------[ cut here ]------------
WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16
RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
Call Trace:
kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm]
handle_preemption_timer+0xe/0x20 [kvm_intel]
vmx_handle_exit+0xb8/0xd70 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
? kvm_arch_vcpu_load+0x47/0x230 [kvm]
? kvm_arch_vcpu_load+0x62/0x230 [kvm]
kvm_vcpu_ioctl+0x340/0x700 [kvm]
? kvm_vcpu_ioctl+0x340/0x700 [kvm]
? __fget+0xfc/0x210
do_vfs_ioctl+0xa4/0x6a0
? __fget+0x11d/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x81/0x220
entry_SYSCALL64_slow_path+0x25/0x25
This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer
and start_sw_timer be in preemption-disabled regions, which trivially
avoid any reentrancy issue with preempt notifier.
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
[Add more WARNs. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Signed-off-by: Lin Ma <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Using variables instead of hard paths makes the requirements information
more accurate.
Signed-off-by: Lin Ma <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: fixup missing srcu lock
We need to hold the srcu lock when accessing memory slots
during migration
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1:
Before NMI IRET test
Sending NMI to self
NMI isr running stack 0x461000
Sending nested NMI to self
After nested NMI to self
Nested NMI isr running rip=40038e
After iret
After NMI to self
FAIL: NMI
Commit 4c4a6f790ee862 (KVM: nVMX: track NMI blocking state separately
for each VMCS) tracks NMI blocking state separately for vmcs01 and
vmcs02. However it is not enough:
- The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault
on IRET, so the L2 can generate #PF which can be intercepted by L0.
- L0 walks L1's guest page table and sees the mapping is invalid, it
resumes the L1 guest and injects the #PF into L1. At this point the
vmcs02 has nmi_known_unmasked=true.
- L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field
of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
- L1 executes VMRESUME to resume L2, causing a vmexit to L0
- during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
interruptibility-state field of vmcs02, but nmi_known_unmasked is
still true.
- L2 immediately exits to L0 with another page fault, because L0 still has
not updated the NGVA->HPA page tables. However, nmi_known_unmasked is
true so vmx_recover_nmi_blocking does not do anything.
The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12.
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The PI vector for L0 and L1 must be different. If dest vcpu0
is in guest mode while vcpu1 is delivering a non-nested PI to
vcpu0, there wont't be any vmexit so that the non-nested interrupt
will be delayed.
Signed-off-by: Wincy Van <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
We are using the same vector for nested/non-nested posted
interrupts delivery, this may cause interrupts latency in
L1 since we can't kick the L2 vcpu out of vmx-nonroot mode.
This patch introduces a new vector which is only for nested
posted interrupts to solve the problems above.
Signed-off-by: Wincy Van <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
This reverts the change of commit f85c758dbee54cc3612a6e873ef7cecdb66ebee5,
as the behavior it modified was intended.
The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual
says:
Table 4-7. Use of CR3 with PAE Paging
Bit Position(s) Contents
4:0 Ignored
31:5 Physical address of the 32-Byte aligned
page-directory-pointer table used for linear-address
translation
63:32 Ignored (these bits exist only on processors supporting
the Intel-64 architecture)
To placate the static checker, write the mask explicitly as an
unsigned long constant instead of using a 32-bit unsigned constant.
Cc: Dan Carpenter <[email protected]>
Fixes: f85c758dbee54cc3612a6e873ef7cecdb66ebee5
Signed-off-by: Paolo Bonzini <[email protected]>
|