Age | Commit message (Collapse) | Author | Files | Lines |
|
If filelayout_decode_layout fail, _filelayout_free_lseg will causes
a double freeing of fh_array.
[ 1179.279800] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1179.280198] IP: [<ffffffffa027222d>] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.281010] PGD 0
[ 1179.281443] Oops: 0000 [#1]
[ 1179.281831] Modules linked in: nfs_layout_nfsv41_files(OE) nfsv4(OE) nfs(OE) fscache(E) xfs libcrc32c coretemp nfsd crct10dif_pclmul ppdev crc32_pclmul crc32c_intel auth_rpcgss ghash_clmulni_intel nfs_acl lockd vmw_balloon grace sunrpc parport_pc vmw_vmci parport shpchp i2c_piix4 vmwgfx drm_kms_helper ttm drm serio_raw mptspi scsi_transport_spi mptscsih e1000 mptbase ata_generic pata_acpi [last unloaded: fscache]
[ 1179.283891] CPU: 0 PID: 13336 Comm: cat Tainted: G OE 4.3.0-rc1-pnfs+ #244
[ 1179.284323] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[ 1179.285206] task: ffff8800501d48c0 ti: ffff88003e3c4000 task.ti: ffff88003e3c4000
[ 1179.285668] RIP: 0010:[<ffffffffa027222d>] [<ffffffffa027222d>] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.286612] RSP: 0018:ffff88003e3c77f8 EFLAGS: 00010202
[ 1179.287092] RAX: 0000000000000000 RBX: ffff88001fe78900 RCX: 0000000000000000
[ 1179.287731] RDX: ffffea0000f40760 RSI: ffff88001fe789c8 RDI: ffff88001fe789c0
[ 1179.288383] RBP: ffff88003e3c7810 R08: ffffea0000f40760 R09: 0000000000000000
[ 1179.289170] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88001fe789c8
[ 1179.289959] R13: ffff88001fe789c0 R14: ffff88004ec05a80 R15: ffff88004f935b88
[ 1179.290791] FS: 00007f4e66bb5700(0000) GS:ffffffff81c29000(0000) knlGS:0000000000000000
[ 1179.291580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1179.292209] CR2: 0000000000000000 CR3: 00000000203f8000 CR4: 00000000001406f0
[ 1179.292731] Stack:
[ 1179.293195] ffff88001fe78900 00000000000000d0 ffff88001fe78178 ffff88003e3c7868
[ 1179.293676] ffffffffa0272737 0000000000000001 0000000000000001 ffff88001fe78800
[ 1179.294151] 00000000614fffce ffffffff81727671 ffff88001fe78100 ffff88001fe78100
[ 1179.294623] Call Trace:
[ 1179.295092] [<ffffffffa0272737>] filelayout_alloc_lseg+0xa7/0x2d0 [nfs_layout_nfsv41_files]
[ 1179.295625] [<ffffffff81727671>] ? out_of_line_wait_on_bit+0x81/0xb0
[ 1179.296133] [<ffffffffa040407e>] pnfs_layout_process+0xae/0x320 [nfsv4]
[ 1179.296632] [<ffffffffa03e0a01>] nfs4_proc_layoutget+0x2b1/0x360 [nfsv4]
[ 1179.297134] [<ffffffffa0402983>] pnfs_update_layout+0x853/0xb30 [nfsv4]
[ 1179.297632] [<ffffffffa039db24>] ? nfs_get_lock_context+0x74/0x170 [nfs]
[ 1179.298158] [<ffffffffa0271807>] filelayout_pg_init_read+0x37/0x50 [nfs_layout_nfsv41_files]
[ 1179.298834] [<ffffffffa03a72d9>] __nfs_pageio_add_request+0x119/0x460 [nfs]
[ 1179.299385] [<ffffffffa03a6bd7>] ? nfs_create_request.part.9+0x37/0x2e0 [nfs]
[ 1179.299872] [<ffffffffa03a7cc3>] nfs_pageio_add_request+0xa3/0x1b0 [nfs]
[ 1179.300362] [<ffffffffa03a8635>] readpage_async_filler+0x85/0x260 [nfs]
[ 1179.300907] [<ffffffff81180cb1>] read_cache_pages+0x91/0xd0
[ 1179.301391] [<ffffffffa03a85b0>] ? nfs_read_completion+0x220/0x220 [nfs]
[ 1179.301867] [<ffffffffa03a8dc8>] nfs_readpages+0x128/0x200 [nfs]
[ 1179.302330] [<ffffffff81180ef3>] __do_page_cache_readahead+0x203/0x280
[ 1179.302784] [<ffffffff81180dc8>] ? __do_page_cache_readahead+0xd8/0x280
[ 1179.303413] [<ffffffff81181116>] ondemand_readahead+0x1a6/0x2f0
[ 1179.303855] [<ffffffff81181371>] page_cache_sync_readahead+0x31/0x50
[ 1179.304286] [<ffffffff811750a6>] generic_file_read_iter+0x4a6/0x5c0
[ 1179.304711] [<ffffffffa03a0316>] ? __nfs_revalidate_mapping+0x1f6/0x240 [nfs]
[ 1179.305132] [<ffffffffa039ccf2>] nfs_file_read+0x52/0xa0 [nfs]
[ 1179.305540] [<ffffffff811e343c>] __vfs_read+0xcc/0x100
[ 1179.305936] [<ffffffff811e3d15>] vfs_read+0x85/0x130
[ 1179.306326] [<ffffffff811e4a98>] SyS_read+0x58/0xd0
[ 1179.306708] [<ffffffff8172caaf>] entry_SYSCALL_64_fastpath+0x12/0x76
[ 1179.307094] Code: c4 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 8b 07 49 89 f4 85 c0 74 47 48 8b 06 49 89 fd <48> 8b 38 48 85 ff 74 22 31 db eb 0c 48 63 d3 48 8b 3c d0 48 85
[ 1179.308357] RIP [<ffffffffa027222d>] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.309177] RSP <ffff88003e3c77f8>
[ 1179.309582] CR2: 0000000000000000
Signed-off-by: Kinglong Mee <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Commit 718ba5b87343, moved the responsibility for unlocking the socket to
xs_tcp_setup_socket, meaning that the socket will be unlocked before we
know that it has finished trying to connect. The following patch is based on
an initial patch by Russell King to ensure that we delay clearing the
XPRT_CONNECTING flag until we either know that we failed to initiate
a connection attempt, or the connection attempt itself failed.
Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing")
Reported-by: Russell King <[email protected]>
Reported-by: Russell King <[email protected]>
Tested-by: Russell King <[email protected]>
Tested-by: Benjamin Coddington <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
This patch adds NLM_F_REPLACE flag to ipv6 route replace notifications.
This makes nlm_flags in ipv6 replace notifications consistent
with ipv4.
Signed-off-by: Roopa Prabhu <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Reviewed-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Remove unneeded NULL test.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We're incorrectly assigning a loff_t return to an int. If SEEK_HOLE or
SEEK_DATA returns an offset over 2^31 then the application will see a
weird lseek() result (usually -EIO).
Cc: [email protected]
Fixes: bdcc2cd14e4e "NFSv4.2: handle NFS-specific llseek errors"
Signed-off-by: J. Bruce Fields <[email protected]>
Reviewed-by: Anna Schumaker <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
When we're destroying the socket transport, we need to ensure that
we cancel any existing delayed connection attempts, and order them
w.r.t. the call to xs_close().
Reported-by:"Suzuki K. Poulose" <[email protected]>
Acked-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We really want sizeof(struct page *) instead. Otherwise we limit
maximum IO size to 64 pages rather than 512 pages on a 64bit system.
Fixes 2e11f829(nfs: cap request size to fit a kmalloced page array).
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
Fixes: 2e11f8296d22 ("nfs: cap request size to fit a kmalloced page array")
Signed-off-by: Trond Myklebust <[email protected]>
|
|
mount
A test case is as the description says:
open(foobar, O_WRONLY);
sleep() --> reboot the server
close(foobar)
The bug is because in nfs4state.c in nfs4_reclaim_open_state() a few
line before going to restart, there is
clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags).
NFS4CLNT_RECLAIM_NOGRACE is a flag for the client states not open
owner states. Value of NFS4CLNT_RECLAIM_NOGRACE is 4 which is the
value of NFS_O_WRONLY_STATE in nfs4_state->flags. So clearing it wipes
out state and when we go to close it, “call_close” doesn’t get set as
state flag is not set and CLOSE doesn’t go on the wire.
Signed-off-by: Olga Kornievskaia <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
It is perfectly legitimate for a PCI device to have an
PCI_INTERRUPT_PIN value of zero. This happens if the device doesn't
use interrupts, or on PCIe devices, where only MSI/MSI-X are
supported.
Silence the annoying "of_irq_parse_pci() failed with rc=-19" error
messages by moving the printing code into of_irq_parse_pci(), and only
emitting the message for cases where PCI_INTERRUPT_PIN == 0 is not the
cause for an early exit.
Signed-off-by: David Daney <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
The bma180 / bma250 accelerometers share a driver (at least under Linux),
so it makes sense to also have their bindings info in a single .txt.
This commit extends the bma180 bindings with bma250 bindings, specifically
it specifies how the 2 seperate interrupts the bma250 has must be listed
in devicetree. The existing bma180 driver is already fully compatible
with the specified bindings.
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
The cooling-{min,max}-level properties are marked as optional in
Documentation/devicetree/bindings/cpufreq/cpufreq-dt.txt and the usage
in various device tree matches this, i.e., some cooling device in the
device trees provide these properties while others do not.
Make the bindings in
Documentation/devicetree/bindings/thermal/thermal.txt consistent with
the cpufreq-dt bindings by marking the cooling-*-level properties as
optional.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Kumar Gala <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
The device trees in the kernel as well as the binding description in
Documentation/devicetree/bindings/cpufreq/cpufreq-dt.txt use the
cooling-{min,max}-level property.
Fix the inconsistency with the binding description in
Documentation/devicetree/bindings/thermal/thermal.txt by changing
cooling-*-state properties to cooling-*-level.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Kumar Gala <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
The GICv3 ITS uses sideband master identification data (known as a
DeviceID) to identify which master wrote to a doorbell, and this data is
used to determine how to react in response to the write.
Commit 1e6db000482fa65a ("irqchip/gicv3-its: Add platform MSI support")
added support per this binding, but failed to update the documentation.
This patch fixes the documentation.
Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Rob Herring <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"These are both fixes to the new and improved keepalive2 behavior"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: advertise support for keepalive2
libceph: don't access invalid memory in keepalive2 path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:
"twl4030-charger fixes"
* tag 'for-v4.3-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
twl4030_charger: fix another compile error
Revert "twl4030_charger: correctly handle -EPROBE_DEFER from devm_usb_get_phy_by_node"
|
|
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
Use a generic name for this kind of PLL
Correction in dts files are already done here:
commit 5eb26c605909 ("ARM: STi: DT: Rename st_pll3200c32_407_c0_x into st_pll3200c32_cx_x")
Signed-off-by: Gabriel Fernandez <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"This addresses some problems with filesystem writeback due to the
recently merged hardware DBM patches, which caused us to treat some
read-only pages as dirty.
There are also some other, less significant fixes that are described
in the summary below:
A mixture of fixes for regressions introduced during the merge window,
some longer standing problems that we spotted and a couple of hardware
errata. The main changes are:
- Fix fallout from the h/w DBM patches, causing filesystem writeback
issues on both v8 and v8.1 CPUs
- Workaround for Cortex-A53 erratum #843419 in the module loader
- Fix for long-standing issue with compat big-endian signal handlers
using the saved floating point state"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: errata: add module build workaround for erratum #843419
arm64: compat: fix vfp save/restore across signal handlers in big-endian
arm64: cpu hotplug: ensure we mask out CPU_TASKS_FROZEN in notifiers
arm64: head.S: initialise mdcr_el2 in el2_setup
arm64: enable generic idle loop
arm64: pgtable: use a single bit for PTE_WRITE regardless of DBM
arm64: Fix pte_modify() to preserve the hardware dirty information
arm64: Fix the pte_hw_dirty() check when AF/DBM is enabled
arm64: dma-mapping: check whether cma area is initialized or not
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- misc fixes all around the map
- block non-root vm86(old) if mmap_min_addr != 0
- two small debuggability improvements
- removal of obsolete paravirt op
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform: Fix Geode LX timekeeping in the generic x86 build
x86/apic: Serialize LVTT and TSC_DEADLINE writes
x86/ioapic: Force affinity setting in setup_ioapic_dest()
x86/paravirt: Remove the unused pv_time_ops::get_tsc_khz method
x86/ldt: Fix small LDT allocation for Xen
x86/vm86: Fix the misleading CONFIG_VM86 Kconfig help text
x86/cpu: Print family/model/stepping in hex
x86/vm86: Block non-root vm86(old) if mmap_min_addr != 0
x86/alternatives: Make optimize_nops() interrupt safe and synced
x86/mm/srat: Print non-volatile flag in SRAT
x86/cpufeatures: Enable cpuid for Intel SHA extensions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"A fix for an abs()/abs64() bug that caused too slow NTP convergence on
32-bit kernels, plus a removal of an obsolete clockevents driver
facility after all users got converted during the merge window"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Remove unused set_mode() callback
time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A migrate_tasks() locking fix, and a late-coming nohz change plus a
nohz debug check"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: 'Annotate' migrate_tasks()
nohz: Assert existing housekeepers when nohz full enabled
nohz: Affine unpinned timers to housekeepers
|
|
The ret pointer passed to regulator_dev_lookup is only filled with a
valid error code if regulator_dev_lookup returned NULL. Currently
regulator_resolve_supply checks this ret value before it checks if a
regulator was returned, this can result in valid regulator lookups being
ignored.
Fixes: 6261b06de565 ("regulator: Defer lookup of supply to regulator_get")
Signed-off-by: Charles Keepax <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo MOlnar:
"Mostly tooling fixes, but also two x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tests: Fix software clock events test setting maps
perf tests: Fix task exit test setting maps
perf evlist: Fix create_syswide_maps() not propagating maps
perf evlist: Fix add() not propagating maps
perf evlist: Factor out a function to propagate maps for a single evsel
perf evlist: Make create_maps() use set_maps()
perf evlist: Make set_maps() more resilient
perf evsel: Add own_cpus member
perf evlist: Fix missing thread_map__put in propagate_maps()
perf evlist: Fix splice_list_tail() not setting evlist
perf evlist: Add has_user_cpus member
perf evlist: Remove redundant validation from propagate_maps()
perf evlist: Simplify set_maps() logic
perf evlist: Simplify propagate_maps() logic
perf top: Fix segfault pressing -> with no hist entries
perf header: Fixup reading of HEADER_NRCPUS feature
perf/x86/intel: Fix constraint access
perf/x86/intel/bts: Set event->hw.itrace_started in pmu::start to match the new logic
perf tools: Fix use of wrong event when processing exit events
perf tools: Fix parse_events_add_pmu caller
|
|
We are the client, but advertise keepalive2 anyway - for consistency,
if nothing else. In the future the server might want to know whether
its clients support keepalive2.
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Yan, Zheng <[email protected]>
|
|
This
struct ceph_timespec ceph_ts;
...
con_out_kvec_add(con, sizeof(ceph_ts), &ceph_ts);
wraps ceph_ts into a kvec and adds it to con->out_kvec array, yet
ceph_ts becomes invalid on return from prepare_write_keepalive(). As
a result, we send out bogus keepalive2 stamps. Fix this by encoding
into a ceph_timespec member, similar to how acks are read and written.
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Yan, Zheng <[email protected]>
|
|
When bio bounce is involved, one new bio and its biovecs are
cloned from the comming bio, which can be one fast-cloned bio
from upper layer(such as dm).
So it is obviously wrong to assume the start index of the coming(
original) bio's io vector is zero, which can be any value between
0 and (bi_max_vecs - 1), especially in case of bio split.
This patch fixes Fedora's booting oops on i386, often with the
following kernel log together:
> [ 9.026738] systemd[1]: Switching root.
> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1
> (systemd).
> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178
> index:0x16a
> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
> [ 9.087284] page dumped because: page still charged to cgroup
> [ 9.088772] bad because of flags:
> [ 9.089731] flags: 0x21(locked|lru)
> [ 9.090818] page->mem_cgroup:f2c3e400
Reported-by: Josh Boyer <[email protected]>
Tested-by: Adam Williamson <[email protected]>
Cc: Ming Lin <[email protected]>
Cc: Mike Snitzer <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
biovecs has become immutable since v3.13, so it isn't necessary
to allocate biovecs for the new cloned bios, then we can save
one extra biovecs allocation/copy, and the allocation is often
not fixed-length and a bit more expensive.
For example, if the 'max_sectors_kb' of null blk's queue is set
as 16(32 sectors) via sysfs just for making more splits, this patch
can increase throught about ~70% in the sequential read test over
null_blk(direct io, bs: 1M).
Cc: Christoph Hellwig <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Ming Lin <[email protected]>
Cc: Dongsu Park <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
This fixes a performance regression introduced by commit 54efd50bfd,
and allows us to take full advantage of the fact that we have immutable
bio_vecs. Hand applied, as it rejected violently with commit
5014c311baa2.
Signed-off-by: Jens Axboe <[email protected]>
|
|
pmem_rw_page() needs to call wmb_pmem() on writes to make sure that the
newly written data is durable. This flow was added to pmem_rw_bytes()
and pmem_make_request() with this commit:
commit 61031952f4c8 ("arch, x86: pmem api for ensuring durability of
persistent memory updates")
...the pmem_rw_page() path was missed.
Cc: <[email protected]>
Signed-off-by: Ross Zwisler <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
|
|
Always take device_lock() before nvdimm_bus_lock() to prevent deadlock.
Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Spinlock performance regression fix, plus documentation fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/static_keys: Fix up the static keys documentation
locking/qspinlock/x86: Only emit the test-and-set fallback when building guest support
locking/qspinlock/x86: Fix performance regression under unaccelerated VMs
locking/static_keys: Fix a silly typo
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU fix from Ingo Molnar:
"Fix a false positive warning"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
security/device_cgroup: Fix RCU_LOCKDEP_WARN() condition
|
|
Always take device_lock() before nvdimm_bus_lock() to prevent deadlock.
Cc: <[email protected]>
Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
Second set of KVM/ARM changes for 4.3-rc2
- Workaround for a Cortex-A57 erratum
- Bug fix for the debugging infrastructure
- Fix for 32bit guests with more than 4GB of address space
on a 32bit host
- A number of fixes for the (unusual) case when we don't use
the in-kernel GIC emulation
- Removal of ThumbEE handling on arm64, since these have been
dropped from the architecture before anyone actually ever
built a CPU
- Remove the KVM_ARM_MAX_VCPUS limitation which has become
fairly pointless
|
|
Commit 6894258eda2f reversed the order of gfp_flags adjustment in
dma_alloc_attrs() for x86 [arch/x86/kernel/pci-dma.c] As a result,
relevant flags set by dma_alloc_coherent_gfp_flags() are just
discarded and cause coherent DMA memory allocation failure on some
devices.
Fixes: 6894258eda2f ("dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}")
Signed-off-by: Jun'ichi Nomura <[email protected]>
Tested-by: Tony Luck <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:
1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;
3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Cc: Dann Frazier <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Although the ThumbEE registers and traps were present in earlier
versions of the v8 architecture, it was retrospectively removed and so
we can do the same.
Whilst this breaks migrating a guest started on a previous version of
the kernel, it is much better to kill these (non existent) registers
as soon as possible.
Reviewed-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
[maz: added commend about migration]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Cc: [email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Cc: [email protected]
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Signed-off-by: Mathieu Desnoyers <[email protected]>
CC: Andrew Morton <[email protected]>
CC: [email protected]
CC: Heiko Carstens <[email protected]>
CC: [email protected]
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
This config option is completely irrelevant for zfcpdump and
unfortunately causes a kernel panic on recent kernels in
"mspro_block_init()/driver_register()".
Signed-off-by: Michael Holzheu <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The scaled cputime is supposed to be derived from the normal per-thread
cputime by dividing it with the average thread density in the last interval.
The calculation of the scaling values for the average thread density is
incorrect. The current, incorrect calculation:
Ci = cycle count with i active threads
T = unscaled cputime, sT = scaled cputime
sT = T * (C1 + C2 + ... + Cn) / (1*C1 + 2*C2 + ... + n*Cn)
The calculation happens to yield the correct numbers for the simple cases
with only one Ci value not zero. But for cases with multiple Ci values not
zero it fails. E.g. on a SMT-2 system with one thread active half the time
and two threads active for the other half of the time it fails, the scaling
factor should be 3/4 but the formula gives 2/3.
The correct formula is
sT = T * (C1/1 + C2/2 + ... + Cn/n) / (C1 + C2 + ... + Cn)
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
Previously, the cpum_cf PMU returned -EPERM if a counter is requested and
the counter set to which the counter belongs is not authorized. According
to the perf_event_open() system call manual, an error code of EPERM indicates
an unsupported exclude setting or CAP_SYS_ADMIN is missing.
Use ENOENT to indicate that particular counters are not available when the
counter set which contains the counter is not authorized. For generic events,
this might trigger a fall back, for example, to a software event.
Signed-off-by: Hendrik Brueckner <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The uc_sigmask in the ucontext structure is an array of words to keep
the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t
only has 64 bits).
For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit
there are two 4 byte words. The compat signal handler code uses a
simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t.
As s390 is a big-endian architecture this is incorrect, the two words
in the 31 bit sigset_t array need to be swapped.
Cc: <[email protected]>
Reported-by: Stefan Liebler <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The critical section cleanup code misses to add the offset of the
thread_struct to the task address.
Therefore, if the critical section code gets executed, it may corrupt
the task struct or restore the contents of the floating point registers
from the wrong memory location.
Fixes d0164ee20d "s390/kernel: remove save_fpu_regs() parameter and use
__LC_CURRENT instead".
Signed-off-by: Heiko Carstens <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The swsusp_arch_suspend()/swsusp_arch_resume() functions currently only
save and restore the floating point registers. If the task that started
the hibernation process is using vector registers they can get lost.
To fix this just call save_fpu_regs in swsusp_arch_suspend(), the restore
will happen automatically on return to user space.
Reported-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The nf_log_unregister() function needs to call synchronize_rcu() to make sure
that the objects are not dereferenced anymore on module removal.
Fixes: 5962815a6a56 ("netfilter: nf_log: use an array of loggers instead of list")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
lead to a memory access using an incorrect address in certain sequences
headed by an ADRP instruction.
There is a linker fix to generate veneers for ADRP instructions, but
this doesn't work for kernel modules which are built as unlinked ELF
objects.
This patch adds a new config option for the erratum which, when enabled,
builds kernel modules with the mcmodel=large flag. This uses absolute
addressing for all kernel symbols, thereby removing the use of ADRP as
a PC-relative form of addressing. The ADRP relocs are removed from the
module loader so that we fail to load any potentially affected modules.
Cc: <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
When saving/restoring the VFP registers from a compat (AArch32)
signal frame, we rely on the compat registers forming a prefix of the
native register file and therefore make use of copy_{to,from}_user to
transfer between the native fpsimd_state and the compat_vfp_sigframe.
Unfortunately, this doesn't work so well in a big-endian environment.
Our fpsimd save/restore code operates directly on 128-bit quantities
(Q registers) whereas the compat_vfp_sigframe represents the registers
as an array of 64-bit (D) registers. The architecture packs the compat D
registers into the Q registers, with the least significant bytes holding
the lower register. Consequently, we need to swap the 64-bit halves when
converting between these two representations on a big-endian machine.
This patch replaces the __copy_{to,from}_user invocations in our
compat VFP signal handling code with explicit __put_user loops that
operate on 64-bit values and swap them accordingly.
Cc: <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
We have a couple of CPU hotplug notifiers for resetting the CPU debug
state to a sane value when a CPU comes online.
This patch ensures that we mask out CPU_TASKS_FROZEN so that we don't
miss any online events occuring due to suspend/resume.
Acked-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|