Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"A security fix (so a maliciously corrupted file system image won't
panic the kernel) and some fixes for CONFIG_VMAP_STACK"
* tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: sanity check the block and cluster size at mount time
fscrypto: don't use on-stack buffer for key derivation
fscrypto: don't use on-stack buffer for filename encryption
|
|
If the block size or cluster size is insane, reject the mount. This
is important for security reasons (although we shouldn't be just
depending on this check).
Ref: http://www.securityfocus.com/archive/1/539661
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
With the new (in 4.9) option to use a virtually-mapped stack
(CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
the scatterlist crypto API because they may not be directly mappable to
struct page. get_crypt_info() was using a stack buffer to hold the
output from the encryption operation used to derive the per-file key.
Fix it by using a heap buffer.
This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
because this allowed the BUG in sg_set_buf() to be triggered.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
With the new (in 4.9) option to use a virtually-mapped stack
(CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
the scatterlist crypto API because they may not be directly mappable to
struct page. For short filenames, fname_encrypt() was encrypting a
stack buffer holding the padded filename. Fix it by encrypting the
filename in-place in the output buffer, thereby making the temporary
buffer unnecessary.
This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
because this allowed the BUG in sg_set_buf() to be triggered.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Some I2C driver bugfixes (and one documentation fix)"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-mux-pca954x: fix deselect enabling for device-tree
i2c: digicolor: use clk_disable_unprepare instead of clk_unprepare
i2c: mux: fix up dependencies
i2c: Documentation: i2c-topology: fix minor whitespace nit
i2c: mux: demux-pinctrl: make drivers with no pinctrl work again
|
|
Pull KVM fixes from Radim Krčmář:
"ARM:
- Fix handling of the 32bit cycle counter
- Fix cycle counter filtering
x86:
- Fix a race leading to double unregistering of user notifiers
- Amend oversight in kvm_arch_set_irq that turned Hyper-V code dead
- Use SRCU around kvm_lapic_set_vapic_addr
- Avoid recursive flushing of asynchronous page faults
- Do not rely on deferred update in KVM_GET_CLOCK, which fixes #GP
- Let userspace know that KVM_GET_CLOCK is useful with master clock;
4.9 changed the return value to better match the guest clock, but
didn't provide means to let guests take advantage of it"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: merge kvm_arch_set_irq and kvm_arch_set_irq_inatomic
KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
KVM: async_pf: avoid recursive flushing of work items
kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
KVM: Disable irq while unregistering user notifier
KVM: x86: do not go through vcpu in __get_kvmclock_ns
KVM: arm64: Fix the issues when guest PMCCFILTR is configured
arm64: KVM: pmu: Fix AArch32 cycle counter access
|
|
Deselect functionality can be ignored for device-trees with
"i2c-mux-idle-disconnect" entries if no platform_data is available.
By enabling the deselect functionality outside the platform_data
block the logic works as it did in previous kernels.
Fixes: 7fcac9807175 ("i2c: i2c-mux-pca954x: convert to use an explicit i2c mux core")
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Alex Hemme <ahemme@cisco.com>
Signed-off-by: Ziyang Wu <ziywu@cisco.com>
[touched up a few minor issues /peda]
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fixes marked for stable:
- fix system reset interrupt winkle wakeups
- fix setting of AIL in hypervisor mode
Fixes for code merged this cycle:
- fix exception vector build with 2.23 era binutils
- fix missing update of HID register on secondary CPUs
Other:
- fix missing pr_cont()s
- invalidate ERAT on tlbiel for POWER9 DD1"
* tag 'powerpc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix missing update of HID register on secondary CPUs
powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1
powerpc/64: Fix setting of AIL in hypervisor mode
powerpc/oops: Fix missing pr_cont()s in instruction dump
powerpc/oops: Fix missing pr_cont()s in show_regs()
powerpc/oops: Fix missing pr_cont()s in print_msr_bits() et. al.
powerpc/oops: Fix missing pr_cont()s in show_stack()
powerpc: Fix exception vector build with 2.23 era binutils
powerpc/64s: Fix system reset interrupt winkle wakeups
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- Compiler warning in caam driver that was the last one remaining
- Do not register aes-xts in caam drivers on unsupported platforms
- Regression in algif_hash interface that may lead to an oops"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_hash - Fix NULL hash crash with shash
crypto: caam - fix type mismatch warning
crypto: caam - do not register AES-XTS mode on LP units
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED subsystem update from Jacek Anaszewski:
"I'd like to announce a new co-maintainer - Pavel Machek"
* tag 'leds_4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
MAINTAINERS: Add LED subsystem co-maintainer
|
|
git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"Some driver fixes which we pending in my tree:
- return error code fix in edma driver
- Kconfig fix for genric allocator in mmp_tdma
- fix uninitialized value in sun6i
- Runtime pm fixes for cppi"
* tag 'dmaengine-fix-4.9-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: cppi41: More PM runtime fixes
dmaengine: cpp41: Fix handling of error path
dmaengine: cppi41: Fix unpaired pm runtime when only a USB hub is connected
dmaengine: cppi41: Fix list not empty warning on module removal
dmaengine: sun6i: fix the uninitialized value for v_lli
dmaengine: mmp_tdma: add missing select GENERIC_ALLOCATOR in Kconfig
dmaengine: edma: Fix error return code in edma_alloc_chan_resources()
|
|
kvm_arch_set_irq is unused since commit b97e6de9c96. Merge
its functionality with kvm_arch_set_irq_inatomic.
Reported-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Reported by syzkaller:
[ INFO: suspicious RCU usage. ]
4.9.0-rc4+ #47 Not tainted
-------------------------------
./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
stack backtrace:
CPU: 1 PID: 6679 Comm: syz-executor Not tainted 4.9.0-rc4+ #47
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff880039e2f6d0 ffffffff81c2e46b ffff88003e3a5b40 0000000000000000
0000000000000001 ffffffff83215600 ffff880039e2f700 ffffffff81334ea9
ffffc9000730b000 0000000000000004 ffff88003c4f8420 ffff88003d3f8000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81c2e46b>] dump_stack+0xb3/0x118 lib/dump_stack.c:51
[<ffffffff81334ea9>] lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4445
[< inline >] __kvm_memslots include/linux/kvm_host.h:534
[< inline >] kvm_memslots include/linux/kvm_host.h:541
[<ffffffff8105d6ae>] kvm_gfn_to_hva_cache_init+0xa1e/0xce0 virt/kvm/kvm_main.c:1941
[<ffffffff8112685d>] kvm_lapic_set_vapic_addr+0xed/0x140 arch/x86/kvm/lapic.c:2217
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: fda4e2e85589191b123d31cdc21fd33ee70f50fd
Cc: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
This was reported by syzkaller:
[ INFO: possible recursive locking detected ]
4.9.0-rc4+ #49 Not tainted
---------------------------------------------
kworker/2:1/5658 is trying to acquire lock:
([ 1644.769018] (&work->work)
[< inline >] list_empty include/linux/compiler.h:243
[<ffffffff8128dd60>] flush_work+0x0/0x660 kernel/workqueue.c:1511
but task is already holding lock:
([ 1644.769018] (&work->work)
[<ffffffff812916ab>] process_one_work+0x94b/0x1900 kernel/workqueue.c:2093
stack backtrace:
CPU: 2 PID: 5658 Comm: kworker/2:1 Not tainted 4.9.0-rc4+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events async_pf_execute
ffff8800676ff630 ffffffff81c2e46b ffffffff8485b930 ffff88006b1fc480
0000000000000000 ffffffff8485b930 ffff8800676ff7e0 ffffffff81339b27
ffff8800676ff7e8 0000000000000046 ffff88006b1fcce8 ffff88006b1fccf0
Call Trace:
...
[<ffffffff8128ddf3>] flush_work+0x93/0x660 kernel/workqueue.c:2846
[<ffffffff812954ea>] __cancel_work_timer+0x17a/0x410 kernel/workqueue.c:2916
[<ffffffff81295797>] cancel_work_sync+0x17/0x20 kernel/workqueue.c:2951
[<ffffffff81073037>] kvm_clear_async_pf_completion_queue+0xd7/0x400 virt/kvm/async_pf.c:126
[< inline >] kvm_free_vcpus arch/x86/kvm/x86.c:7841
[<ffffffff810b728d>] kvm_arch_destroy_vm+0x23d/0x620 arch/x86/kvm/x86.c:7946
[< inline >] kvm_destroy_vm virt/kvm/kvm_main.c:731
[<ffffffff8105914e>] kvm_put_kvm+0x40e/0x790 virt/kvm/kvm_main.c:752
[<ffffffff81072b3d>] async_pf_execute+0x23d/0x4f0 virt/kvm/async_pf.c:111
[<ffffffff8129175c>] process_one_work+0x9fc/0x1900 kernel/workqueue.c:2096
[<ffffffff8129274f>] worker_thread+0xef/0x1480 kernel/workqueue.c:2230
[<ffffffff812a5a94>] kthread+0x244/0x2d0 kernel/kthread.c:209
[<ffffffff831f102a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
The reason is that kvm_put_kvm is causing the destruction of the VM, but
the page fault is still on the ->queue list. The ->queue list is owned
by the VCPU, not by the work items, so we cannot just add list_del to
the work item.
Instead, use work->vcpu to note async page faults that have been resolved
and will be processed through the done list. There is no need to flush
those.
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Userspace can read the exact value of kvmclock by reading the TSC
and fetching the timekeeping parameters out of guest memory. This
however is brittle and not necessary anymore with KVM 4.11. Provide
a mechanism that lets userspace know if the new KVM_GET_CLOCK
semantics are in effect, and---since we are at it---if the clock
is stable across all VCPUs.
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Function user_notifier_unregister should be called only once for each
registered user notifier.
Function kvm_arch_hardware_disable can be executed from an IPI context
which could cause a race condition with a VCPU returning to user mode
and attempting to unregister the notifier.
Signed-off-by: Ignacio Alvarado <ikalvarado@google.com>
Cc: stable@vger.kernel.org
Fixes: 18863bdd60f8 ("KVM: x86 shared msr infrastructure")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Going through the first VCPU is wrong if you follow a KVM_SET_CLOCK with
a KVM_GET_CLOCK immediately after, without letting the VCPU run and
call kvm_guest_time_update.
To fix this, compute the kvmclock value ourselves, using the master
clock (tsc, nsec) pair as the base and the host CPU frequency as
the scale.
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM updates for v4.9-rc6
- Fix handling of the 32bit cycle counter
- Fix cycle counter filtering
|
|
Simon Wunderlich says:
====================
Here are two batman-adv bugfix patches:
- Revert a splat on disabling interface which created another problem,
by Sven Eckelmann
- Fix error handling when the primary interface disappears during a
throughput meter test, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Drop duplicate header scatterlist.h from iommu_common.h.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
at91ether_start_xmit() does not check for dma mapping errors.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"They fix an ACPI thermal management regression introduced by a recent
FADT handling cleanup, an ACPI tools build issue introduced by a
recent ACPICA commit and a PCC mailbox initialization bug causing
lockdep to complain loudly.
Specifics:
- Revert a recent ACPICA cleanup that attempted to get rid of all
FADT version 2 legacy, but broke ACPI thermal management on at
least one system (Rafael Wysocki).
- Fix cross-compiled builds of ACPI tools that stopped working after
a recent cleanup related to the handling of header files in ACPICA
(Lv Zheng).
- Fix a locking issue in the PCC channel initialization code that
invokes devm_request_irq() under a spinlock (among other things)
and causes lockdep to complain (Hoan Tran)"
* tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/power/acpi: Remove direct kernel source include reference
mailbox: PCC: Fix lockdep warning when request PCC channel
Revert "ACPICA: FADT support cleanup"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"Here are some regression fixes for kbuild:
- modversion support for exported asm symbols (Nick Piggin). The
affected architectures need separate patches adding
asm-prototypes.h.
- fix rebuilds of lib-ksyms.o (Nick Piggin)
- -fno-PIE builds (Sebastian Siewior and Borislav Petkov). This is
not a kernel regression, but one of the Debian gcc package.
Nevertheless, it's quite annoying, so I think it should go into
mainline and stable now"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Steal gcc's pie from the very beginning
kbuild: be more careful about matching preprocessed asm ___EXPORT_SYMBOL
x86/kexec: add -fno-PIE
scripts/has-stack-protector: add -fno-PIE
kbuild: add -fno-PIE
kbuild: modversions for EXPORT_SYMBOL() for asm
kbuild: prevent lib-ksyms.o rebuilds
|
|
Pull nfsd bugfix from Bruce Fields:
"Just one fix for an NFS/RDMA crash"
* tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports
|
|
Mark me as a co-maintainer of LED subsystem.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
|
|
* acpica-fixes:
Revert "ACPICA: FADT support cleanup"
* acpi-cppc-fixes:
mailbox: PCC: Fix lockdep warning when request PCC channel
* acpi-tools-fixes:
tools/power/acpi: Remove direct kernel source include reference
|
|
Babu Moger says:
====================
Adjust lockdep static allocations for sparc
These patches limit the static allocations for lockdep data structures
used for debugging locking correctness. For sparc, all the kernel's code,
data, and bss, must have locked translations in the TLB so that we don't
get TLB misses on kernel code and data. Current sparc chips have 8 TLB
entries available that may be locked down, and with a 4mb page size,
this gives a maximum of 32MB. With PROVE_LOCKING we could go over this
limit and cause system boot-up problems. These patches limit the static
allocations so that everything fits in current required size limit.
patch 1 : Adds new config parameter CONFIG_PROVE_LOCKING_SMALL
Patch 2 : Adjusts the sizes based on the new config parameter
v2-> v3:
Some more comments from Sam Ravnborg and Peter Zijlstra.
Defined PROVE_LOCKING_SMALL as invisible and moved the selection to
arch/sparc/Kconfig.
v1-> v2:
As suggested by Peter Zijlstra, keeping the default as is.
Introduced new config variable CONFIG_PROVE_LOCKING_SMALL
to handle sparc specific case.
v0:
Initial revision.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reduce the size of data structure for lockdep entries by half if
PROVE_LOCKING_SMALL if defined. This is used only for sparc.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This new config parameter limits the space used for "Lock debugging:
prove locking correctness" by about 4MB. The current sparc systems have
the limitation of 32MB size for kernel size including .text, .data and
.bss sections. With PROVE_LOCKING feature, the kernel size could grow
beyond this limit and causing system boot-up issues. With this option,
kernel limits the size of the entries of lock_chains, stack_trace etc.,
so that kernel fits in required size limit. This is not visible to user
and only used for sparc.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that we're doing TEST_STATEID in nfs4_reclaim_open_state(), we can have
a NFS4ERR_OLD_STATEID returned from nfs41_open_expired() . Instead of
marking state recovery as failed, mark the state for recovery again.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
sunbmac uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.
e.g.
drivers/net/ethernet/sun/sunbmac.c: In function ‘bigmac_ether_init’:
drivers/net/ethernet/sun/sunbmac.c:1166: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
This patch resolves above compiler warning.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sunqe uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.
e.g.
drivers/net/ethernet/sun/sunqe.c: In function ‘qec_ether_init’:
drivers/net/ethernet/sun/sunqe.c:883: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
drivers/net/ethernet/sun/sunqe.c:885: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
This patch resolves above compiler warnings.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ensure we test to see if the open stateid is actually set, before we
send a CLOSE.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Tushar Dave says:
====================
sparc: Enable sun4v hypervisor PCI IOMMU v2 APIs and ATU
ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
sun4v hypervisor PCI IOMMU v2 APIs.
Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 500MB DVMA space per instance.
When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.
For example, we recently experienced legacy IOMMU limitation while
using i40e driver in system with large number of CPUs (e.g. 128).
Four ports of i40e, each request 128 QP (Queue Pairs). Each queue has
512 (default) descriptors. So considering only RX queues (because RX
premap DMA buffers), i40e takes 4*128*512 number of DMA entries in
IOMMU table. Legacy IOMMU can have at max (2G/8K)- 1 entries available
in table. So bringing up four instance of i40e alone saturate existing
IOMMU resource.
ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.
ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.
The patch set is tested on sun4v (T1000, T2000, T3, T4, T5, T7, S7)
and sun4u SPARC.
Thanks.
-Tushar
v2->v3:
- Patch #5 addresses comment by Joe Perches.
-- use %s, __func__ instead of embedding the function name.
v1->v2:
- Patch #2 addresses comments by Dave M.
-- use page allocator to allocate IOTSB.
-- use true/false with boolean variables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ATU 64bit addressing allows PCIe devices with 64bit DMA capabilities
to use ATU for 64bit DMA.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add Hypervisor IOMMU v2 APIs pci_iotsb_map(), pci_iotsb_demap() and
enable sun4v dma ops to use IOMMU v2 API for all PCIe devices with
64bit DMA mask.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to use Hypervisor (HV) IOMMU v2 API for map/demap, each PCIe
device has to be bound to IOTSB using HV API pci_iotsb_bind().
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Like legacy IOMMU, use common iommu_map_table and iommu_pool for ATU.
This change initializes iommu_map_table and iommu_pool for ATU.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.
Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.
ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.
ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change allows ATU (new IOMMU) in SPARC systems to request
large (32M) contiguous memory during boot for creating IOTSB backing
store.
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add missing NDA_VLAN attribute's size.
Fixes: 1e53d5bb8878 ("net: Pass VLAN ID to rtnl_fdb_notify.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The argument to get_net_ns_by_fd() is a /proc/$PID/ns/net file
descriptor not a pid. Fix the typo.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Rami Rosen <roszenrami@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A few more bugfixes:
* limit # of scan results stored in memory - this is a long-standing bug
Jouni and I only noticed while discussing other things in Santa Fe
* revert AP_LINK_PS patch that was causing issues (Felix)
* various A-MSDU/A-MPDU fixes for TXQ code (Felix)
* interoperability workaround for peers with broken VHT capabilities
(Filip Matusiak)
* add bitrate definition for a VHT MCS that's supposed to be invalid
but gets used by some hardware anyway (Thomas Pedersen)
* beacon timer fix in hwsim (Benjamin Beichler)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
converts schedule_timeout() to its freezable version, it was probably
correct at that time, but later, commit 2b514574f7e8
("net: af_unix: implement splice for stream af_unix sockets") breaks
the strong requirement for a freezable sleep, according to
commit 0f9548ca1091:
We shouldn't try_to_freeze if locks are held. Holding a lock can cause a
deadlock if the lock is later acquired in the suspend or hibernate path
(e.g. by dpm). Holding a lock can also cause a deadlock in the case of
cgroup_freezer if a lock is held inside a frozen cgroup that is later
acquired by a process outside that group.
The pipe_lock is still held at that point.
So use freezable version only for the recvmsg call path, avoid impact for
Android.
Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Colin Cross <ccross@android.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Johan Hovold says:
====================
net: cpsw: fix leaks and probe deferral
This series fixes as number of leaks and issues in the cpsw probe-error
and driver-unbind paths, some which specifically prevented deferred
probing.
v2
- Keep platform device runtime-resumed throughout probe instead of
resuming in the probe error path as suggested by Grygorii (patch
1/7).
- Runtime-resume platform device before registering any children in
order to make sure it is synchronously suspended after deregistering
children in the error path (patch 3/7).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure to propagate errors from of_phy_register_fixed_link() which
can fail with -EPROBE_DEFER.
Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link
PHY")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure to check for allocation failures before dereferencing a
NULL-pointer during probe.
Fixes: 649a1688c960 ("net: ethernet: ti: cpsw: create common struct to
hold shared driver data")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure to deregister the primary device in case the secondary emac
fails to probe.
kernel BUG at /home/johan/work/omicron/src/linux/net/core/dev.c:7743!
...
[<c05b3dec>] (free_netdev) from [<c04fe6c0>] (cpsw_probe+0x9cc/0xe50)
[<c04fe6c0>] (cpsw_probe) from [<c047b28c>] (platform_drv_probe+0x5c/0xc0)
Fixes: d9ba8f9e6298 ("driver: net: ethernet: cpsw: dual emac interface
implementation")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure to drop references taken and deregister devices registered
during probe on probe errors (including deferred probe) and driver
unbind.
Specifically, PHY of-node references were never released and fixed-link
PHY devices were never deregistered.
Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing")
Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link
PHY")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure to deregister all child devices also on probe errors to avoid
leaks and to fix probe deferral:
cpsw 4a100000.ethernet: omap_device: omap_device_enable() called from invalid state 1
cpsw 4a100000.ethernet: use pm_runtime_put_sync_suspend() in driver?
cpsw: probe of 4a100000.ethernet failed with error -22
Add generic helper to undo the effects of cpsw_probe_dt(), which will
also be used in a follow-on patch to fix further leaks that have been
introduced more recently.
Note that the platform device is now runtime-resumed before registering
any child devices in order to make sure that it is synchronously
suspended after having deregistered the children in the error path.
Fixes: 1fb19aa730e4 ("net: cpsw: Add parent<->child relation support
between cpsw and mdio")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|