Age | Commit message (Collapse) | Author | Files | Lines |
|
'best' is always less or equals to 'pos', so `best - pos' returns
a negative value which is then getting casted to `unsigned int'
and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
for policy->freq_table selection. This results in
BUG: unable to handle kernel paging request at ffff881019b469f8
IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
PGD 267f067
PUD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
Workqueue: events dbs_work_handler
task: ffff88041b808000 task.stack: ffff88041b810000
RIP: 0010:[<ffffffffa00356c1>] [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
RSP: 0018:ffff88041b813c60 EFLAGS: 00010282
RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
FS: 0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
Stack:
ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
Call Trace:
[<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
[<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
[<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
[<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
[<ffffffff81336fda>] dbs_work_handler+0x39/0x62
[<ffffffff81057823>] process_one_work+0x280/0x4a5
[<ffffffff81058719>] worker_thread+0x24f/0x397
[<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
[<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
[<ffffffff8105d2b7>] kthread+0xfc/0x104
[<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
[<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
[<ffffffff814b2092>] ret_from_fork+0x22/0x30
Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
RIP [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
RSP <ffff88041b813c60>
CR2: ffff881019b469f8
---[ end trace 16d9fc7a17897d37 ]---
[ rjw: In some cases this bug may also cause incorrect frequencies to
be selected by cpufreq governors. ]
Fixes: 899bb6642f2a (cpufreq: skip invalid entries when searching the frequency)
Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2
Reported-and-tested-by: Sedat Dilek <[email protected]>
Reported-and-tested-by: Jörg Otte <[email protected]>
Signed-off-by: Sergey Senozhatsky <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Cc: 4.8+ <[email protected]> # 4.8+
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
The following commit:
c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")
... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.
This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.
We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.
The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions. This is not a
problem when keeping these members within thread_info.
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Asking for a non-current task's stack can't be done without races
unless the task is frozen in kernel mode. As far as I know,
vm_is_stack_for_task() never had a safe non-current use case.
The __unused annotation is because some KSTK_ESP implementations
ignore their parameter, which IMO is further justification for this
patch.
Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Linux API <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tycho Andersen <[email protected]>
Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This allows the file system to tell a FIEMAP from a read operation, and thus
avoids the need to report flags that aren't actually used in the read path.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
|
|
This patch addresses a bug where EXTENDED_COPY across multiple LUNs
results in a CHECK_CONDITION when the source + destination are not
located on the same physical node.
ESX Host environments expect sense COPY_ABORTED w/ COPY TARGET DEVICE
NOT REACHABLE to be returned when this occurs, in order to signal
fallback to local copy method.
As described in section 6.3.3 of spc4r22:
"If it is not possible to complete processing of a segment because the
copy manager is unable to establish communications with a copy target
device, because the copy target device does not respond to INQUIRY,
or because the data returned in response to INQUIRY indicates
an unsupported logical unit, then the EXTENDED COPY command shall be
terminated with CHECK CONDITION status, with the sense key set to
COPY ABORTED, and the additional sense code set to COPY TARGET DEVICE
NOT REACHABLE."
Tested on v4.1.y with ESX v5.5u2+ with BlockCopy across multiple nodes.
Reported-by: Nixon Vincent <[email protected]>
Tested-by: Nixon Vincent <[email protected]>
Cc: Nixon Vincent <[email protected]>
Tested-by: Dinesh Israni <[email protected]>
Signed-off-by: Dinesh Israni <[email protected]>
Cc: Dinesh Israni <[email protected]>
Cc: [email protected] # 3.14+
Signed-off-by: Nicholas Bellinger <[email protected]>
|
|
Ported over from nvme-cli.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This makes life easier for nvme-cli and we don't really need the uuid
type anyway to start with.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Jay Freyensee <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Import a few updates to nvme.h from nvme-cli. This mostly includes a few
new fields and error codes, but also a few renames that so far are only
used in user space. Also one field is moved from an array of two le64
values to one of 16 u8 values so that we can more easily access it.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
NVMe 1.2.1 specification adds a tertiary element to the version number.
This updates the macro and its callers to include the final number and
fixup a single place in nvmet where the version was generated manually.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Merge the gup_flags cleanups from Lorenzo Stoakes:
"This patch series adjusts functions in the get_user_pages* family such
that desired FOLL_* flags are passed as an argument rather than
implied by flags.
The purpose of this change is to make the use of FOLL_FORCE explicit
so it is easier to grep for and clearer to callers that this flag is
being used. The use of FOLL_FORCE is an issue as it overrides missing
VM_READ/VM_WRITE flags for the VMA whose pages we are reading
from/writing to, which can result in surprising behaviour.
The patch series came out of the discussion around commit 38e088546522
("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
which addressed a BUG_ON() being triggered when a page was faulted in
with PROT_NONE set but having been overridden by FOLL_FORCE.
do_numa_page() was run on the assumption the page _must_ be one marked
for NUMA node migration as an actual PROT_NONE page would have been
dealt with prior to this code path, however FOLL_FORCE introduced a
situation where this assumption did not hold.
See
https://marc.info/?l=linux-mm&m=147585445805166
for the patch proposal"
Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.
[ This branch was rebased recently to add a few more acked-by's and
reviewed-by's ]
* gup_flag-cleanups:
mm: replace access_process_vm() write parameter with gup_flags
mm: replace access_remote_vm() write parameter with gup_flags
mm: replace __access_remote_vm() write parameter with gup_flags
mm: replace get_user_pages_remote() write/force parameters with gup_flags
mm: replace get_user_pages() write/force parameters with gup_flags
mm: replace get_vaddr_frames() write/force parameters with gup_flags
mm: replace get_user_pages_locked() write/force parameters with gup_flags
mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
mm: remove write/force parameters from __get_user_pages_unlocked()
mm: remove write/force parameters from __get_user_pages_locked()
mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
|
|
This removes the 'write' argument from access_process_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Jesper Nilsson <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This removes the 'write' argument from access_remote_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Jesper Nilsson <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This removes the 'write' and 'force' from get_vaddr_frames() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This removes the 'write' and 'force' use from get_user_pages_locked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The offload flag is a status flag and should not be used by
FIB semantics for comparison.
Fixes: 37ed9493699c ("rtnetlink: add RTNH_F_EXTERNAL flag for fib offload")
Signed-off-by: Jiri Pirko <[email protected]>
Reviewed-by: Andy Gospodarek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Tamir reported the following trace when processing ARP requests received
via a vlan device on top of a VLAN-aware bridge:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
[...]
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.8.0-rc7 #1
Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
task: ffff88017edfea40 task.stack: ffff88017ee10000
RIP: 0010:[<ffffffff815dcc73>] [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
[...]
Call Trace:
<IRQ>
[<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
[<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
[<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
[<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
[<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
[<ffffffff815f2eb6>] neigh_update+0x306/0x740
[<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
[<ffffffff8165ea3f>] arp_process+0x66f/0x700
[<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
[<ffffffff8165ec29>] arp_rcv+0x139/0x1d0
[<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
[<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
[<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
[<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
[<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
[<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
[<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
[<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
[<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
[<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
[<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
[<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
[<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
[<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
[<ffffffff81374815>] ? find_next_bit+0x15/0x20
[<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
[<ffffffff810c9968>] ? load_balance+0x178/0x9b0
[<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
[<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
[<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
[<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
[<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
[<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
[<ffffffff81091986>] tasklet_action+0xf6/0x110
[<ffffffff81704556>] __do_softirq+0xf6/0x280
[<ffffffff8109213f>] irq_exit+0xdf/0xf0
[<ffffffff817042b4>] do_IRQ+0x54/0xd0
[<ffffffff8170214c>] common_interrupt+0x8c/0x8c
The problem is that netdev_all_lower_get_next_rcu() never advances the
iterator, thereby causing the loop over the lower adjacency list to run
forever.
Fix this by advancing the iterator and avoid the infinite loop.
Fixes: 7ce856aaaf13 ("mlxsw: spectrum: Add couple of lower device helper functions")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Tamir Winetroub <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This removes the redundant 'write' and 'force' parameters from
__get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
Reported-and-tested-by: Phil "not Paul" Oester <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Four tooling fixes, two kprobes KASAN related fixes and an x86 PMU
driver fix/cleanup"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf jit: Fix build issue on Ubuntu
perf jevents: Handle events including .c and .o
perf/x86/intel: Remove an inconsistent NULL check
kprobes: Unpoison stack in jprobe_return() for KASAN
kprobes: Avoid false KASAN reports during stack copy
perf header: Set nr_numa_nodes only when we parsed all the data
perf top: Fix refreshing hierarchy entries on TUI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
This is relatively small, mostly to get the SG/crypto
from stack removal fix that crashes things when VMAP
stack is used in conjunction with software crypto.
Aside from that, we have:
* a fix for AP_VLAN usage with the nl80211 frame command
* two fixes (and two preparation patches) for A-MSDU, one
to discard group-addressed (multicast) and unexpected
4-address A-MSDUs, the other to validate A-MSDU inner
MAC addresses properly to prevent controlled port bypass
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The new introduced macro CLK_OF_DECLARE_DRIVER is usually used to
declare clock driver init functions, which are mostly decorated with
__init. Add __init decoration for CLK_OF_DECLARE_DRIVER function to
avoid causing section mismatch warnings on client clock drivers.
Signed-off-by: Shawn Guo <[email protected]>
Fixes: c7296c51ce5d ("clk: core: New macro CLK_OF_DECLARE_DRIVER")
Signed-off-by: Stephen Boyd <[email protected]>
|
|
pkey_set() and pkey_get() were syscalls present in older versions
of the protection keys patches. They were fully excised from the
x86 code, but some cruft was left in the generic syscall code. The
C++ comments were intended to help to make it more glaring to me to
fix them before actually submitting them. That technique worked,
but later than I would have liked.
I test-compiled this for arm64.
Fixes: a60f7b69d92c0 ("generic syscalls: Wire up memory protection keys syscalls")
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Entry Size in GITS_BASER<n> occupies 5 bits [52:48], but we mask out 8
bits.
Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue")
Cc: [email protected]
Signed-off-by: Vladimir Murzin <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Currently, socket lookups for l3mdev (vrf) use cases can match a socket
that is bound to a port but not a device (ie., a global socket). If the
sysctl tcp_l3mdev_accept is not set this leads to ack packets going out
based on the main table even though the packet came in from an L3 domain.
The end result is that the connection does not establish creating
confusion for users since the service is running and a socket shows in
ss output. Fix by requiring an exact dif to sk_bound_dev_if match if the
skb came through an interface enslaved to an l3mdev device and the
tcp_l3mdev_accept is not set.
skb's through an l3mdev interface are marked by setting a flag in
inet{6}_skb_parm. The IPv6 variant is already set; this patch adds the
flag for IPv4. Using an skb flag avoids a device lookup on the dif. The
flag is set in the VRF driver using the IP{6}CB macros. For IPv4, the
inet_skb_parm struct is moved in the cb per commit 971f10eca186, so the
match function in the TCP stack needs to use TCP_SKB_CB. For IPv6, the
move is done after the socket lookup, so IP6CB is used.
The flags field in inet_skb_parm struct needs to be increased to add
another flag. There is currently a 1-byte hole following the flags,
so it can be expanded to u16 without increasing the size of the struct.
Fixes: 193125dbd8eb ("net: Introduce VRF device driver")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When CONFIG_PCC is disabled, pcc_mbox_request_channel() needs to
return ERR_PTR(-ENODEV), not a NULL pointer, as the callers of
this function use IS_ERR() to check for error code.
Signed-off-by: Duc Dang <[email protected]>
Signed-off-by: Hoan Tran <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
I observed false KSAN positives in the sctp code, when
sctp uses jprobe_return() in jsctp_sf_eat_sack().
The stray 0xf4 in shadow memory are stack redzones:
[ ] ==================================================================
[ ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
[ ] Read of size 1 by task syz-executor/18535
[ ] page:ffffea00017923c0 count:0 mapcount:0 mapping: (null) index:0x0
[ ] flags: 0x1fffc0000000000()
[ ] page dumped because: kasan: bad access detected
[ ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
[ ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ ] ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
[ ] ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
[ ] ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
[ ] Call Trace:
[ ] [<ffffffff82d2b849>] dump_stack+0x12e/0x185
[ ] [<ffffffff817d3169>] kasan_report+0x489/0x4b0
[ ] [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
[ ] [<ffffffff82d49529>] memcmp+0xe9/0x150
[ ] [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
[ ] [<ffffffff817d2031>] save_stack+0xb1/0xd0
[ ] [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
[ ] [<ffffffff817d05b8>] kfree+0xc8/0x2a0
[ ] [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
[ ] [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
[ ] [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
[ ] [<ffffffff85b11348>] consume_skb+0x138/0x370
[ ] [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
[ ] [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
[ ] [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
[ ] [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
[ ] [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
[ ] [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
[ ... ]
[ ] Memory state around the buggy address:
[ ] ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ^
[ ] ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ ] ==================================================================
KASAN stack instrumentation poisons stack redzones on function entry
and unpoisons them on function exit. If a function exits abnormally
(e.g. with a longjmp like jprobe_return()), stack redzones are left
poisoned. Later this leads to random KASAN false reports.
Unpoison stack redzones in the frames we are going to jump over
before doing actual longjmp in jprobe_return().
Signed-off-by: Dmitry Vyukov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc plugins update from Kees Cook:
"This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot
time as possible, hoping to capitalize on any possible variation in
CPU operation (due to runtime data differences, hardware differences,
SMP ordering, thermal timing variation, cache behavior, etc).
At the very least, this plugin is a much more comprehensive example
for how to manipulate kernel code using the gcc plugin internals"
* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Mark functions with __latent_entropy
gcc-plugins: Add latent_entropy plugin
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"Some fixes from Omar and Dave Sterba for our new free space tree.
This isn't heavily used yet, but as we move toward making it the new
default we wanted to nail down an endian bug"
* 'for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: tests: uninline member definitions in free_space_extent
btrfs: tests: constify free space extent specs
Btrfs: expand free space tree sanity tests to catch endianness bug
Btrfs: fix extent buffer bitmap tests on big-endian systems
Btrfs: catch invalid free space trees
Btrfs: fix mount -o clear_cache,space_cache=v2
Btrfs: fix free space tree bitmaps on big-endian systems
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"This update contains fixes to the "use mounter's permission to access
underlying layers" area, and miscellaneous other fixes and cleanups.
No new features this time"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: use vfs_get_link()
vfs: add vfs_get_link() helper
ovl: use generic_readlink
ovl: explain error values when removing acl from workdir
ovl: Fix info leak in ovl_lookup_temp()
ovl: during copy up, switch to mounter's creds early
ovl: lookup: do getxattr with mounter's permission
ovl: copy_up_xattr(): use strnlen
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:
- EXPORT_SYMBOL for asm source by Al Viro.
This does bring a regression, because genksyms no longer generates
checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
working on a patch to fix this.
Plus, we are talking about functions like strcpy(), which rarely
change prototypes.
- Fixes for PPC fallout of the above by Stephen Rothwell and Nick
Piggin
- fixdep speedup by Alexey Dobriyan.
- preparatory work by Nick Piggin to allow architectures to build with
-ffunction-sections, -fdata-sections and --gc-sections
- CONFIG_THIN_ARCHIVES support by Stephen Rothwell
- fix for filenames with colons in the initramfs source by me.
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
initramfs: Escape colons in depfile
ppc: there is no clear_pages to export
powerpc/64: whitelist unresolved modversions CRCs
kbuild: -ffunction-sections fix for archs with conflicting sections
kbuild: add arch specific post-link Makefile
kbuild: allow archs to select link dead code/data elimination
kbuild: allow architectures to use thin archives instead of ld -r
kbuild: Regenerate genksyms lexer
kbuild: genksyms fix for typeof handling
fixdep: faster CONFIG_ search
ia64: move exports to definitions
sparc32: debride memcpy.S a bit
[sparc] unify 32bit and 64bit string.h
sparc: move exports to definitions
ppc: move exports to definitions
arm: move exports to definitions
s390: move exports to definitions
m68k: move exports to definitions
alpha: move exports to actual definitions
x86: move exports to actual definitions
...
|
|
Pull one more documentation update from Jonathan Corbet:
"A single commit converting the mac80211 DocBook template over to
Sphinx. Only 32 more to go..."
* tag 'docs-4.9-2' of git://git.lwn.net/linux:
docs-rst: sphinxify 802.11 documentation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma qedr RoCE driver from Doug Ledford:
"Early on in the merge window I mentioned I had a backlog of new
drivers waiting to be reviewed and that, in addition to the hns-roce
driver, I wanted to get possible a couple more reviewed. I ended up
only having the time to complete one of the additional drivers.
During Dave Miller's pull request this go around, there were a series
of 9 patches to the QLogic qed net driver that add basic support for a
paired RoCE driver. That support is currently not functional because
it is missing the matching RoCE driver in the RDMA subsystem. I
managed to finish that review. However, because it goes against part
of Dave's net pull, and a part that was accepted a day or two after
the merge window opened, to apply cleanly it has to be applied to
either the tip of Dave's net branch, or as I did in this case, I just
applied it to your master after you had taken Dave's pull request."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
qedr: Add events support and register IB device
qedr: Add GSI support
qedr: Add LL2 RoCE interface
qedr: Add support for data path
qedr: Add support for memory registeration verbs
qedr: Add support for QP verbs
qedr: Add support for PD,PKEY and CQ verbs
qedr: Add support for user context verbs
qedr: Add support for RoCE HW init
qedr: Add RoCE driver framework
|
|
Sparse was complaining when we went to prototype some code
using ethtool_cmd_speed_set and SPEED_100000, which uses
the upper 16 bits of __u32 speed for the first time.
CHECK
...
.../uapi/linux/ethtool.h:123:28: warning:
cast truncates bits from constant value (186a0 becomes 86a0)
The warning is actually bogus, as no bits are really lost, but
we can get rid of the sparse warning with this one small change.
Reported-by: Preethi Banala <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"This includes a couple of fixes needed after recent changes, two ACPI
driver fixes (fan and "Processor Aggregator"), an update of the ACPI
device properties handling code and a new MAINTAINERS entry for ACPI
on ARM64.
Specifics:
- Fix an unused function warning that started to appear after recent
changes in the ACPI EC driver (Eric Biggers).
- Fix the KERN_CONT usage in acpi_os_vprintf() that has become
(particularly) annoying recently (Joe Perches).
- Fix the fan status checking in the ACPI fan driver to avoid
returning incorrect error codes sometimes (Srinivas Pandruvada).
- Fix the ACPI Processor Aggregator driver (PAD) to always let the
special processor_aggregator driver from Xen take over when running
as Xen dom0 (Juergen Gross).
- Update the handling of reference device properties in ACPI by
allowing empty rows ("holes") to appear in reference property lists
(Mika Westerberg).
- Add a new MAINTAINERS entry for ACPI on ARM64 (Lorenzo Pieralisi)"
* tag 'acpi-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
acpi_os_vprintf: Use printk_get_level() to avoid unnecessary KERN_CONT
ACPI / PAD: don't register acpi_pad driver if running as Xen dom0
ACPI / property: Allow holes in reference properties
MAINTAINERS: Add ARM64-specific ACPI maintainers entry
ACPI / EC: Fix unused function warning when CONFIG_PM_SLEEP=n
ACPI / fan: Fix error reading cur_state
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"This includes a couple of fixes for cpufreq regressions introduced in
4.8, a rework of the intel_pstate algorithm used on Atom processors
(that took some time to test) plus a fix and a couple of cleanups in
that driver, a CPPC cpufreq driver fix, and a some devfreq fixes and
cleanups (core and exynos-nocp).
Specifics:
- Fix two cpufreq regressions causing undesirable changes in behavior
to appear (one in the core and one in the conservative governor)
introduced during the 4.8 cycle (Aaro Koskinen, Rafael Wysocki).
- Fix the way the intel_pstate driver accesses MSRs related to the
hardware-managed P-states (HWP) feature during the initialization
which currently is unsafe and may cause the processor to generate a
general protection fault (Srinivas Pandruvada).
- Rework the intel_pstate's P-state selection algorithm used on Atom
processors to avoid known problems with the current one and to make
the computation more straightforward, which also happens to improve
performance in multiple benchmarks a bit (Rafael Wysocki).
- Improve two comments in the intel_pstate driver (Rafael Wysocki).
- Fix the desired performance computation in the CPPC cpufreq driver
(Hoan Tran).
- Fix the devfreq core to avoid printing misleading error messages in
some cases (Tobias Jakobi).
- Fix the error code path in devfreq_add_device() to use proper
locking around list modifications (Axel Lin).
- Fix a build failure and remove a couple of redundant updates of
variables in the exynos-nocp devfreq driver (Axel Lin)"
* tag 'pm-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: CPPC: Correct desired_perf calculation
cpufreq: conservative: Fix next frequency selection
cpufreq: skip invalid entries when searching the frequency
cpufreq: intel_pstate: Fix struct pstate_adjust_policy kerneldoc
cpufreq: intel_pstate: Proportional algorithm for Atom
PM / devfreq: Skip status update on uninitialized previous_freq
PM / devfreq: Add proper locking around list_del()
PM / devfreq: exynos-nocp: Remove redundant code
PM / devfreq: exynos-nocp: Select REGMAP_MMIO
cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance()
cpufreq: intel_pstate: Fix unsafe HWP MSR access
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- tracepoints for basic cgroup management operations added
- kernfs and cgroup path formatting functions updated to behave in the
style of strlcpy()
- non-critical bug fixes
* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
cgroup: fix error handling regressions in proc_cgroup_show() and cgroup_release_agent()
cpuset: fix error handling regression in proc_cpuset_show()
cgroup: add tracepoints for basic operations
cgroup: make cgroup_path() and friends behave in the style of strlcpy()
kernfs: remove kernfs_path_len()
kernfs: make kernfs_path*() behave in the style of strlcpy()
kernfs: add dummy implementation of kernfs_path_from_node()
|
|
Add support for Queue Pair verbs which adds, deletes,
modifies and queries Queue Pairs.
Signed-off-by: Rajesh Borundia <[email protected]>
Signed-off-by: Ram Amrani <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add support for protection domain and completion queue verbs.
Signed-off-by: Rajesh Borundia <[email protected]>
Signed-off-by: Ram Amrani <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add support for ucontext, query port, add and del gid verbs.
Signed-off-by: Rajesh Borundia <[email protected]>
Signed-off-by: Ram Amrani <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Adds a skeletal implementation of the qed* RoCE driver -
basically the ability to communicate with the qede driver and
receive notifications from it regarding various init/exit events.
Signed-off-by: Rajesh Borundia <[email protected]>
Signed-off-by: Ram Amrani <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:
- Nick improved generic implementations of percpu operations which
modify the variable and return so that they calculate the physical
address only once.
- percpu_ref percpu <-> atomic mode switching improvements. The
patchset was originally posted about a year ago but fell through the
crack.
- misc non-critical fixes.
* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
mm/percpu.c: fix potential memory leakage for pcpu_embed_first_chunk()
mm/percpu.c: correct max_distance calculation for pcpu_embed_first_chunk()
percpu: eliminate two sparse warnings
percpu: improve generic percpu modify-return implementation
percpu-refcount: init ->confirm_switch member properly
percpu_ref: allow operation mode switching operations to be called concurrently
percpu_ref: restructure operation mode switching
percpu_ref: unify staggered atomic switching wait behavior
percpu_ref: reorganize __percpu_ref_switch_to_atomic() and relocate percpu_ref_switch_to_atomic()
percpu_ref: remove unnecessary RCU grace period for staggered atomic switching confirmation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:
- Write same support added
- Minor ahci MSIX irq handling updates
- Non-critical SCSI command translation fixes
- Controller specific changes
* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: qoriq: Revert "ahci: qoriq: Disable NCQ on ls2080a SoC"
libata: remove <asm-generic/libata-portmap.h>
libata: remove unused definitions from <asm/libata-portmap.h>
pata_at91: Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
ata: Replace BUG() with BUG_ON().
ata: sata_mv: Replacing dma_pool_alloc and memset with a single call dma_pool_zalloc.
libata: Some drives failing on SCT Write Same
ahci: use pci_alloc_irq_vectors
libata: SCT Write Same handle ATA_DFLAG_PIO
libata: SCT Write Same / DSM Trim
libata: Add support for SCT Write Same
libata: Safely overwrite attached page in WRITE SAME xlat
ahci: also use a per-port lock for the multi-MSIX case
ARM: dts: STiH407-family: Add ports-implemented property in sata nodes
ahci: st: Add ports-implemented property in support
ahci: qoriq: enable snoopable sata read and write
ahci: qoriq: adjust sata parameter
libata-scsi: fix MODE SELECT translation for Control mode page
libata-scsi: use u8 array to store mode page copy
|
|
This easy-to-trigger warning shows up instantly when running
Trinity on a kernel with CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS disabled.
At most this should have been a printk, but the -EINVAL alone should be more
than adequate indicator that something isn't available.
Signed-off-by: Dave Jones <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
"Some more powerpc updates for 4.9:
Freescale updates from Scott Wood:
- qbman support (a prerequisite for datapath drivers such as ethernet)
- a PCI DMA fix+improvement
- reset handler changes
- more 8xx optimizations
- some cleanups and fixes.'
Fixes:
- selftests/powerpc: Add missing binaries to .gitignores (Michael Ellerman)
- selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes (Michael Ellerman)
- powerpc/pseries: Fix stack corruption in htpe code (Laurent Dufour)
- powerpc/64s: Fix power4_fixup_nap placement (Nicholas Piggin)
- powerpc/64: Fix incorrect return value from __copy_tofrom_user (Paul Mackerras)
- powerpc/mm/hash64: Fix might_have_hea() check (Michael Ellerman)
Other:
- MAINTAINERS: Remove myself from PA Semi entries (Olof Johansson)
- MAINTAINERS: Drop separate pseries entry (Michael Ellerman)
- MAINTAINERS: Update powerpc website & add selftests (Michael Ellerman):
* tag 'powerpc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (35 commits)
powerpc/mm/hash64: Fix might_have_hea() check
powerpc/64: Fix incorrect return value from __copy_tofrom_user
powerpc/64s: Fix power4_fixup_nap placement
powerpc/pseries: Fix stack corruption in htpe code
selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes
MAINTAINERS: Update powerpc website & add selftests
MAINTAINERS: Drop separate pseries entry
MAINTAINERS: Remove myself from PA Semi entries
selftests/powerpc: Add missing binaries to .gitignores
arch/powerpc: Add CONFIG_FSL_DPAA to corenetXX_smp_defconfig
soc/qman: Add self-test for QMan driver
soc/bman: Add self-test for BMan driver
soc/fsl: Introduce DPAA 1.x QMan device driver
soc/fsl: Introduce DPAA 1.x BMan device driver
powerpc/8xx: make user addr DTLB miss the short path
powerpc/8xx: Move additional DTLBMiss handlers out of exception area
powerpc/8xx: use r3 to scratch CR in ITLBmiss
soc/fsl/qe: fix gpio save_regs functions
powerpc/8xx: add dedicated machine check handler
powerpc/8xx: add system_reset_exception
...
|
|
The qedr driver would require a tristate Kconfig option [to allow
it to compile as a module], and toward that end we've added the
INFINIBAND_QEDR option. But as we've made the compilation of the
qed/qede infrastructure required for RoCE dependent on the option
we'd be facing linking difficulties in case that QED=y or QEDE=y,
and INFINIBAND_QEDR=m.
To resolve this, we seperate between the INFINIBAND_QEDR option
and the infrastructure support in qed/qede by introducing a new
QED_RDMA option which would be selected by INFINIBAND_QEDR but would
be a boolean instead of a tristate; Following that, the qed/qede is
fixed based on this new option so that all config combinations would
be supported.
Fixes: cee9fbd8e2e9 ("qede: add qedr framework")
Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Yuval Mintz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The IPv6 temporary address generation uses a variable called DESYNC_FACTOR
to prevent hosts updating the addresses at the same time. Quoting RFC 4941:
... The value DESYNC_FACTOR is a random value (different for each
client) that ensures that clients don't synchronize with each other and
generate new addresses at exactly the same time ...
DESYNC_FACTOR is defined as:
DESYNC_FACTOR -- A random value within the range 0 - MAX_DESYNC_FACTOR.
It is computed once at system start (rather than each time it is used)
and must never be greater than (TEMP_VALID_LIFETIME - REGEN_ADVANCE).
First, I believe the RFC has a typo in it and meant to say: "and must
never be greater than (TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE)"
The reason is that at various places in the RFC, DESYNC_FACTOR is used in
a calculation like (TEMP_PREFERRED_LIFETIME - DESYNC_FACTOR) or
(TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE - DESYNC_FACTOR). It needs to be
smaller than (TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE) for the result of
these calculations to be larger than zero. It's never used in a
calculation together with TEMP_VALID_LIFETIME.
I already submitted an errata to the rfc-editor:
https://www.rfc-editor.org/errata_search.php?rfc=4941
The Linux implementation of DESYNC_FACTOR is very wrong:
max_desync_factor is used in places DESYNC_FACTOR should be used.
max_desync_factor is initialized to the RFC-recommended value for
MAX_DESYNC_FACTOR (600) but the whole point is to get a _random_ value.
And nothing ensures that the value used is not greater than
(TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE), which leads to underflows. The
effect can easily be observed when setting the temp_prefered_lft sysctl
e.g. to 60. The preferred lifetime of the temporary addresses will be
bogus.
TEMP_PREFERRED_LIFETIME and REGEN_ADVANCE are not constants and can be
influenced by these three sysctls: regen_max_retry, dad_transmits and
temp_prefered_lft. Thus, the upper bound for desync_factor needs to be
re-calculated each time a new address is generated and if desync_factor is
larger than the new upper bound, a new random value needs to be
re-generated.
And since we already have max_desync_factor configurable per interface, we
also need to calculate and store desync_factor per interface.
Signed-off-by: Jiri Bohac <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The randomized interface identifier (rndid) was periodically updated from
the regen_timer timer. Simplify the code by updating the rndid only when
needed by ipv6_try_regen_rndid().
This makes the follow-up DESYNC_FACTOR fix much simpler. Also it fixes a
reference counting error in this error path, where an in6_dev_put was
missing:
err = addrconf_sysctl_register(ndev);
if (err) {
ipv6_mc_destroy_dev(ndev);
- del_timer(&ndev->regen_timer);
snmp6_unregister_dev(ndev);
goto err_release;
Signed-off-by: Jiri Bohac <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|