aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-08-28bridge: check for null fdb->dst before notifying switchdev driversRoopa Prabhu1-1/+1
current switchdev drivers dont seem to support offloading fdb entries pointing to the bridge device which have fdb->dst not set to any port. This patch adds a NULL fdb->dst check in the switchdev notifier code. This patch fixes the below NULL ptr dereference: $bridge fdb add 00:02:00:00:00:33 dev br0 self [ 69.953374] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 69.954044] IP: br_switchdev_fdb_notify+0x29/0x80 [ 69.954044] PGD 66527067 [ 69.954044] P4D 66527067 [ 69.954044] PUD 7899c067 [ 69.954044] PMD 0 [ 69.954044] [ 69.954044] Oops: 0000 [#1] SMP [ 69.954044] Modules linked in: [ 69.954044] CPU: 1 PID: 3074 Comm: bridge Not tainted 4.13.0-rc6+ #1 [ 69.954044] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 [ 69.954044] task: ffff88007b827140 task.stack: ffffc90001564000 [ 69.954044] RIP: 0010:br_switchdev_fdb_notify+0x29/0x80 [ 69.954044] RSP: 0018:ffffc90001567918 EFLAGS: 00010246 [ 69.954044] RAX: 0000000000000000 RBX: ffff8800795e0880 RCX: 00000000000000c0 [ 69.954044] RDX: ffffc90001567920 RSI: 000000000000001c RDI: ffff8800795d0600 [ 69.954044] RBP: ffffc90001567938 R08: ffff8800795d0600 R09: 0000000000000000 [ 69.954044] R10: ffffc90001567a88 R11: ffff88007b849400 R12: ffff8800795e0880 [ 69.954044] R13: ffff8800795d0600 R14: ffffffff81ef8880 R15: 000000000000001c [ 69.954044] FS: 00007f93d3085700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 [ 69.954044] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 69.954044] CR2: 0000000000000008 CR3: 0000000066551000 CR4: 00000000000006e0 [ 69.954044] Call Trace: [ 69.954044] fdb_notify+0x3f/0xf0 [ 69.954044] __br_fdb_add.isra.12+0x1a7/0x370 [ 69.954044] br_fdb_add+0x178/0x280 [ 69.954044] rtnl_fdb_add+0x10a/0x200 [ 69.954044] rtnetlink_rcv_msg+0x1b4/0x240 [ 69.954044] ? skb_free_head+0x21/0x40 [ 69.954044] ? rtnl_calcit.isra.18+0xf0/0xf0 [ 69.954044] netlink_rcv_skb+0xed/0x120 [ 69.954044] rtnetlink_rcv+0x15/0x20 [ 69.954044] netlink_unicast+0x180/0x200 [ 69.954044] netlink_sendmsg+0x291/0x370 [ 69.954044] ___sys_sendmsg+0x180/0x2e0 [ 69.954044] ? filemap_map_pages+0x2db/0x370 [ 69.954044] ? do_wp_page+0x11d/0x420 [ 69.954044] ? __handle_mm_fault+0x794/0xd80 [ 69.954044] ? vma_link+0xcb/0xd0 [ 69.954044] __sys_sendmsg+0x4c/0x90 [ 69.954044] SyS_sendmsg+0x12/0x20 [ 69.954044] do_syscall_64+0x63/0xe0 [ 69.954044] entry_SYSCALL64_slow_path+0x25/0x25 [ 69.954044] RIP: 0033:0x7f93d2bad690 [ 69.954044] RSP: 002b:00007ffc7217a638 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 69.954044] RAX: ffffffffffffffda RBX: 00007ffc72182eac RCX: 00007f93d2bad690 [ 69.954044] RDX: 0000000000000000 RSI: 00007ffc7217a670 RDI: 0000000000000003 [ 69.954044] RBP: 0000000059a1f7f8 R08: 0000000000000006 R09: 000000000000000a [ 69.954044] R10: 00007ffc7217a400 R11: 0000000000000246 R12: 00007ffc7217a670 [ 69.954044] R13: 00007ffc72182a98 R14: 00000000006114c0 R15: 00007ffc72182aa0 [ 69.954044] Code: 1f 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 f6 47 20 04 74 0a 83 fe 1c 74 09 83 fe 1d 74 2c c9 66 90 c3 48 8b 47 10 48 8d 55 e8 <48> 8b 70 08 0f b7 47 1e 48 83 c7 18 48 89 7d f0 bf 03 00 00 00 [ 69.954044] RIP: br_switchdev_fdb_notify+0x29/0x80 RSP: ffffc90001567918 [ 69.954044] CR2: 0000000000000008 [ 69.954044] ---[ end trace 03e9eec4a82c238b ]--- Fixes: 6b26b51b1d13 ("net: bridge: Add support for notifying devices about FDB add/del") Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configsTejun Heo1-1/+5
When !NUMA, cpumask_of_node(@node) equals cpu_online_mask regardless of @node. The assumption seems that if !NUMA, there shouldn't be more than one node and thus reporting cpu_online_mask regardless of @node is correct. However, that assumption was broken years ago to support DISCONTIGMEM and whether a system has multiple nodes or not is separately controlled by NEED_MULTIPLE_NODES. This means that, on a system with !NUMA && NEED_MULTIPLE_NODES, cpumask_of_node() will report cpu_online_mask for all possible nodes, indicating that the CPUs are associated with multiple nodes which is an impossible configuration. This bug has been around forever but doesn't look like it has caused any noticeable symptoms. However, it triggers a WARN recently added to workqueue to verify NUMA affinity configuration. Fix it by reporting empty cpumask on non-zero nodes if !NUMA. Signed-off-by: Tejun Heo <[email protected]> Reported-and-tested-by: Geert Uytterhoeven <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2017-08-28ARCv2: SMP: Mask only private-per-core IRQ lines on boot at core intcAlexey Brodkin2-3/+10
Recent commit a8ec3ee861b6 "arc: Mask individual IRQ lines during core INTC init" breaks interrupt handling on ARCv2 SMP systems. That commit masked all interrupts at onset, as some controllers on some boards (customer as well as internal), would assert interrutps early before any handlers were installed. For SMP systems, the masking was done at each cpu's core-intc. Later, when the IRQ was actually requested, it was unmasked, but only on the requesting cpu. For "common" interrupts, which were wired up from the 2nd level IDU intc, this was as issue as they needed to be enabled on ALL the cpus (given that IDU IRQs are by default served Round Robin across cpus) So fix that by NOT masking "common" interrupts at core-intc, but instead at the 2nd level IDU intc (latter already being done in idu_of_init()) Fixes: a8ec3ee861b6 ("arc: Mask individual IRQ lines during core INTC init") Signed-off-by: Alexey Brodkin <[email protected]> [vgupta: reworked changelog, removed the extraneous idu_irq_mask_raw()] Signed-off-by: Vineet Gupta <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-28fs/select: Fix memory corruption in compat_get_fd_set()Helge Deller1-5/+1
Commit 464d62421cb8 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()") changed the calculation on how many bytes need to be zeroed when userspace handed over a NULL pointer for a fdset array in the select syscall. The calculation was changed in compat_get_fd_set() wrongly from memset(fdset, 0, ((nr + 1) & ~1)*sizeof(compat_ulong_t)); to memset(fdset, 0, ALIGN(nr, BITS_PER_LONG)); The ALIGN(nr, BITS_PER_LONG) calculates the number of _bits_ which need to be zeroed in the target fdset array (rounded up to the next full bits for an unsigned long). But the memset() call expects the number of _bytes_ to be zeroed. This leads to clearing more memory than wanted (on the stack area or even at kmalloc()ed memory areas) and to random kernel crashes as we have seen them on the parisc platform. The correct change should have been memset(fdset, 0, (ALIGN(nr, BITS_PER_LONG) / BITS_PER_LONG) * BYTES_PER_LONG); which is the same as can be archieved with a call to zero_fd_set(nr, fdset). Fixes: 464d62421cb8 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()" Acked-by:: Al Viro <[email protected]> Signed-off-by: Helge Deller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-28ipv6: set dst.obsolete when a cached route has expiredXin Long2-2/+5
Now it doesn't check for the cached route expiration in ipv6's dst_ops->check(), because it trusts dst_gc that would clean the cached route up when it's expired. The problem is in dst_gc, it would clean the cached route only when it's refcount is 1. If some other module (like xfrm) keeps holding it and the module only release it when dst_ops->check() fails. But without checking for the cached route expiration, .check() may always return true. Meanwhile, without releasing the cached route, dst_gc couldn't del it. It will cause this cached route never to expire. This patch is to set dst.obsolete with DST_OBSOLETE_KILL in .gc when it's expired, and check obsolete != DST_OBSOLETE_FORCE_CHK in .check. Note that this is even needed when ipv6 dst_gc timer is removed one day. It would set dst.obsolete in .redirect and .update_pmtu instead, and check for cached route expiration when getting it, just like what ipv4 route does. Reported-by: Jianlin Shi <[email protected]> Signed-off-by: Xin Long <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28ipv6: fix sparse warning on rt6i_nodeWei Wang4-7/+11
Commit c5cff8561d2d adds rcu grace period before freeing fib6_node. This generates a new sparse warning on rt->rt6i_node related code: net/ipv6/route.c:1394:30: error: incompatible types in comparison expression (different address spaces) ./include/net/ip6_fib.h:187:14: error: incompatible types in comparison expression (different address spaces) This commit adds "__rcu" tag for rt6i_node and makes sure corresponding rcu API is used for it. After this fix, sparse no longer generates the above warning. Fixes: c5cff8561d2d ("ipv6: add rcu grace period before freeing fib6_node") Signed-off-by: Wei Wang <[email protected]> Acked-by: Eric Dumazet <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28cxgb4: Fix stack out-of-bounds read due to wrong size to t4_record_mbox()Stefano Brivio1-3/+3
Passing commands for logging to t4_record_mbox() with size MBOX_LEN, when the actual command size is actually smaller, causes out-of-bounds stack accesses in t4_record_mbox() while copying command words here: for (i = 0; i < size / 8; i++) entry->cmd[i] = be64_to_cpu(cmd[i]); Up to 48 bytes from the stack are then leaked to debugfs. This happens whenever we send (and log) commands described by structs fw_sched_cmd (32 bytes leaked), fw_vi_rxmode_cmd (48), fw_hello_cmd (48), fw_bye_cmd (48), fw_initialize_cmd (48), fw_reset_cmd (48), fw_pfvf_cmd (32), fw_eq_eth_cmd (16), fw_eq_ctrl_cmd (32), fw_eq_ofld_cmd (32), fw_acl_mac_cmd(16), fw_rss_glb_config_cmd(32), fw_rss_vi_config_cmd(32), fw_devlog_cmd(32), fw_vi_enable_cmd(48), fw_port_cmd(32), fw_sched_cmd(32), fw_devlog_cmd(32). The cxgb4vf driver got this right instead. When we call t4_record_mbox() to log a command reply, a MBOX_LEN size can be used though, as get_mbox_rpl() will fill cmd_rpl up completely. Fixes: 7f080c3f2ff0 ("cxgb4: Add support to enable logging of firmware mailbox commands") Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28net: stmmac: sun8i: Remove the compatiblesMaxime Ripard1-8/+0
Since the bindings have been controversial, and we follow the DT stable ABI rule, we shouldn't let a driver with a DT binding that might change slip through in a stable release. Remove the compatibles to make sure the driver will not probe and no-one will start using the binding currently implemented. This commit will obviously need to be reverted in due time. Signed-off-by: Maxime Ripard <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28Merge branch 'nfp-flow-dissector-layer'David S. Miller2-86/+113
Pieter Jansen van Vuuren says: ==================== nfp: fix layer calculation and flow dissector use Previously when calculating the supported key layers MPLS, IPv4/6 TTL and TOS were not considered. Formerly flow dissectors were referenced without first checking that they are in use and correctly populated by TC. Additionally this patch set fixes the incorrect use of mask field for vlan matching. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-08-28nfp: remove incorrect mask check for vlan matchingPieter Jansen van Vuuren1-6/+2
Previously the vlan tci field was incorrectly exact matched. This patch fixes this by using the flow dissector to populate the vlan tci field. Fixes: 5571e8c9f241 ("nfp: extend flower matching capabilities") Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28nfp: fix supported key layers calculationPieter Jansen van Vuuren1-0/+19
Previously when calculating the supported key layers MPLS, IPv4/6 TTL and TOS were not considered. This patch checks that the TTL and TOS fields are masked out before offloading. Additionally this patch checks that MPLS packets are correctly handled, by not offloading them. Fixes: af9d842c1354 ("nfp: extend flower add flow offload") Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28nfp: fix unchecked flow dissector usePieter Jansen van Vuuren2-81/+93
Previously flow dissectors were referenced without first checking that they are in use and correctly populated by TC. This patch fixes this by checking each flow dissector key before referencing them. Fixes: 5571e8c9f241 ("nfp: extend flower matching capabilities") Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28Merge branch 'l2tp-tunnel-refs'David S. Miller3-72/+73
Guillaume Nault says: ==================== l2tp: fix some l2tp_tunnel_find() issues in l2tp_netlink Since l2tp_tunnel_find() doesn't take a reference on the tunnel it returns, its users are almost guaranteed to be racy. This series defines l2tp_tunnel_get() which can be used as a safe replacement, and converts some of l2tp_tunnel_find() users in the l2tp_netlink module. Other users often combine this issue with other more or less subtle races. They will be fixed incrementally in followup series. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-08-28l2tp: hold tunnel used while creating sessions with netlinkGuillaume Nault1-9/+12
Use l2tp_tunnel_get() to retrieve tunnel, so that it can't go away on us. Otherwise l2tp_tunnel_destruct() might release the last reference count concurrently, thus freeing the tunnel while we're using it. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28l2tp: hold tunnel while handling genl TUNNEL_GET commandsGuillaume Nault1-12/+15
Use l2tp_tunnel_get() instead of l2tp_tunnel_find() so that we get a reference on the tunnel, preventing l2tp_tunnel_destruct() from freeing it from under us. Also move l2tp_tunnel_get() below nlmsg_new() so that we only take the reference when needed. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28l2tp: hold tunnel while handling genl tunnel updatesGuillaume Nault1-2/+4
We need to make sure the tunnel is not going to be destroyed by l2tp_tunnel_destruct() concurrently. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28l2tp: hold tunnel while processing genl delete commandGuillaume Nault1-2/+4
l2tp_nl_cmd_tunnel_delete() needs to take a reference on the tunnel, to prevent it from being concurrently freed by l2tp_tunnel_destruct(). Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28l2tp: hold tunnel while looking up sessions in l2tp_netlinkGuillaume Nault3-47/+38
l2tp_tunnel_find() doesn't take a reference on the returned tunnel. Therefore, it's unsafe to use it because the returned tunnel can go away on us anytime. Fix this by defining l2tp_tunnel_get(), which works like l2tp_tunnel_find(), but takes a reference on the returned tunnel. Caller then has to drop this reference using l2tp_tunnel_dec_refcount(). As l2tp_tunnel_dec_refcount() needs to be moved to l2tp_core.h, let's simplify the patch and not move the L2TP_REFCNT_DEBUG part. This code has been broken (not even compiling) in May 2012 by commit a4ca44fa578c ("net: l2tp: Standardize logging styles") and fixed more than two years later by commit 29abe2fda54f ("l2tp: fix missing line continuation"). So it doesn't appear to be used by anyone. Same thing for l2tp_tunnel_free(); instead of moving it to l2tp_core.h, let's just simplify things and call kfree_rcu() directly in l2tp_tunnel_dec_refcount(). Extra assertions and debugging code provided by l2tp_tunnel_free() didn't help catching any of the reference counting and socket handling issues found while working on this series. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28l2tp: initialise session's refcount before making it reachableGuillaume Nault1-4/+2
Sessions must be fully initialised before calling l2tp_session_add_to_tunnel(). Otherwise, there's a short time frame where partially initialised sessions can be accessed by external users. Fixes: dbdbc73b4478 ("l2tp: fix duplicate session creation") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28net: mvpp2: fix the mac address used when using PPv2.2Antoine Tenart1-1/+1
The mac address is only retrieved from h/w when using PPv2.1. Otherwise the variable holding it is still checked and used if it contains a valid value. As the variable isn't initialized to an invalid mac address value, we end up with random mac addresses which can be the same for all the ports handled by this PPv2 driver. Fixes this by initializing the h/w mac address variable to {0}, which is an invalid mac address value. This way the random assignation fallback is called and all ports end up with their own addresses. Signed-off-by: Antoine Tenart <[email protected]> Fixes: 2697582144dd ("net: mvpp2: handle misc PPv2.1/PPv2.2 differences") Signed-off-by: David S. Miller <[email protected]>
2017-08-28cdc_ncm: flag the u-blox TOBY-L4 as wwanAleksander Morgado1-0/+7
The u-blox TOBY-L4 is a LTE Advanced (Cat 6) module with HSPA+ and 2G fallback. Unlike the TOBY-L2, this module has one single USB layout and exposes several TTYs for control and a NCM interface for data. Connecting this module may be done just by activating the desired PDP context with 'AT+CGACT=1,<cid>' and then running DHCP on the NCM interface. Signed-off-by: Aleksander Morgado <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28net: missing call of trace_napi_poll in busy_poll_stopJesper Dangaard Brouer1-0/+1
Noticed that busy_poll_stop() also invoke the drivers napi->poll() function pointer, but didn't have an associated call to trace_napi_poll() like all other call sites. Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-28Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreamingLinus Torvalds8-27/+17
Pull c6x tweaks from Mark Salter. * tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming: c6x: Convert to using %pOF instead of full_name c6x: defconfig: Cleanup from old Kconfig options
2017-08-28Input: synaptics - fix device info appearing different on reconnectAnthony Martin1-15/+20
User-modified input settings no longer survive a suspend/resume cycle. Starting with 4.12, the touchpad is reinitialized on every reconnect because the hardware appears to be different. This can be reproduced by running the following as root: echo -n reconnect >/sys/devices/platform/i8042/serio1/drvctl A line like the following will show up in dmesg: [30378.295794] psmouse serio1: synaptics: hardware appears to be different: id(149271-149271), model(114865-114865), caps(d047b3-d047b1), ext(b40000-b40000). Note the single bit difference in caps: bit 1 (SYN_CAP_MULTIFINGER). This happens because we modify our stored copy of the device info capabilities when we enable advanced gesture mode but this change is not reflected in the actual hardware capabilities. It worked in the past because synaptics_query_hardware used to modify the stored synaptics_device_info struct instead of filling in a new one, as it does now. Fix it by no longer faking the SYN_CAP_MULTIFINGER bit when setting advanced gesture mode. This necessitated a small refactoring. Fixes: 6c53694fb222 ("Input: synaptics - split device info into a separate structure") Signed-off-by: Anthony Martin <[email protected]> Cc: [email protected] Signed-off-by: Dmitry Torokhov <[email protected]>
2017-08-28libata: quirk read log on no-name M.2 SSDChristoph Hellwig2-0/+5
Ido reported that reading the log page on his systems fails, so quirk it as it won't support ZBC or security protocols. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Ido Schimmel <[email protected]> Tested-by: Ido Schimmel <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2017-08-28libnvdimm: clean up command definitionsDan Williams1-37/+0
Remove the command payloads that do not have an associated libnvdimm ioctl. I.e. remove the payloads that would only ever be carried in the ND_CMD_CALL envelope. This prevents userspace from growing unnecessary dependencies on this kernel header when userspace already has everything it needs to craft and send these commands. Cc: Jerry Hoemann <[email protected]> Reported-by: Yasunori Goto <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2017-08-28dm mpath: do not lock up a CPU with requeuing activityBart Van Assche1-1/+0
When using the block layer in single queue mode, get_request() returns ERR_PTR(-EAGAIN) if the queue is dying and the REQ_NOWAIT flag has been passed to get_request(). Avoid that the kernel reports soft lockup complaints in this case due to continuous requeuing activity. Fixes: 7083abbbf ("dm mpath: avoid that path removal can trigger an infinite loop") Cc: [email protected] Signed-off-by: Bart Van Assche <[email protected]> Tested-by: Laurence Oberman <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-08-28dm: fix printk() rate limiting codeBart Van Assche2-39/+12
Using the same rate limiting state for different kinds of messages is wrong because this can cause a high frequency message to suppress a report of a low frequency message. Hence use a unique rate limiting state per message type. Fixes: 71a16736a15e ("dm: use local printk ratelimit") Cc: [email protected] Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-08-28dm mpath: retry BLK_STS_RESOURCE errorsBart Van Assche1-1/+0
Retry requests instead of failing them if an out-of-memory error occurs or the block driver below dm-mpath is busy. This restores the v4.12 behavior of noretry_error(), namely that -ENOMEM results in a retry. Fixes: 2a842acab109 ("block: introduce new block status code type") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-08-28dm: fix the second dec_pending() argument in __split_and_process_bio()Bart Van Assche1-1/+1
Detected by sparse. Fixes: 4e4cbee93d56 ("block: switch bios to blk_status_t") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Laurence Oberman <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-08-28arm: dts: sunxi: Revert EMAC changesMaxime Ripard10-128/+0
Since the discussion is not settled yet for the EMAC, and that the release in getting really close, let's revert the changes for now, and we'll reintroduce them later. Acked-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Maxime Ripard <[email protected]>
2017-08-28arm64: dts: allwinner: Revert EMAC changesMaxime Ripard8-135/+0
Since the discussion is not settled yet for the EMAC, and that the release in getting really close, let's revert the changes for now, and we'll reintroduce them later. Acked-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Maxime Ripard <[email protected]>
2017-08-28dt-bindings: net: Revert sun8i dwmac bindingMaxime Ripard1-84/+0
This binding still doesn't please everyone, and we're getting far too close from the release to allow it to reach a stable version. Let's remove it until the discussion settles down. Acked-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Maxime Ripard <[email protected]>
2017-08-28xfrm_user: fix info leak in build_aevent()Mathias Krause1-0/+1
The memory reserved to dump the ID of the xfrm state includes a padding byte in struct xfrm_usersa_id added by the compiler for alignment. To prevent the heap info leak, memset(0) the sa_id before filling it. Cc: Jamal Hadi Salim <[email protected]> Fixes: d51d081d6504 ("[IPSEC]: Sync series - user") Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2017-08-28xfrm_user: fix info leak in build_expire()Mathias Krause1-0/+2
The memory reserved to dump the expired xfrm state includes padding bytes in struct xfrm_user_expire added by the compiler for alignment. To prevent the heap info leak, memset(0) the remainder of the struct. Initializing the whole structure isn't needed as copy_to_user_state() already takes care of clearing the padding bytes within the 'state' member. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2017-08-28xfrm_user: fix info leak in xfrm_notify_sa()Mathias Krause1-0/+1
The memory reserved to dump the ID of the xfrm state includes a padding byte in struct xfrm_usersa_id added by the compiler for alignment. To prevent the heap info leak, memset(0) the whole struct before filling it. Cc: Herbert Xu <[email protected]> Fixes: 0603eac0d6b7 ("[IPSEC]: Add XFRMA_SA/XFRMA_POLICY for delete notification") Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2017-08-28xfrm_user: fix info leak in copy_user_offload()Mathias Krause1-1/+1
The memory reserved to dump the xfrm offload state includes padding bytes of struct xfrm_user_offload added by the compiler for alignment. Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Cc: Steffen Klassert <[email protected]> Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2017-08-27Linux 4.13-rc7Linus Torvalds1-1/+1
2017-08-27Merge tag 'iommu-fixes-v4.13-rc6' of ↵Linus Torvalds4-15/+37
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "Another fix, this time in common IOMMU sysfs code. In the conversion from the old iommu sysfs-code to the iommu_device_register interface, I missed to update the release path for the struct device associated with an IOMMU. It freed the 'struct device', which was a pointer before, but is now embedded in another struct. Freeing from the middle of allocated memory had all kinds of nasty side effects when an IOMMU was unplugged. Unfortunatly nobody unplugged and IOMMU until now, so this was not discovered earlier. The fix is to make the 'struct device' a pointer again" * tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix wrong freeing of iommu_device->dev
2017-08-27Merge tag 'char-misc-4.13-rc7' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fix from Greg KH: "Here is a single misc driver fix for 4.13-rc7. It resolves a reported problem in the Android binder driver due to previous patches in 4.13-rc. It's been in linux-next with no reported issues" * tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: ANDROID: binder: fix proc->tsk check.
2017-08-27Merge tag 'staging-4.13-rc7' of ↵Linus Torvalds12-46/+107
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/iio fixes from Greg KH: "Here are few small staging driver fixes, and some more IIO driver fixes for 4.13-rc7. Nothing major, just resolutions for some reported problems. All of these have been in linux-next with no reported problems" * tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: magnetometer: st_magn: remove ihl property for LSM303AGR iio: magnetometer: st_magn: fix status register address for LSM303AGR iio: hid-sensor-trigger: Fix the race with user space powering up sensors iio: trigger: stm32-timer: fix get trigger mode iio: imu: adis16480: Fix acceleration scale factor for adis16480 PATCH] iio: Fix some documentation warnings staging: rtl8188eu: add RNX-N150NUB support Revert "staging: fsl-mc: be consistent when checking strcmp() return" iio: adc: stm32: fix common clock rate iio: adc: ina219: Avoid underflow for sleeping time iio: trigger: stm32-timer: add enable attribute iio: trigger: stm32-timer: fix get/set down count direction iio: trigger: stm32-timer: fix write_raw return value iio: trigger: stm32-timer: fix quadrature mode get routine iio: bmp280: properly initialize device for humidity reading
2017-08-27Merge tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntbLinus Torvalds3-5/+7
Pull NTB fixes from Jon Mason: "NTB bug fixes to address an incorrect ntb_mw_count reference in the NTB transport, improperly bringing down the link if SPADs are corrupted, and an out-of-order issue regarding link negotiation and data passing" * tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb: ntb: ntb_test: ensure the link is up before trying to configure the mws ntb: transport shouldn't disable link due to bogus values in SPADs ntb: use correct mw_count function in ntb_tool and ntb_transport
2017-08-27Avoid page waitqueue race leaving possible page locker waitingLinus Torvalds1-4/+5
The "lock_page_killable()" function waits for exclusive access to the page lock bit using the WQ_FLAG_EXCLUSIVE bit in the waitqueue entry set. That means that if it gets woken up, other waiters may have been skipped. That, in turn, means that if it sees the page being unlocked, it *must* take that lock and return success, even if a lethal signal is also pending. So instead of checking for lethal signals first, we need to check for them after we've checked the actual bit that we were waiting for. Even if that might then delay the killing of the process. This matches the order of the old "wait_on_bit_lock()" infrastructure that the page locking used to use (and is still used in a few other areas). Note that if we still return an error after having unsuccessfully tried to acquire the page lock, that is ok: that means that some other thread was able to get ahead of us and lock the page, and when that other thread then unlocks the page, the wakeup event will be repeated. So any other pending waiters will now get properly woken up. Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Cc: Nick Piggin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Jan Kara <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-27Minor page waitqueue cleanupsLinus Torvalds2-8/+10
Tim Chen and Kan Liang have been battling a customer load that shows extremely long page wakeup lists. The cause seems to be constant NUMA migration of a hot page that is shared across a lot of threads, but the actual root cause for the exact behavior has not been found. Tim has a patch that batches the wait list traversal at wakeup time, so that we at least don't get long uninterruptible cases where we traverse and wake up thousands of processes and get nasty latency spikes. That is likely 4.14 material, but we're still discussing the page waitqueue specific parts of it. In the meantime, I've tried to look at making the page wait queues less expensive, and failing miserably. If you have thousands of threads waiting for the same page, it will be painful. We'll need to try to figure out the NUMA balancing issue some day, in addition to avoiding the excessive spinlock hold times. That said, having tried to rewrite the page wait queues, I can at least fix up some of the braindamage in the current situation. In particular: (a) we don't want to continue walking the page wait list if the bit we're waiting for already got set again (which seems to be one of the patterns of the bad load). That makes no progress and just causes pointless cache pollution chasing the pointers. (b) we don't want to put the non-locking waiters always on the front of the queue, and the locking waiters always on the back. Not only is that unfair, it means that we wake up thousands of reading threads that will just end up being blocked by the writer later anyway. Also add a comment about the layout of 'struct wait_page_key' - there is an external user of it in the cachefiles code that means that it has to match the layout of 'struct wait_bit_key' in the two first members. It so happens to match, because 'struct page *' and 'unsigned long *' end up having the same values simply because the page flags are the first member in struct page. Cc: Tim Chen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Christopher Lameter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-27Clarify (and fix) MAX_LFS_FILESIZE macrosLinus Torvalds1-2/+2
We have a MAX_LFS_FILESIZE macro that is meant to be filled in by filesystems (and other IO targets) that know they are 64-bit clean and don't have any 32-bit limits in their IO path. It turns out that our 32-bit value for that limit was bogus. On 32-bit, the VM layer is limited by the page cache to only 32-bit index values, but our logic for that was confusing and actually wrong. We used to define that value to (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1) which is actually odd in several ways: it limits the index to 31 bits, and then it limits files so that they can't have data in that last byte of a page that has the highest 31-bit index (ie page index 0x7fffffff). Neither of those limitations make sense. The index is actually the full 32 bit unsigned value, and we can use that whole full page. So the maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG". However, we do wan tto avoid the maximum index, because we have code that iterates over the page indexes, and we don't want that code to overflow. So the maximum size of a file on a 32-bit host should actually be one page less than the full 32-bit index. So the actual limit is ULONG_MAX << PAGE_SHIFT. That means that we will not actually be using the page of that last index (ULONG_MAX), but we can grow a file up to that limit. The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5 volume. It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one byte less), but the actual true VM limit is one page less than 16TiB. This was invisible until commit c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()"), which started applying that MAX_LFS_FILESIZE limit to block devices too. NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is actually just the offset type itself (loff_t), which is signed. But for clarity, on 64-bit, just use the maximum signed value, and don't make people have to count the number of 'f' characters in the hex constant. So just use LLONG_MAX for the 64-bit case. That was what the value had been before too, just written out as a hex constant. Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") Reported-and-tested-by: Doug Nazar <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Dave Kleikamp <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2017-08-26Merge branch 'for-linus' of ↵Linus Torvalds6-13/+45
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a tweak to the IBM Trackpoint driver that helps recognizing trackpoints on never Lenovo Carbons - a fix to the ALPS driver solving scroll issues on some Dells - yet another ACPI ID has been added to Elan I2C toucpad driver - quieted diagnostic message in soc_button_array driver * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365 Input: trackpoint - add new trackpoint firmware ID Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
2017-08-26Merge tag 'pci-v4.13-fixes-3' of ↵Linus Torvalds1-10/+3
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Remove needlessly alarming MSI affinity warning (this is not actually a bug fix, but the warning prompts unnecessary bug reports)" * tag 'pci-v4.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL
2017-08-26Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2-4/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: one for an ldt_struct handling bug and a cherry-picked objtool fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix use-after-free of ldt_struct objtool: Fix '-mtune=atom' decoding support in objtool 2.0
2017-08-26Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-9/+41
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a timer granularity handling race+bug, which would manifest itself by spuriously increasing timeouts of some timers (from 1 jiffy to ~500 jiffies in the worst case measured) in certain nohz states" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Fix excessive granularity of new timers after a nohz idle
2017-08-26Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-20/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "A single fix to not allow nonsensical event groups that result in kernel warnings" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix group {cpu,task} validation