aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-09-23tcp: fix wrong checksum calculation on MTU probingDouglas Caetano dos Santos1-5/+7
With TCP MTU probing enabled and offload TX checksumming disabled, tcp_mtu_probe() calculated the wrong checksum when a fragment being copied into the probe's SKB had an odd length. This was caused by the direct use of skb_copy_and_csum_bits() to calculate the checksum, as it pads the fragment being copied, if needed. When this fragment was not the last, a subsequent call used the previous checksum without considering this padding. The effect was a stale connection in one way, as even retransmissions wouldn't solve the problem, because the checksum was never recalculated for the full SKB length. Signed-off-by: Douglas Caetano dos Santos <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-23Merge tag 'linux-can-fixes-for-4.8-20160922' of ↵David S. Miller2-11/+19
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-09-22 this is a pull request of one patch for the upcoming linux-4.8 release. The patch by Sergei Miroshnichenko fixes a potential deadlock in the generic CAN device code that cann occour after a bus-off. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-23sch_sfb: keep backlog updated with qlenWANG Cong1-0/+3
Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-23sch_qfq: keep backlog updated with qlenWANG Cong1-0/+3
Reported-by: Stas Nichiporovich <[email protected]> Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-22Merge tag 'media/v4.8-7' of ↵Linus Torvalds22-78/+212
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - several fixes for new drivers added for Kernel 4.8 addition (cec core, pulse8 cec driver and Mediatek vcodec) - a regression fix for cx23885 and saa7134 drivers - an important fix for rcar-fcp, making rcar_fcp_enable() return 0 on success * tag 'media/v4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (25 commits) [media] cx23885/saa7134: assign q->dev to the PCI device [media] rcar-fcp: Make sure rcar_fcp_enable() returns 0 on success [media] cec: fix ioctl return code when not registered [media] cec: don't Feature Abort broadcast msgs when unregistered [media] vcodec:mediatek: Refine VP8 encoder driver [media] vcodec:mediatek: Refine H264 encoder driver [media] vcodec:mediatek: change H264 profile default to profile high [media] vcodec:mediatek: Add timestamp and timecode copy for V4L2 Encoder [media] vcodec:mediatek: Fix visible_height larger than coded_height issue in s_fmt_out [media] vcodec:mediatek: Fix fops_vcodec_release flow for V4L2 Encoder [media] vcodec:mediatek:code refine for v4l2 Encoder driver [media] cec-funcs.h: add missing vendor-specific messages [media] cec-edid: check for IEEE identifier [media] pulse8-cec: fix error handling [media] pulse8-cec: set correct Signal Free Time [media] mtk-vcodec: add HAS_DMA dependency [media] cec: ignore messages when log_addr_mask == 0 [media] cec: add item to TODO [media] cec: set unclaimed addresses to CEC_LOG_ADDR_INVALID [media] cec: add CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK flag ...
2016-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds65-343/+658
Pull networking fixes from David Miller: "Mostly small bits scattered all over the place, which is usually how things go this late in the -rc series. 1) Proper driver init device resets in bnx2, from Baoquan He. 2) Fix accounting overflow in __tcp_retransmit_skb(), sk_forward_alloc, and ip_idents_reserve, from Eric Dumazet. 3) Fix crash in bna driver ethtool stats handling, from Ivan Vecera. 4) Missing check of skb_linearize() return value in mac80211, from Johannes Berg. 5) Endianness fix in nf_table_trace dumps, from Liping Zhang. 6) SSN comparison fix in SCTP, from Marcelo Ricardo Leitner. 7) Update DSA and b44 MAINTAINERS entries. 8) Make input path of vti6 driver work again, from Nicolas Dichtel. 9) Off-by-one in mlx4, from Sebastian Ott. 10) Fix fallback route lookup handling in ipv6, from Vincent Bernat. 11) Fix stack corruption on probe in qed driver, from Yuval Mintz. 12) PHY init fixes in r8152 from Hayes Wang. 13) Missing SKB free in irda_accept error path, from Phil Turnbull" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) tcp: properly account Fast Open SYN-ACK retrans tcp: fix under-accounting retransmit SNMP counters MAINTAINERS: Update b44 maintainer. net: get rid of an signed integer overflow in ip_idents_reserve() net/mlx4_core: Fix to clean devlink resources net: can: ifi: Configure transmitter delay vti6: fix input path ipmr, ip6mr: return lastuse relative to now r8152: disable ALDPS and EEE before setting PHY r8152: remove r8153_enable_eee r8152: move PHY settings to hw_phy_cfg r8152: move enabling PHY r8152: move some functions cxgb4/cxgb4vf: Allocate more queues for 25G and 100G adapter qed: Fix stack corruption on probe MAINTAINERS: Add an entry for the core network DSA code net: ipv6: fallback to full lookup if table lookup is unsuitable net/mlx5: E-Switch, Handle mode change failures net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code net/mlx5: Fix flow counter bulk command out mailbox allocation ...
2016-09-22can: dev: fix deadlock reported after bus-offSergei Miroshnichenko2-11/+19
A timer was used to restart after the bus-off state, leading to a relatively large can_restart() executed in an interrupt context, which in turn sets up pinctrl. When this happens during system boot, there is a high probability of grabbing the pinctrl_list_mutex, which is locked already by the probe() of other device, making the kernel suspect a deadlock condition [1]. To resolve this issue, the restart_timer is replaced by a delayed work. [1] https://github.com/victronenergy/venus/issues/24 Signed-off-by: Sergei Miroshnichenko <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2016-09-22tcp: properly account Fast Open SYN-ACK retransYuchung Cheng3-1/+4
Since the TFO socket is accepted right off SYN-data, the socket owner can call getsockopt(TCP_INFO) to collect ongoing SYN-ACK retransmission or timeout stats (i.e., tcpi_total_retrans, tcpi_retransmits). Currently those stats are only updated upon handshake completes. This patch fixes it. Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-22tcp: fix under-accounting retransmit SNMP countersYuchung Cheng1-1/+1
This patch fixes these under-accounting SNMP rtx stats LINUX_MIB_TCPFORWARDRETRANS LINUX_MIB_TCPFASTRETRANS LINUX_MIB_TCPSLOWSTARTRETRANS when retransmitting TSO packets Fixes: 10d3be569243 ("tcp-tso: do not split TSO packets at retransmit time") Signed-off-by: Yuchung Cheng <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-22Merge branch 'master' of ↵David S. Miller7-15/+51
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2016-09-21 1) Propagate errors on security context allocation. From Mathias Krause. 2) Fix inbound policy checks for inter address family tunnels. From Thomas Zeitlhofer. 3) Fix an old memory leak on aead algorithm usage. From Ilan Tayari. 4) A recent patch fixed a possible NULL pointer dereference but broke the vti6 input path. Fix from Nicolas Dichtel. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-22Merge tag 'linux-can-fixes-for-4.8-20160921' of ↵David S. Miller1-1/+10
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-09-21 this is another pull request of one patch for the upcoming linux-4.8 release. Marek Vasut fixes the CAN-FD bit rate switch in the ifi driver by configuring the transmitter delay. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-22MAINTAINERS: Update b44 maintainer.Michael Chan1-1/+1
Taking over as maintainer since Gary Zambrano is no longer working for Broadcom. Signed-off-by: Michael Chan <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-22net: get rid of an signed integer overflow in ip_idents_reserve()Eric Dumazet1-2/+8
Jiri Pirko reported an UBSAN warning happening in ip_idents_reserve() [] UBSAN: Undefined behaviour in ./arch/x86/include/asm/atomic.h:156:11 [] signed integer overflow: [] -2117905507 + -695755206 cannot be represented in type 'int' Since we do not have uatomic_add_return() yet, use atomic_cmpxchg() so that the arithmetics can be done using unsigned int. Fixes: 04ca6973f7c1 ("ip: make IP identifiers less predictable") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-22net/mlx4_core: Fix to clean devlink resourcesKamal Heib1-0/+3
This patch cleans devlink resources by calling devlink_port_unregister() to avoid the following issues: - Kernel panic when triggering reset flow. - Memory leak due to unfreed resources in mlx4_init_port_info(). Fixes: 09d4d087cd48 ("mlx4: Implement devlink interface") Signed-off-by: Kamal Heib <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21Merge tag 'wireless-drivers-for-davem-2016-09-20' of ↵David S. Miller1-10/+9
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.8 iwlwifi * fix to prevent firmware crash when sending off-channel frames ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-21net: can: ifi: Configure transmitter delayMarek Vasut1-1/+10
Configure the transmitter delay register at +0x1c to correctly handle the CAN FD bitrate switch (BRS). This moves the SSP (secondary sample point) to a proper offset, so that the TDC mechanism works and won't generate error frames on the CAN link. Signed-off-by: Marek Vasut <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Oliver Hartkopp <[email protected]> Cc: Wolfgang Grandegger <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2016-09-21vti6: fix input pathNicolas Dichtel4-10/+16
Since commit 1625f4529957, vti6 is broken, all input packets are dropped (LINUX_MIB_XFRMINNOSTATES is incremented). XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in xfrm6_rcv_spi(). A new function xfrm6_rcv_tnl() that enables to pass a value to xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function is used in several handlers). CC: Alexey Kodanev <[email protected]> Fixes: 1625f4529957 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2016-09-21ipmr, ip6mr: return lastuse relative to nowNikolay Aleksandrov2-4/+10
When I introduced the lastuse member I made a subtle error because it was returned as an absolute value but that is meaningless to user-space as it doesn't allow to see how old exactly an entry is. Let's make it similar to how the bridge returns such values and make it relative to "now" (jiffies). This allows us to show the actual age of the entries and is much more useful (e.g. user-space daemons can age out entries, iproute2 can display the lastuse properly). Fixes: 43b9e1274060 ("net: ipmr/ip6mr: add support for keeping an entry age") Reported-by: Satish Ashok <[email protected]> Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21Merge branch 'r8152-phy-fixes'David S. Miller1-135/+146
Hayes Wang says: ==================== r8152: correct the flow of PHY First, to enable the PHY as early as possible. Some settings may fail if the PHY is power down. Move the other PHY settings to hw_phy_cfg() to make sure the order is correct. Finally, disable ALDPS and EEE before updating the PHY for RTL8153. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-21r8152: disable ALDPS and EEE before setting PHYhayeswang1-2/+8
Disable ALDPS and EEE to avoid the possible failure when setting the PHY. Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21r8152: remove r8153_enable_eeehayeswang1-7/+3
Remove r8153_enable_eee(). Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21r8152: move PHY settings to hw_phy_cfghayeswang1-6/+8
Move the PHY relative settings together to hw_phy_cfg(). Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21r8152: move enabling PHYhayeswang1-18/+25
Move enabling PHY to init(), otherwise some other settings may fail. Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21r8152: move some functionshayeswang1-112/+112
Move the following functions forward. r8152_mmd_indirect() r8152_mmd_read() r8152_mmd_write() r8152_eee_en() r8152b_enable_eee() r8153_eee_en() r8153_enable_eee() r8152b_enable_fc() r8153_aldps_en() Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-21cxgb4/cxgb4vf: Allocate more queues for 25G and 100G adapterHariprasad Shenai6-11/+45
We were missing check for 25G and 100G while checking port speed, which lead to less number of queues getting allocated for 25G & 100G adapters and leading to low throughput. Adding the missing check for both NIC and vNIC driver. Also fixes port advertisement for 25G and 100G in ethtool output. Signed-off-by: Hariprasad Shenai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-20Merge tag 'linux-can-fixes-for-4.8-20160919' of ↵David S. Miller1-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-09-19 this is a pull request of one patch for the upcoming linux-4.8 release. The patch by Fabio Estevam fixes the pm handling in the flexcan driver. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-20Merge tag 'usercopy-v4.8-rc8' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull usercopy hardening fix from Kees Cook: "Expand the arm64 vmalloc check to include skipping the module space too" * tag 'usercopy-v4.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: mm: usercopy: Check for module addresses
2016-09-20fix fault_in_multipages_...() on architectures with no-op access_ok()Al Viro1-19/+19
Switching iov_iter fault-in to multipages variants has exposed an old bug in underlying fault_in_multipages_...(); they break if the range passed to them wraps around. Normally access_ok() done by callers will prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into such a range and they should not point to any valid objects). However, on architectures where userland and kernel live in different MMU contexts (e.g. s390) access_ok() is a no-op and on those a range with a wraparound can reach fault_in_multipages_...(). Since any wraparound means EFAULT there, the fix is trivial - turn those while (uaddr <= end) ... into if (unlikely(uaddr > end)) return -EFAULT; do ... while (uaddr <= end); Reported-by: Jan Stancek <[email protected]> Tested-by: Jan Stancek <[email protected]> Cc: [email protected] # v3.5+ Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-20mm: usercopy: Check for module addressesLaura Abbott1-1/+4
While running a compile on arm64, I hit a memory exposure usercopy: kernel memory exposure attempt detected from fffffc0000f3b1a8 (buffer_head) (1 bytes) ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:75! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_security ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle iptable_security iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ebtable_filter ebtables ip6table_filter ip6_tables vfat fat xgene_edac xgene_enet edac_core i2c_xgene_slimpro i2c_core at803x realtek xgene_dma mdio_xgene gpio_dwapb gpio_xgene_sb xgene_rng mailbox_xgene_slimpro nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sdhci_of_arasan sdhci_pltfm sdhci mmc_core xhci_plat_hcd gpio_keys CPU: 0 PID: 19744 Comm: updatedb Tainted: G W 4.8.0-rc3-threadinfo+ #1 Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016 task: fffffe03df944c00 task.stack: fffffe00d128c000 PC is at __check_object_size+0x70/0x3f0 LR is at __check_object_size+0x70/0x3f0 ... [<fffffc00082b4280>] __check_object_size+0x70/0x3f0 [<fffffc00082cdc30>] filldir64+0x158/0x1a0 [<fffffc0000f327e8>] __fat_readdir+0x4a0/0x558 [fat] [<fffffc0000f328d4>] fat_readdir+0x34/0x40 [fat] [<fffffc00082cd8f8>] iterate_dir+0x190/0x1e0 [<fffffc00082cde58>] SyS_getdents64+0x88/0x120 [<fffffc0008082c70>] el0_svc_naked+0x24/0x28 fffffc0000f3b1a8 is a module address. Modules may have compiled in strings which could get copied to userspace. In this instance, it looks like "." which matches with a size of 1 byte. Extend the is_vmalloc_addr check to be is_vmalloc_or_module_addr to cover all possible cases. Signed-off-by: Laura Abbott <[email protected]> Signed-off-by: Kees Cook <[email protected]>
2016-09-20fs/proc/kcore.c: Add bounce buffer for ktext dataJiri Olsa1-1/+6
We hit hardened usercopy feature check for kernel text access by reading kcore file: usercopy: kernel memory exposure attempt detected from ffffffff8179a01f (<kernel text>) (4065 bytes) kernel BUG at mm/usercopy.c:75! Bypassing this check for kcore by adding bounce buffer for ktext data. Reported-by: Steve Best <[email protected]> Fixes: f5509cc18daa ("mm: Hardened usercopy") Suggested-by: Kees Cook <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-20fs/proc/kcore.c: Make bounce buffer global for readJiri Olsa1-10/+14
Next patch adds bounce buffer for ktext area, so it's convenient to have single bounce buffer for both vmalloc/module and ktext cases. Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-20qed: Fix stack corruption on probeYuval Mintz1-2/+2
Commit fe56b9e6a8d95 ("qed: Add module with basic common support") has introduced a stack corruption during probe, where filling a local struct with data to be sent to management firmware is incorrectly filled; The data is written outside of the struct and corrupts the stack. Changes from v1: ---------------- - Correct the value written [Caught by David Laight] Fixes: fe56b9e6a8d95 ("qed: Add module with basic common support") Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-20MAINTAINERS: Add an entry for the core network DSA codeAndrew Lunn1-0/+9
The core distributed switch architecture code currently does not have a MAINTAINERS entry, which results in some contributions not landing in the right peoples inbox. Signed-off-by: Andrew Lunn <[email protected]> Acked-by: Florian Fainelli <[email protected]> Acked-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-20net: ipv6: fallback to full lookup if table lookup is unsuitableVincent Bernat1-1/+10
Commit 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups") introduced a regression: insertion of an IPv6 route in a table not containing the appropriate connected route for the gateway but which contained a non-connected route (like a default gateway) fails while it was previously working: $ ip link add eth0 type dummy $ ip link set up dev eth0 $ ip addr add 2001:db8::1/64 dev eth0 $ ip route add ::/0 via 2001:db8::5 dev eth0 table 20 $ ip route add 2001:db8:cafe::1/128 via 2001:db8::6 dev eth0 table 20 RTNETLINK answers: No route to host $ ip -6 route show table 20 default via 2001:db8::5 dev eth0 metric 1024 pref medium After this patch, we get: $ ip route add 2001:db8:cafe::1/128 via 2001:db8::6 dev eth0 table 20 $ ip -6 route show table 20 2001:db8:cafe::1 via 2001:db8::6 dev eth0 metric 1024 pref medium default via 2001:db8::5 dev eth0 metric 1024 pref medium Fixes: 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups") Signed-off-by: Vincent Bernat <[email protected]> Acked-by: David Ahern <[email protected]> Tested-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-19Merge branch 'mlx5-fixes'David S. Miller3-8/+17
Or Gerlitz says: ==================== mlx5 fixes to 4.8-rc6 This series series has a fix from Roi to memory corruption bug in the bulk flow counters code and two late and hopefully last fixes from me to the new eswitch offloads code. Series done over net commit 37dd348 "bna: fix crash in bnad_get_strings()" ==================== Signed-off-by: David S. Miller <[email protected]>
2016-09-19net/mlx5: E-Switch, Handle mode change failuresOr Gerlitz1-6/+14
E-switch mode changes involve creating HW tables, potentially allocating netdevices, etc, and things can fail. Add an attempt to rollback to the existing mode when changing to the new mode fails. Only if rollback fails, getting proper SRIOV functionality requires module unload or sriov disablement/enablement. Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-19net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init codeOr Gerlitz1-0/+1
When enablement of the SRIOV e-switch in certain mode (switchdev or legacy) fails, we must set the mode to none. Otherwise, we'll run into double free based crashes when further attempting to deal with the e-switch (such as when disabling sriov or unloading the driver). Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-19net/mlx5: Fix flow counter bulk command out mailbox allocationRoi Dayan1-2/+2
The FW command output length should be only the length of struct mlx5_cmd_fc_bulk out field. Failing to do so will cause the memcpy call which is invoked later in the driver to write over wrong memory address and corrupt kernel memory which results in random crashes. This bug was found using the kernel address sanitizer (kasan). Fixes: a351a1b03bf1 ('net/mlx5: Introduce bulk reading of flow counters') Signed-off-by: Roi Dayan <[email protected]> Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-09-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds22-146/+240
Merge fixes from Andrew Morton: "20 fixes" * emailed patches from Andrew Morton <[email protected]>: rapidio/rio_cm: avoid GFP_KERNEL in atomic context Revert "ocfs2: bump up o2cb network protocol version" ocfs2: fix start offset to ocfs2_zero_range_for_truncate() cgroup: duplicate cgroup reference when cloning sockets mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting ocfs2: fix double unlock in case retry after free truncate log fanotify: fix list corruption in fanotify_get_response() fsnotify: add a way to stop queueing events on group shutdown ocfs2: fix trans extend while free cached blocks ocfs2: fix trans extend while flush truncate log ipc/shm: fix crash if CONFIG_SHMEM is not set mm: fix the page_swap_info() BUG_ON check autofs: use dentry flags to block walks during expire MAINTAINERS: update email for VLYNQ bus entry mm: avoid endless recursion in dump_page() mm, thp: fix leaking mapped pte in __collapse_huge_page_swapin() khugepaged: fix use-after-free in collapse_huge_page() MAINTAINERS: Maik has moved ocfs2/dlm: fix race between convert and migration mem-hotplug: don't clear the only node in new_node_page()
2016-09-19rapidio/rio_cm: avoid GFP_KERNEL in atomic contextAlexandre Bounine1-3/+16
As reported by Alexey Khoroshilov (https://lkml.org/lkml/2016/9/9/737): riocm_send_close() is called from rio_cm_shutdown() under spin_lock_bh(idr_lock), but riocm_send_close() uses a GFP_KERNEL allocation. Fix by taking riocm_send_close() outside of spinlock protected code. [[email protected]: remove unneeded `if (!list_empty())'] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Alexandre Bounine <[email protected]> Reported-by: Alexey Khoroshilov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19Revert "ocfs2: bump up o2cb network protocol version"Junxiao Bi1-4/+1
This reverts commit 38b52efd218b ("ocfs2: bump up o2cb network protocol version"). This commit made rolling upgrade fail. When one node is upgraded to new version with this commit, the remaining nodes will fail to establish connections to it, then the application like VMs on the remaining nodes can't be live migrated to the upgraded one. This will cause an outage. Since negotiate hb timeout behavior didn't change without this commit, so revert it. Fixes: 38b52efd218bf ("ocfs2: bump up o2cb network protocol version") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Junxiao Bi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Joseph Qi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19ocfs2: fix start offset to ocfs2_zero_range_for_truncate()Ashish Samant1-10/+24
If we punch a hole on a reflink such that following conditions are met: 1. start offset is on a cluster boundary 2. end offset is not on a cluster boundary 3. (end offset is somewhere in another extent) or (hole range > MAX_CONTIG_BYTES(1MB)), we dont COW the first cluster starting at the start offset. But in this case, we were wrongly passing this cluster to ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster in place and zero it in the source too. Fix this by skipping this cluster in such a scenario. To reproduce: 1. Create a random file of say 10 MB xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile 2. Reflink it reflink -f 10MBfile reflnktest 3. Punch a hole at starting at cluster boundary with range greater that 1MB. You can also use a range that will put the end offset in another extent. fallocate -p -o 0 -l 1048615 reflnktest 4. sync 5. Check the first cluster in the source file. (It will be zeroed out). dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ashish Samant <[email protected]> Reported-by: Saar Maoz <[email protected]> Reviewed-by: Srinivas Eeda <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Cc: Eric Ren <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19cgroup: duplicate cgroup reference when cloning socketsJohannes Weiner2-1/+10
When a socket is cloned, the associated sock_cgroup_data is duplicated but not its reference on the cgroup. As a result, the cgroup reference count will underflow when both sockets are destroyed later on. Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: <[email protected]> [4.5+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accountingJohannes Weiner1-9/+22
During cgroup2 rollout into production, we started encountering css refcount underflows and css access crashes in the memory controller. Splitting the heavily shared css reference counter into logical users narrowed the imbalance down to the cgroup2 socket memory accounting. The problem turns out to be the per-cpu charge cache. Cgroup1 had a separate socket counter, but the new cgroup2 socket accounting goes through the common charge path that uses a shared per-cpu cache for all memory that is being tracked. Those caches are safe against scheduling preemption, but not against interrupts - such as the newly added packet receive path. When cache draining is interrupted by network RX taking pages out of the cache, the resuming drain operation will put references of in-use pages, thus causing the imbalance. Disable IRQs during all per-cpu charge cache operations. Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified hierarchy memory controller") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: <[email protected]> [4.5+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19ocfs2: fix double unlock in case retry after free truncate logJoseph Qi1-2/+12
If ocfs2_reserve_cluster_bitmap_bits() fails with ENOSPC, it will try to free truncate log and then retry. Since ocfs2_try_to_free_truncate_log will lock/unlock global bitmap inode, we have to unlock it before calling this function. But when retry reserve and it fails with no global bitmap inode lock taken, it will unlock again in error handling branch and BUG. This issue also exists if no need retry and then ocfs2_inode_lock fails. So fix it. Fixes: 2070ad1aebff ("ocfs2: retry on ENOSPC if sufficient space in truncate log") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joseph Qi <[email protected]> Signed-off-by: Jiufei Xue <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19fanotify: fix list corruption in fanotify_get_response()Jan Kara4-42/+25
fanotify_get_response() calls fsnotify_remove_event() when it finds that group is being released from fanotify_release() (bypass_perm is set). However the event it removes need not be only in the group's notification queue but it can have already moved to access_list (userspace read the event before closing the fanotify instance fd) which is protected by a different lock. Thus when fsnotify_remove_event() races with fanotify_release() operating on access_list, the list can get corrupted. Fix the problem by moving all the logic removing permission events from the lists to one place - fanotify_release(). Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Reported-by: Miklos Szeredi <[email protected]> Tested-by: Miklos Szeredi <[email protected]> Reviewed-by: Miklos Szeredi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19fsnotify: add a way to stop queueing events on group shutdownJan Kara3-1/+29
Implement a function that can be called when a group is being shutdown to stop queueing new events to the group. Fanotify will use this. Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Miklos Szeredi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19ocfs2: fix trans extend while free cached blocksJunxiao Bi1-18/+9
The root cause of this issue is the same with the one fixed by the last patch, but this time credits for allocator inode and group descriptor may not be consumed before trans extend. The following error was caught: WARNING: CPU: 0 PID: 2037 at fs/jbd2/transaction.c:269 start_this_handle+0x4c3/0x510 [jbd2]() Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_netfront parport_pc parport pcspkr i2c_piix4 i2c_core acpi_cpufreq ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod CPU: 0 PID: 2037 Comm: rm Tainted: G W 4.1.12-37.6.3.el6uek.bug24573128v2.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016 Call Trace: dump_stack+0x48/0x5c warn_slowpath_common+0x95/0xe0 warn_slowpath_null+0x1a/0x20 start_this_handle+0x4c3/0x510 [jbd2] jbd2__journal_restart+0x161/0x1b0 [jbd2] jbd2_journal_restart+0x13/0x20 [jbd2] ocfs2_extend_trans+0x74/0x220 [ocfs2] ocfs2_free_cached_blocks+0x16b/0x4e0 [ocfs2] ocfs2_run_deallocs+0x70/0x270 [ocfs2] ocfs2_commit_truncate+0x474/0x6f0 [ocfs2] ocfs2_truncate_for_delete+0xbd/0x380 [ocfs2] ocfs2_wipe_inode+0x136/0x6a0 [ocfs2] ocfs2_delete_inode+0x2a2/0x3e0 [ocfs2] ocfs2_evict_inode+0x28/0x60 [ocfs2] evict+0xab/0x1a0 iput_final+0xf6/0x190 iput+0xc8/0xe0 do_unlinkat+0x1b7/0x310 SyS_unlinkat+0x22/0x40 system_call_fastpath+0x12/0x71 ---[ end trace a62437cb060baa71 ]--- JBD2: rm wants too many credits (149 > 128) Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19ocfs2: fix trans extend while flush truncate logJunxiao Bi1-19/+10
Every time, ocfs2_extend_trans() included a credit for truncate log inode, but as that inode had been managed by jbd2 running transaction first time, it will not consume that credit until jbd2_journal_restart(). Since total credits to extend always included the un-consumed ones, there will be more and more un-consumed credit, at last jbd2_journal_restart() will fail due to credit number over the half of max transction credit. The following error was caught when unlinking a large file with many extents: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 13626 at fs/jbd2/transaction.c:269 start_this_handle+0x4c3/0x510 [jbd2]() Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport pcspkr i2c_piix4 i2c_core acpi_cpufreq ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod CPU: 0 PID: 13626 Comm: unlink Tainted: G W 4.1.12-37.6.3.el6uek.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016 Call Trace: dump_stack+0x48/0x5c warn_slowpath_common+0x95/0xe0 warn_slowpath_null+0x1a/0x20 start_this_handle+0x4c3/0x510 [jbd2] jbd2__journal_restart+0x161/0x1b0 [jbd2] jbd2_journal_restart+0x13/0x20 [jbd2] ocfs2_extend_trans+0x74/0x220 [ocfs2] ocfs2_replay_truncate_records+0x93/0x360 [ocfs2] __ocfs2_flush_truncate_log+0x13e/0x3a0 [ocfs2] ocfs2_remove_btree_range+0x458/0x7f0 [ocfs2] ocfs2_commit_truncate+0x1b3/0x6f0 [ocfs2] ocfs2_truncate_for_delete+0xbd/0x380 [ocfs2] ocfs2_wipe_inode+0x136/0x6a0 [ocfs2] ocfs2_delete_inode+0x2a2/0x3e0 [ocfs2] ocfs2_evict_inode+0x28/0x60 [ocfs2] evict+0xab/0x1a0 iput_final+0xf6/0x190 iput+0xc8/0xe0 do_unlinkat+0x1b7/0x310 SyS_unlink+0x16/0x20 system_call_fastpath+0x12/0x71 ---[ end trace 28aa7410e69369cf ]--- JBD2: unlink wants too many credits (251 > 128) Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-09-19ipc/shm: fix crash if CONFIG_SHMEM is not setKirill A. Shutemov1-0/+9
Commit c01d5b300774 ("shmem: get_unmapped_area align huge page") makes use of shm_get_unmapped_area() in shm_file_operations() unconditional to CONFIG_MMU. As Tony Battersby pointed this can lead NULL-pointer dereference on machine with CONFIG_MMU=y and CONFIG_SHMEM=n. In this case ipc/shm is backed by ramfs which doesn't provide f_op->get_unmapped_area for configurations with MMU. The solution is to provide dummy f_op->get_unmapped_area for ramfs when CONFIG_MMU=y, which just call current->mm->get_unmapped_area(). Fixes: c01d5b300774 ("shmem: get_unmapped_area align huge page") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Kirill A. Shutemov <[email protected]> Reported-by: Tony Battersby <[email protected]> Tested-by: Tony Battersby <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: <[email protected]> [4.7.x] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>