aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-08-09mac80211: add api to start ba session timer expired flowNaftali Goldstein2-1/+36
Some drivers handle rx buffer reordering internally (and by extension handle also the rx ba session timer internally), but do not ofload the addba/delba negotiation. Add an api for these drivers to properly tear-down the ba session, including sending a delba. Signed-off-by: Naftali Goldstein <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-08-09iwlwifi: mvm: don't WARN when a legit race happens in A-MPDUEmmanuel Grumbach1-4/+6
When we start an Rx A-MPDU session, we first get the AddBA request, then we send an ADD_STA command to the firmware that will reply with a BAID which is a hardware resource that tracks the BA session. This BAID will appear on each and every frame that we get from the firwmare until the A-MPDU session is torn down. In the Rx path, we look at this BAID to manage the reordering buffer. This flow is inherently racy since the hardware will start to put the BAID in the frames it receives even if the firmware hasn't sent the response to the ADD_STA command. This basically means that the driver can get frames with a valid BAID that it doesn't know yet. When that happens, the driver used to WARN. Fix this by simply not WARN in this case. When the driver will know abou the BAID, it will initialise the relevant states and the next frame with a valid BAID will refresh them. Fixes: b915c10174fb ("iwlwifi: mvm: add reorder buffer per queue") Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-08-09iwlwifi: mvm: start mac queues when deferred tx frames are purgedAvraham Stern1-1/+11
In AP mode, if a station is removed just as it is adding a new stream, the queue in question will remain stopped and no more TX will happen in this queue, leading to connection failures and other problems. This is because under DQA, when tx is deferred because a queue needs to be allocated, the mac queue for that TID is stopped until the new stream is added. If at this point the station that this stream belongs to is removed, all the deferred tx frames are purged, but the mac queue is not restarted. As a result, all following tx on this queue will not be transmitted. Fix this by starting the relevant mac queues when the deferred tx frames are purged. Fixes: 24afba7690e4 ("iwlwifi: mvm: support bss dynamic alloc/dealloc of queues") Signed-off-by: Avraham Stern <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-08-08net: avoid skb_warn_bad_offload false positives on UFOWillem de Bruijn3-3/+3
skb_warn_bad_offload triggers a warning when an skb enters the GSO stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL checksum offload set. Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") observed that SKB_GSO_DODGY producers can trigger the check and that passing those packets through the GSO handlers will fix it up. But, the software UFO handler will set ip_summed to CHECKSUM_NONE. When __skb_gso_segment is called from the receive path, this triggers the warning again. Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On Tx these two are equivalent. On Rx, this better matches the skb state (checksum computed), as CHECKSUM_NONE here means no checksum computed. See also this thread for context: http://patchwork.ozlabs.org/patch/799015/ Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08qmi_wwan: fix NULL deref on disconnectBjørn Mork1-1/+5
qmi_wwan_disconnect is called twice when disconnecting devices with separate control and data interfaces. The first invocation will set the interface data to NULL for both interfaces to flag that the disconnect has been handled. But the matching NULL check was left out when qmi_wwan_disconnect was added, resulting in this oops: usb 2-1.4: USB disconnect, device number 4 qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] PGD 0 P4D 0 Oops: 0000 [#1] SMP Modules linked in: <stripped irrelevant module list> CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G E 4.12.3-nr44-normandy-r1500619820+ #1 Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017 Workqueue: usb_hub_wq hub_event [usbcore] task: ffff8c882b716040 task.stack: ffffb8e800d84000 RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400 RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000 R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8 R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0 Call Trace: ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan] ? usbnet_disconnect+0x6c/0xf0 [usbnet] ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan] ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 Reported-and-tested-by: Nathaniel Roach <[email protected]> Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Cc: Daniele Palmas <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08ppp: fix xmit recursion detection on ppp channelsGuillaume Nault1-8/+10
Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") dropped the xmit_recursion counter incrementation in ppp_channel_push() and relied on ppp_xmit_process() for this task. But __ppp_channel_push() can also send packets directly (using the .start_xmit() channel callback), in which case the xmit_recursion counter isn't incremented anymore. If such packets get routed back to the parent ppp unit, ppp_xmit_process() won't notice the recursion and will call ppp_channel_push() on the same channel, effectively creating the deadlock situation that the xmit_recursion mechanism was supposed to prevent. This patch re-introduces the xmit_recursion counter incrementation in ppp_channel_push(). Since the xmit_recursion variable is now part of the parent ppp unit, incrementation is skipped if the channel doesn't have any. This is fine because only packets routed through the parent unit may enter the channel recursively. Finally, we have to ensure that pch->ppp is not going to be modified while executing ppp_channel_push(). Instead of taking this lock only while calling ppp_xmit_process(), we now have to hold it for the full ppp_channel_push() execution. This respects the ppp locks ordering which requires locking ->upl before ->downl. Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08rds: Reintroduce statistics countingHåkon Bugge1-1/+4
In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection while senders are present"), refilling the receive queue was removed from rds_ib_recv(), along with the increment of s_ib_rx_refill_from_thread. Commit 73ce4317bf98 ("RDS: make sure we post recv buffers") re-introduces filling the receive queue from rds_ib_recv(), but does not add the statistics counter. rds_ib_recv() was later renamed to rds_ib_recv_path(). This commit reintroduces the statistics counting of s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq. Signed-off-by: Håkon Bugge <[email protected]> Reviewed-by: Knut Omang <[email protected]> Reviewed-by: Wei Lin Guay <[email protected]> Reviewed-by: Shamir Rabinovitch <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08tcp: fastopen: tcp_connect() must refresh the routeEric Dumazet1-0/+4
With new TCP_FASTOPEN_CONNECT socket option, there is a possibility to call tcp_connect() while socket sk_dst_cache is either NULL or invalid. +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4 +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 connect(4, ..., ...) = 0 << sk->sk_dst_cache becomes obsolete, or even set to NULL >> +1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000 We need to refresh the route otherwise bad things can happen, especially when syzkaller is running on the host :/ Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support") Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Wei Wang <[email protected]> Cc: Yuchung Cheng <[email protected]> Acked-by: Wei Wang <[email protected]> Acked-by: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08net: sched: set xt_tgchk_param par.net properly in ipt_init_targetXin Long1-10/+10
Now xt_tgchk_param par in ipt_init_target is a local varibale, par.net is not initialized there. Later when xt_check_target calls target's checkentry in which it may access par.net, it would cause kernel panic. Jaroslav found this panic when running: # ip link add TestIface type dummy # tc qd add dev TestIface ingress handle ffff: # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \ action xt -j CONNMARK --set-mark 4 This patch is to pass net param into ipt_init_target and set par.net with it properly in there. v1->v2: As Wang Cong pointed, I missed ipt_net_id != xt_net_id, so fix it by also passing net_id to __tcf_ipt_init. v2->v3: Missed the fixes tag, so add it. Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put") Reported-by: Jaroslav Aster <[email protected]> Signed-off-by: Xin Long <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08net: dsa: mediatek: add adjust link support for user portsJohn Crispin2-0/+39
Manually adjust the port settings of user ports once PHY polling has completed. This patch extends the adjust_link callback to configure the per port PMCR register, applying the proper values polled from the PHY. Without this patch flow control was not always getting setup properly. Signed-off-by: Shashidhar Lakkavalli <[email protected]> Signed-off-by: Muciri Gatimu <[email protected]> Signed-off-by: John Crispin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-08net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packetsDavide Caratti1-11/+18
if the NIC fails to validate the checksum on TCP/UDP, and validation of IP checksum is successful, the driver subtracts the pseudo-header checksum from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails. V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set Reported-by: Shuang Li <[email protected]> Fixes: f8c6455bb04b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-09drm/exynos: forbid creating framebuffers from too small GEM buffersMarek Szyprowski1-1/+13
Add a check if the framebuffer described by the provided drm_mode_fb_cmd2 structure fits into provided GEM buffers. Without this check it is possible to create a framebuffer object from a small buffer and set it to the hardware, what results in displaying system memory outside the allocated GEM buffer. Signed-off-by: Marek Szyprowski <[email protected]> Reviewed-by: Tobias Jakobi <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-08-08nfs/flexfiles: fix leak of nfs4_ff_ds_version arraysWeston Andros Adamson1-0/+1
The client was freeing the nfs4_ff_layout_ds, but not the contained nfs4_ff_ds_version array. Signed-off-by: Weston Andros Adamson <[email protected]> Cc: [email protected] # v4.0+ Signed-off-by: Anna Schumaker <[email protected]>
2017-08-08Merge tag 'for-linus' of ↵Linus Torvalds13-53/+102
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "Third set of -rc fixes for 4.13 cycle - small set of miscellanous fixes - a reasonably sizable set of IPoIB fixes that deal with multiple long standing issues" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/hns: checking for IS_ERR() instead of NULL RDMA/mlx5: Fix existence check for extended address vector IB/uverbs: Fix device cleanup RDMA/uverbs: Prevent leak of reserved field IB/core: Fix race condition in resolving IP to MAC IB/ipoib: Notify on modify QP failure only when relevant Revert "IB/core: Allow QP state transition from reset to error" IB/ipoib: Remove double pointer assigning IB/ipoib: Clean error paths in add port IB/ipoib: Add get statistics support to SRIOV VF IB/ipoib: Add multicast packets statistics IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp IB/ipoib: Make sure no in-flight joins while leaving that mcast IB/ipoib: Use cancel_delayed_work_sync when needed IB/ipoib: Fix race between light events and interface restart
2017-08-08parse-maintainers: Move matching sections from MAINTAINERSJoe Perches1-0/+12
Allow any number of command line arguments to match either the section header or the section contents and create new files. Create MAINTAINERS.new and SECTION.new. This allows scripting of the movement of various sections from MAINTAINERS. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08parse-maintainers: Use perl hash references and specific filenamesJoe Perches1-23/+34
Instead of reading STDIN and writing STDOUT, use specific filenames of MAINTAINERS and MAINTAINERS.new. Use hash references instead of global hash %hash so future modifications can read and write specific hashes to split up MAINTAINERS into multiple files using a script. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08parse-maintainers: Add section pattern sortingJoe Perches1-21/+49
Section [A-Z]: patterns are not currently in any required sorting order. Add a specific sorting sequence to MAINTAINERS entries. Sort F: and X: patterns in alphabetic order. The preferred section ordering is: SECTION HEADER M: Maintainers R: Reviewers P: Named persons without email addresses L: Mailing list addresses S: Status of this section (Supported, Maintained, Orphan, etc...) W: Any relevant URLs T: Source code control type (git, quilt, etc) Q: Patchwork patch acceptance queue site B: Bug tracking URIs C: Chat URIs F: Files with wildcard patterns (alphabetic ordered) X: Excluded files with wildcard patterns (alphabetic ordered) N: Files with regex patterns K: Keyword regexes in source code for maintainership identification Miscellaneous perl neatening: - Rename %map to %hash, map has a different meaning in perl - Avoid using \& and local variables for function indirection - Use return for a little c like clarity - Use c-like function call style instead of &function Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08get_maintainer: Prepare for separate MAINTAINERS filesJoe Perches1-25/+66
Allow for MAINTAINERS to become a directory and if it is, read all the files in the directory for maintained sections. Optionally look for all files named MAINTAINERS in directories excluding the .git directory by using --find-maintainer-files. This optional feature adds ~.3 seconds of CPU on an Intel i5-6200 with an SSD. Miscellanea: - Create a read_maintainer_file subroutine from the existing code - Test only the existence of MAINTAINERS, not whether it's a file Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08MAINTAINERS: openbmc mailing list is moderatedRandy Dunlap1-1/+1
The openbmc mailing list is moderated for non-subscribers. Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Brendan Higgins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Joel Stanley <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08MAINTAINERS: greybus: Fix typo s/LOOBACK/LOOPBACKSedat Dilek1-1/+1
Fixes: f47e07bc5f1a5c48 ("Fix up MAINTAINERS file problems") Cc: Joe Perches <[email protected]> Signed-off-by: Sedat Dilek <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08mmc: mmc: correct the logic for setting HS400ES signal voltageHaibo Chen1-1/+1
Change the default err value to -EINVAL, make sure the card only has type EXT_CSD_CARD_TYPE_HS400_1_8V also do the signal voltage setting when select hs400es mode. Fixes: commit 1720d3545b77 ("mmc: core: switch to 1V8 or 1V2 for hs400es mode") Cc: <[email protected]> Signed-off-by: Haibo Chen <[email protected]> Reviewed-by: Shawn Lin <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2017-08-08Merge tag 'scsi-fixes' of ↵Linus Torvalds8-171/+69
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, one re-fix of a previous fix and five patches sorting out hotplug in the bnx2X class of drivers. The latter is rather involved, but necessary because these drivers have started dropping lockdep recursion warnings on the hotplug lock because of its conversion to a percpu rwsem" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sg: only check for dxfer_len greater than 256M scsi: aacraid: reading out of bounds scsi: qedf: Limit number of CQs scsi: bnx2i: Simplify cpu hotplug code scsi: bnx2fc: Simplify CPU hotplug code scsi: bnx2i: Prevent recursive cpuhotplug locking scsi: bnx2fc: Prevent recursive cpuhotplug locking scsi: bnx2fc: Plug CPU hotplug race
2017-08-08random: fix warning message on ia64 and pariscHelge Deller1-1/+1
Fix the warning message on the parisc and IA64 architectures to show the correct function name of the caller by using %pS instead of %pF. The message is printed with the value of _RET_IP_ which calls __builtin_return_address(0) and as such returns the IP address caller instead of pointer to a function descriptor of the caller. The effect of this patch is visible on the parisc and ia64 architectures only since those are the ones which use function descriptors while on all others %pS and %pF will behave the same. Cc: Theodore Ts'o <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Signed-off-by: Helge Deller <[email protected]> Fixes: eecabf567422 ("random: suppress spammy warnings about unseeded randomness") Fixes: d06bfd1989fe ("random: warn when kernel uses unseeded randomness") Signed-off-by: Linus Torvalds <[email protected]>
2017-08-08scsi: ses: Fix wrong page errorBrian King1-1/+1
If a SES device returns an error on a requested diagnostic page, we are currently printing an error indicating the wrong page was received. Fix this up to simply return a failure and only check the returned page when the diagnostic page buffer was populated by the device. Signed-off-by: Brian King <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2017-08-08scsi: ipr: Fix scsi-mq lockdep issueBrian King2-14/+21
Fixes the following lockdep warning that can occur when scsi-mq is enabled with ipr due to ipr calling scsi_unblock_requests from irq context. The fix is to move the call to scsi_unblock_requests to ipr's existing workqueue. stack backtrace: CPU: 28 PID: 0 Comm: swapper/28 Not tainted 4.13.0-rc2-gcc6x-gf74c89b #1 Call Trace: [c000001fffe97550] [c000000000b50818] dump_stack+0xe8/0x160 (unreliable) [c000001fffe97590] [c0000000001586d0] print_usage_bug+0x2d0/0x390 [c000001fffe97640] [c000000000158f34] mark_lock+0x7a4/0x8e0 [c000001fffe976f0] [c00000000015a000] __lock_acquire+0x6a0/0x1a70 [c000001fffe97860] [c00000000015befc] lock_acquire+0xec/0x2e0 [c000001fffe97930] [c000000000b71514] _raw_spin_lock+0x44/0x70 [c000001fffe97960] [c0000000005b60f4] blk_mq_sched_dispatch_requests+0xa4/0x2a0 [c000001fffe979c0] [c0000000005acac0] __blk_mq_run_hw_queue+0x100/0x2c0 [c000001fffe97a00] [c0000000005ad478] __blk_mq_delay_run_hw_queue+0x118/0x130 [c000001fffe97a40] [c0000000005ad61c] blk_mq_start_hw_queues+0x6c/0xa0 [c000001fffe97a80] [c000000000797aac] scsi_kick_queue+0x2c/0x60 [c000001fffe97aa0] [c000000000797cf0] scsi_run_queue+0x210/0x360 [c000001fffe97b10] [c00000000079b888] scsi_run_host_queues+0x48/0x80 [c000001fffe97b40] [c0000000007b6090] ipr_ioa_bringdown_done+0x70/0x1e0 [c000001fffe97bc0] [c0000000007bc860] ipr_reset_ioa_job+0x80/0xf0 [c000001fffe97bf0] [c0000000007b4d50] ipr_reset_timer_done+0xd0/0x100 [c000001fffe97c30] [c0000000001937bc] call_timer_fn+0xdc/0x4b0 [c000001fffe97cf0] [c000000000193d08] expire_timers+0x178/0x330 [c000001fffe97d60] [c0000000001940c8] run_timer_softirq+0xb8/0x120 [c000001fffe97de0] [c000000000b726a8] __do_softirq+0x168/0x6d8 [c000001fffe97ef0] [c0000000000df2c8] irq_exit+0x108/0x150 [c000001fffe97f10] [c000000000017bf4] __do_irq+0x2a4/0x4a0 [c000001fffe97f90] [c00000000002da50] call_do_irq+0x14/0x24 [c0000007fad93aa0] [c000000000017e8c] do_IRQ+0x9c/0x140 [c0000007fad93af0] [c000000000008b98] hardware_interrupt_common+0x138/0x140 Reported-by: Michael Ellerman <[email protected]> Signed-off-by: Brian King <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2017-08-08scsi: st: fix blk_get_queue usageBodo Stroesser1-2/+2
If blk_queue_get() in st_probe fails, disk->queue must not be set to SDp->request_queue, as that would result in put_disk() dropping a not taken reference. Thus, disk->queue should be set only after a successful blk_queue_get(). Fixes: 2b5bebccd282 ("st: Take additional queue ref in st_probe") Signed-off-by: Bodo Stroesser <[email protected]> Acked-by: Shirish Pargaonkar <[email protected]> Signed-off-by: Hannes Reinecke <[email protected]> Reviewed-by: Ewan D. Milne <[email protected]> Acked-by: Kai Mäkisara <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2017-08-08scsi: qla2xxx: Fix system crash while triggering FW dumpMichael Hernandez1-12/+0
This patch fixes system hang/crash while firmware dump is attempted with Block MQ enabled in qla2xxx driver. Fix is to remove check in fw dump template entries for existing request and response queues so that full buffer size is calculated during template size calculation. Following stack trace is seen during firmware dump capture process [ 694.390588] qla2xxx [0000:81:00.0]-5003:11: ISP System Error - mbx1=4b1fh mbx2=10h mbx3=2ah mbx7=0h. [ 694.402336] BUG: unable to handle kernel paging request at ffffc90008c7b000 [ 694.402372] IP: memcpy_erms+0x6/0x10 [ 694.402386] PGD 105f01a067 [ 694.402386] PUD 85f89c067 [ 694.402398] PMD 10490cb067 [ 694.402409] PTE 0 [ 694.402421] [ 694.402437] Oops: 0002 [#1] PREEMPT SMP [ 694.402452] Modules linked in: netconsole configfs qla2xxx scsi_transport_fc nvme_fc nvme_fabrics bnep bluetooth rfkill xt_tcpudp unix_diag xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ipmi_ssif sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass igb crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel iTCO_wdt aes_x86_64 crypto_simd ptp iTCO_vendor_support glue_helper cryptd lpc_ich joydev i2c_i801 pcspkr ioatdma mei_me pps_core tpm_tis mei mfd_core acpi_power_meter tpm_tis_core ipmi_si ipmi_devintf tpm ipmi_msghandler shpchp wmi dca button acpi_pad btrfs xor uas usb_storage hid_generic usbhid raid6_pq crc32c_intel ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect [ 694.402692] sysimgblt fb_sys_fops xhci_pci ttm ehci_pci sr_mod xhci_hcd cdrom ehci_hcd drm usbcore sg [ 694.402730] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-1-default+ #19 [ 694.402753] Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015 [ 694.402776] task: ffffffff81c0e4c0 task.stack: ffffffff81c00000 [ 694.402798] RIP: 0010:memcpy_erms+0x6/0x10 [ 694.402813] RSP: 0018:ffff88085fc03cd0 EFLAGS: 00210006 [ 694.402832] RAX: ffffc90008c7ae0c RBX: 0000000000000004 RCX: 000000000001fe0c [ 694.402856] RDX: 0000000000020000 RSI: ffff8810332c01f4 RDI: ffffc90008c7b000 [ 694.402879] RBP: ffff88085fc03d18 R08: 0000000000020000 R09: 0000000000279e0a [ 694.402903] R10: 0000000000000000 R11: f000000000000000 R12: ffff88085fc03d80 [ 694.402927] R13: ffffc90008a01000 R14: ffffc90008a056d4 R15: ffff881052ef17e0 [ 694.402951] FS: 0000000000000000(0000) GS:ffff88085fc00000(0000) knlGS:0000000000000000 [ 694.402977] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 694.403012] CR2: ffffc90008c7b000 CR3: 0000000001c09000 CR4: 00000000001406f0 [ 694.403036] Call Trace: [ 694.403047] <IRQ> [ 694.403072] ? qla27xx_fwdt_entry_t263+0x18e/0x380 [qla2xxx] [ 694.403099] qla27xx_walk_template+0x9d/0x1a0 [qla2xxx] [ 694.403124] qla27xx_fwdump+0x1f3/0x272 [qla2xxx] [ 694.403149] qla2x00_async_event+0xb08/0x1a50 [qla2xxx] [ 694.403169] ? enqueue_task_fair+0xa2/0x9d0 Signed-off-by: Mike Hernandez <[email protected]> Signed-off-by: Joe Carnuccio <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2017-08-08md/r5cache: fix io_unit handling in r5l_log_endio()Song Liu1-9/+31
In r5l_log_endio(), once log->io_list_lock is released, the io unit may be accessed (or even freed) by other threads. Current code doesn't handle the io_unit properly, which leads to potential race conditions. This patch solves this race condition by: 1. Add a pending_stripe count flush_payload. Multiple flush_payloads are counted as only one pending_stripe. Flag has_flush_payload is added to show whether the io unit has flush_payload; 2. In r5l_log_endio(), check flags has_null_flush and has_flush_payload with log->io_list_lock held. After the lock is released, this IO unit is only accessed when we know the pending_stripe counter cannot be zeroed by other threads. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2017-08-08md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_setSong Liu1-6/+15
In r5c_journal_mode_set(), it is necessary to call mddev_lock() before accessing conf and conf->log. Otherwise, the conf->log may change (and become NULL). Shaohua: fix unlock in failure cases Signed-off-by: Song Liu <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2017-08-08md: fix test in md_write_start()NeilBrown1-1/+1
md_write_start() needs to clear the in_sync flag is it is set, or if there might be a race with set_in_sync() such that the later will set it very soon. In the later case it is sufficient to take the spinlock to synchronize with set_in_sync(), and then set the flag if needed. The current test is incorrect. It should be: if "flag is set" or "race is possible" "flag is set" is trivially "mddev->in_sync". "race is possible" should be tested by "mddev->sync_checkers". If sync_checkers is 0, then there can be no race. set_in_sync() will wait in percpu_ref_switch_to_atomic_sync() for an RCU grace period, and as md_write_start() holds the rcu_read_lock(), set_in_sync() will be sure ot see the update to writes_pending. If sync_checkers is > 0, there could be race. If md_write_start() happened entirely between if (!mddev->in_sync && percpu_ref_is_zero(&mddev->writes_pending)) { and mddev->in_sync = 1; in set_in_sync(), then it would not see that is_sync had been set, and set_in_sync() would not see that writes_pending had been incremented. This bug means that in_sync is sometimes not set when it should be. Consequently there is a small chance that the array will be marked as "clean" when in fact it is inconsistent. Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending") cc: [email protected] (v4.12+) Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2017-08-08md: always clear ->safemode when md_check_recovery gets the mddev lock.NeilBrown1-0/+3
If ->safemode == 1, md_check_recovery() will try to get the mddev lock and perform various other checks. If mddev->in_sync is zero, it will call set_in_sync, and clear ->safemode. However if mddev->in_sync is not zero, ->safemode will not be cleared. When md_check_recovery() drops the mddev lock, the thread is woken up again. Normally it would just check if there was anything else to do, find nothing, and go to sleep. However as ->safemode was not cleared, it will take the mddev lock again, then wake itself up when unlocking. This results in an infinite loop, repeatedly calling md_check_recovery(), which RCU or the soft-lockup detector will eventually complain about. Prior to commit 4ad23a976413 ("MD: use per-cpu counter for writes_pending"), safemode would only be set to one when the writes_pending counter reached zero, and would be cleared again when writes_pending is incremented. Since that patch, safemode is set more freely, but is not reliably cleared. So in md_check_recovery() clear ->safemode before checking ->in_sync. Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending") Cc: [email protected] (4.12+) Reported-by: Dominik Brodowski <[email protected]> Reported-by: David R <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2017-08-08drm/etnaviv: Fix off-by-one error in reloc checkingWladimir J. van der Laan1-2/+2
A relocation pointing to the last four bytes of a buffer can legitimately happen in the case of small vertex buffers. CC: [email protected] #4.9+ Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Philipp Zabel <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
2017-08-08MIPS: Set ISA bit in entry-y for microMIPS kernelsPaul Burton1-1/+14
When building a kernel for the microMIPS ISA, ensure that the ISA bit (ie. bit 0) in the entry address is set. Otherwise we may include an entry address in images which bootloaders will jump to as MIPS32 code. I originally tried using "objdump -f" to obtain the entry address, which works for microMIPS but it always outputs a 32 bit address for a 32 bit ELF whilst nm will sign extend to 64 bit. That matters for systems where we might want to run a MIPS32 kernel on a MIPS64 CPU & load it with a MIPS64 bootloader, which would then jump to a non-canonical (non-sign-extended) address. This works in all cases as it only changes the behaviour for microMIPS kernels, but isn't the prettiest solution. A possible alternative would be to write a custom tool to just extract, sign extend & print the entry point of an ELF executable. I'm open to feedback if that would be preferred. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/16950/ Signed-off-by: Ralf Baechle <[email protected]>
2017-08-08MIPS: Prevent building MT support for microMIPS kernelsPaul Burton1-1/+1
We don't currently support the MT ASE for microMIPS kernels, and there are no CPUs currently in existence that use both. They can however both be enabled in Kconfig, resulting in build failures such as: AS arch/mips/kernel/cps-vec.o arch/mips/kernel/cps-vec.S: Assembler messages: arch/mips/kernel/cps-vec.S:242: Warning: the 32-bit microMIPS architecture does not support the `mt' extension arch/mips/kernel/cps-vec.S:276: Error: unrecognized opcode `mttc0 $13,$2,2' arch/mips/kernel/cps-vec.S:282: Error: unrecognized opcode `mttc0 $8,$1,2' arch/mips/kernel/cps-vec.S:285: Error: unrecognized opcode `mttc0 $0,$2,1' ... Fix this by preventing MT from being enabled when targeting microMIPS. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/16951/ Signed-off-by: Ralf Baechle <[email protected]>
2017-08-08powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api failsGautham R. Shenoy2-3/+48
Currently, we use the opal call opal_slw_set_reg() to inform the Sleep-Winkle Engine (SLW) to restore the contents of some of the Hypervisor state on wakeup from deep idle states that lose full hypervisor context (characterized by the flag OPAL_PM_LOSE_FULL_CONTEXT). However, the current code has a bug in that if opal_slw_set_reg() fails, we don't disable the use of these deep states (winkle on POWER8, stop4 onwards on POWER9). This patch fixes this bug by ensuring that if programing the sleep-winkle engine to restore the hypervisor states in pnv_save_sprs_for_deep_states() fails, then we exclude such states by clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from supported_cpuidle_states. As a result POWER8 will be prevented from using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to the default stop state when available. Further, we ensure in the initialization of the cpuidle-powernv driver to only include those states whose flags are present in supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT states when they have been disabled due to stop-api failure. Fixes: 1e1601b38e6 ("powerpc/powernv/idle: Restore SPRs for deep idle states via stop API.") Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-08mmc: host: omap_hsmmc: Add CMD23 capability to omap_hsmmc driverKishon Vijay Abraham I1-1/+1
omap_hsmmc driver always relied on CMD12 to stop transmission. However if CMD12 is not issued at the correct timing, the card will indicate a out of range error. With certain cards in some of the DRA7 based boards, -EIO error is observed. By Adding CMD23 capability, the MMC core will send MMC_SET_BLOCK_COUNT command before MMC_READ_MULTIPLE_BLOCK/MMC_WRITE_MULTIPLE_BLOCK commands. commit a04e6bae9e6f12 ("mmc: core: check also R1 response for stop commands") exposed this bug in omap_hsmmc driver. Signed-off-by: Kishon Vijay Abraham I <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2017-08-07Merge tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds5-43/+10
Pull Xtensa fixes from Max Filippov: - use asm-generic instances of asm/param.h and asm/device.h instead of exact copies in arch/xtensa/include/asm; - fix build error for xtensa cores with aliasing WT cache: define cache flushing functions and copy_{to,from}_user_page; - add missing EXPORT_SYMBOLs for clear_user_highpage, copy_user_highpage, flush_dcache_page, local_flush_cache_range, local_flush_cache_page, csum_partial and csum_partial_copy_generic. * tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: mm/cache: add missing EXPORT_SYMBOLs xtensa: don't limit csum_partial export by CONFIG_NET xtensa: fix cache aliasing handling code for WT cache xtensa: remove wrapper header for asm/param.h xtensa: remove wrapper header for asm/device.h
2017-08-07Merge tag 'for-linus-20170807' of git://git.infradead.org/linux-mtdLinus Torvalds6-25/+27
Pull MTD fixes from Brian Norris: "I missed getting these out for rc4, but here are some MTD fixes. Just NAND fixes (in both the core handling, and a few drivers). Notes stolen from Boris: Core fixes: - fix data interface setup for ONFI NANDs that do not support the SET FEATURES command - fix a kernel doc header - fix potential integer overflow when retrieving timing information from the parameter page - fix wrong OOB layout for small page NANDs Driver fixes: - fix potential division-by-zero bug - fix backward compat with old atmel-nand DT bindings - fix ->setup_data_interface() in the atmel NAND driver" * tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd: mtd: nand: atmel: Fix EDO mode check mtd: nand: Declare tBERS, tR and tPROG as u64 to avoid integer overflow mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES mtd: nand: Fix a docs build warning mtd: nand: sunxi: fix potential divide-by-zero error nand: fix wrong default oob layout for small pages using soft ecc mtd: nand: atmel: Fix DT backward compatibility in pmecc.c
2017-08-07Merge tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-5/+8
Pull xfs fixes from Darrick Wong: "I have a couple more bug fixes for you today: - fix memory leak when issuing discard - fix propagation of the dax inode flag" * tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Fix per-inode DAX flag inheritance xfs: Fix leak of discard bio
2017-08-08MIPS: PCI: Fix smp_processor_id() in preemptibleMatt Redfearn1-4/+3
Commit 1c3c5eab1715 ("sched/core: Enable might_sleep() and smp_processor_id() checks early") enables checks for might_sleep() and smp_processor_id() being used in preemptible code earlier in the boot than before. This results in a new BUG from pcibios_set_cache_line_size(). BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 caller is pcibios_set_cache_line_size+0x10/0x70 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc1-00007-g3ce3e4ba4275 #615 Stack: 0000000000000000 ffffffff81189694 0000000000000000 ffffffff81822318 000000000000004e 0000000000000001 800000000e20bd08 20c49ba5e3540000 0000000000000000 0000000000000000 ffffffff818d0000 0000000000000000 0000000000000000 ffffffff81189328 ffffffff818ce692 0000000000000000 0000000000000000 ffffffff81189bc8 ffffffff818d0000 0000000000000000 ffffffff81828907 ffffffff81769970 800000020ec78d80 ffffffff818c7b48 0000000000000001 0000000000000001 ffffffff818652b0 ffffffff81896268 ffffffff818c0000 800000020ec7fb40 800000020ec7fc58 ffffffff81684cac 0000000000000000 ffffffff8118ab50 0000000000000030 ffffffff81769970 0000000000000001 ffffffff81122a58 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff81122a58>] show_stack+0x90/0xb0 [<ffffffff81684cac>] dump_stack+0xac/0xf0 [<ffffffff813f7050>] check_preemption_disabled+0x120/0x128 [<ffffffff818855e8>] pcibios_set_cache_line_size+0x10/0x70 [<ffffffff81100578>] do_one_initcall+0x48/0x140 [<ffffffff81865dc4>] kernel_init_freeable+0x194/0x24c [<ffffffff8169c534>] kernel_init+0x14/0x118 [<ffffffff8111ca84>] ret_from_kernel_thread+0x14/0x1c Fix this by using the cpu_*cache_line_size() macros instead. These macros are the "proper" way to determine the CPU cache sizes. This makes use of the newly added cpu_tcache_line_size. Fixes: 1c3c5eab1715 ("sched/core: Enable might_sleep() and smp_processor_id() checks early") Signed-off-by: Matt Redfearn <[email protected]> Suggested-by: James Hogan <[email protected]> Reviewed-by: James Hogan <[email protected]> Signed-off-by: Ralf Baechle <[email protected]>
2017-08-08MIPS: Introduce cpu_tcache_line_sizeMatt Redfearn1-0/+3
There exist macros to return the cache line size of the L1 dcache and L2 scache but there is currently no macro for the L3 tcache. Add this macro which will be used by the following patch "MIPS: PCI: Fix smp_processor_id() in preemptible" Signed-off-by: Matt Redfearn <[email protected]> Cc: Maciej W. Rozycki <[email protected]> Cc: James Hogan <[email protected]> Cc: Paul Burton <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/16871/ Signed-off-by: Ralf Baechle <[email protected]>
2017-08-07qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'Christophe Jaillet1-1/+1
We allocate 'p_info->mfw_mb_cur' and 'p_info->mfw_mb_shadow' but we check 'p_info->mfw_mb_addr' instead of 'p_info->mfw_mb_cur'. 'p_info->mfw_mb_addr' is never 0, because it is initiliazed a few lines above in 'qed_load_mcp_offsets()'. Update the test and check the result of the 2 'kzalloc()' instead. Signed-off-by: Christophe JAILLET <[email protected]> Acked-by: Tomer Tayar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-07test_sysctl: fix sysctl.sh by making it executableLuis R. Rodriguez1-0/+0
We had just forogtten to do this. Without this the following test fails: $ sudo make -C tools/testing/selftests/sysctl/ run_tests make: Entering directory '/home/mcgrof/linux-next/tools/testing/selftests/sysctl' /bin/sh: ./sysctl.sh: Permission denied selftests: sysctl.sh [FAIL] /home/mcgrof/linux-next/tools/testing/selftests/sysctl make: Leaving directory '/home/mcgrof/linux-next/tools/testing/selftests/sysctl' Fixes: 64b671204afd71 ("test_sysctl: add generic script to expand on tests") Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2017-08-07test_kmod: fix kmod.sh by making it executableLuis R. Rodriguez1-0/+0
We had just forgotten to do this. Without this if we run the following we get a permission denied: sudo make -C tools/testing/selftests/kmod/ run_tests make: Entering directory '/home/mcgrof/linux-next/tools/testing/selftests/kmod' /bin/sh: ./kmod.sh: Permission denied selftests: kmod.sh [FAIL] /home/mcgrof/linux-next/tools/testing/selftests/kmod make: Leaving directory '/home/mcgrof/linux-next/tools/testing/selftests/kmod Fixes: 39258f448d71 ("kmod: add test driver to stress test the module loader") Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2017-08-07hysdn: fix to a race condition in put_log_bufferAnton Volkov1-15/+13
The synchronization type that was used earlier to guard the loop that deletes unused log buffers may lead to a situation that prevents any thread from going through the loop. The patch deletes previously used synchronization mechanism and moves the loop under the spin_lock so the similar cases won't be feasible in the future. Found by by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Volkov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-07s390/qeth: fix L3 next-hop in xmit qeth hdrJulian Wiedmann1-2/+2
On L3, the qeth_hdr struct needs to be filled with the next-hop IP address. The current code accesses rtable->rt_gateway without checking that rtable is a valid address. The accidental access to a lowcore area results in a random next-hop address in the qeth_hdr. rtable (or more precisely, skb_dst(skb)) can be NULL in rare cases (for instance together with AF_PACKET sockets). This patch adds the missing NULL-ptr checks. Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: Ursula Braun <[email protected]> Fixes: 87e7597b5a3 qeth: Move away from using neighbour entries in qeth_l3_fill_header() Signed-off-by: David S. Miller <[email protected]>
2017-08-07Merge tag 'rdma-rc-2017-07-26' of ↵Doug Ledford7-33/+50
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma into leon-ipoib IPoIB fixes for 4.13 The patchset provides various fixes for IPoIB. It is combination of fixes to various issues discovered during verification along with static checkers cleanup patches. Most of the patches are from pre-git era and hence lack of Fixes lines. There is one exception in this IPoIB group - addition of patch revert: Revert "IB/core: Allow QP state transition from reset to error", but it followed by proper fix to the annoying print, so I thought it is appropriate to include it. Signed-off-by: Doug Ledford <[email protected]>
2017-08-07Merge branch 'asix-Improve-robustness'David S. Miller3-10/+45
Dean Jenkins says: ==================== asix: Improve robustness Please consider taking these patches to improve the robustness of the ASIX USB to Ethernet driver. Failures prompting an ASIX driver code review ============================================= On an ARM i.MX6 embedded platform some strange one-off and two-off failures were observed in and around the ASIX USB to Ethernet driver. This was observed on a highly modified kernel 3.14 with the ASIX driver containing back-ported changes from kernel.org up to kernel 4.8 approximately. a) A one-off failure in asix_rx_fixup_internal(): There was an occurrence of an attempt to write off the end of the netdev buffer which was trapped by skb_over_panic() in skb_put(). [20030.846440] skbuff: skb_over_panic: text:7f2271c0 len:120 put:60 head:8366ecc0 data:8366ed02 tail:0x8366ed7a end:0x8366ed40 dev:eth0 [20030.863007] Kernel BUG at 8044ce38 [verbose debug info unavailable] [20031.215345] Backtrace: [20031.217884] [<8044cde0>] (skb_panic) from [<8044d50c>] (skb_put+0x50/0x5c) [20031.227408] [<8044d4bc>] (skb_put) from [<7f2271c0>] (asix_rx_fixup_internal+0x1c4/0x23c [asix]) [20031.242024] [<7f226ffc>] (asix_rx_fixup_internal [asix]) from [<7f22724c>] (asix_rx_fixup_common+0x14/0x18 [asix]) [20031.260309] [<7f227238>] (asix_rx_fixup_common [asix]) from [<7f21f7d4>] (usbnet_bh+0x74/0x224 [usbnet]) [20031.269879] [<7f21f760>] (usbnet_bh [usbnet]) from [<8002f834>] (call_timer_fn+0xa4/0x1f0) [20031.283961] [<8002f790>] (call_timer_fn) from [<80030834>] (run_timer_softirq+0x230/0x2a8) [20031.302782] [<80030604>] (run_timer_softirq) from [<80028780>] (__do_softirq+0x15c/0x37c) [20031.321511] [<80028624>] (__do_softirq) from [<80028c38>] (irq_exit+0x8c/0xe8) [20031.339298] [<80028bac>] (irq_exit) from [<8000e9c8>] (handle_IRQ+0x8c/0xc8) [20031.350038] [<8000e93c>] (handle_IRQ) from [<800085c8>] (gic_handle_irq+0xb8/0xf8) [20031.365528] [<80008510>] (gic_handle_irq) from [<8050de80>] (__irq_svc+0x40/0x70) Analysis of the logic of the ASIX driver (containing backported changes from kernel.org up to kernel 4.8 approximately) suggested that the software could not trigger skb_over_panic(). The analysis of the kernel BUG() crash information suggested that the netdev buffer was written with 2 minimal 60 octet length Ethernet frames (ASIX hardware drops the 4 octet FCS field) and the 2nd Ethernet frame attempted to write off the end of the netdev buffer. Note that the netdev buffer should only contain 1 Ethernet frame so if an attempt to write 2 Ethernet frames into the buffer is made then that is wrong. However, the logic of the asix_rx_fixup_internal() only allows 1 Ethernet frame to be written into the netdev buffer. Potentially this failure was due to memory corruption because it was only seen once. b) Two-off failures in the NAPI layer's backlog queue: There were 2 crashes in the NAPI layer's backlog queue presumably after asix_rx_fixup_internal() called usbnet_skb_return(). [24097.273945] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [24097.398944] PC is at process_backlog+0x80/0x16c [24097.569466] Backtrace: [24097.572007] [<8045ad98>] (process_backlog) from [<8045b64c>] (net_rx_action+0xcc/0x248) [24097.591631] [<8045b580>] (net_rx_action) from [<80028780>] (__do_softirq+0x15c/0x37c) [24097.610022] [<80028624>] (__do_softirq) from [<800289cc>] (run_ksoftirqd+0x2c/0x84) and [ 1059.828452] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 1059.953715] PC is at process_backlog+0x84/0x16c [ 1060.140896] Backtrace: [ 1060.143434] [<8045ad98>] (process_backlog) from [<8045b64c>] (net_rx_action+0xcc/0x248) [ 1060.163075] [<8045b580>] (net_rx_action) from [<80028780>] (__do_softirq+0x15c/0x37c) [ 1060.181474] [<80028624>] (__do_softirq) from [<80028c38>] (irq_exit+0x8c/0xe8) [ 1060.199256] [<80028bac>] (irq_exit) from [<8000e9c8>] (handle_IRQ+0x8c/0xc8) [ 1060.210006] [<8000e93c>] (handle_IRQ) from [<800085c8>] (gic_handle_irq+0xb8/0xf8) [ 1060.225492] [<80008510>] (gic_handle_irq) from [<8050de80>] (__irq_svc+0x40/0x70) The embedded board was only using an ASIX USB to Ethernet adaptor eth0. Analysis suggested that the doubly-linked list pointers of the backlog queue had been corrupted because one of the link pointers was NULL. Potentially this failure was due to memory corruption because it was only seen twice. Results of the ASIX driver code review ====================================== During the code review some weaknesses were observed in the ASIX driver and the following patches have been created to improve the robustness. Brief overview of the patches ----------------------------- 1. asix: Add rx->ax_skb = NULL after usbnet_skb_return() The current ASIX driver sends the received Ethernet frame to the NAPI layer of the network stack via the call to usbnet_skb_return() in asix_rx_fixup_internal() but retains the rx->ax_skb pointer to the netdev buffer. The driver no longer needs the rx->ax_skb pointer at this point because the NAPI layer now has the Ethernet frame. This means that asix_rx_fixup_internal() must not use rx->ax_skb after the call to usbnet_skb_return() because it could corrupt the handling of the Ethernet frame within the network layer. Therefore, to remove the risk of erroneous usage of rx->ax_skb, set rx->ax_skb to NULL after the call to usbnet_skb_return(). This avoids potential erroneous freeing of rx->ax_skb and erroneous writing to the netdev buffer. If the software now somehow inappropriately reused rx->ax_skb, then a NULL pointer dereference of rx->ax_skb would occur which makes investigation easier. 2. asix: Ensure asix_rx_fixup_info members are all reset This patch creates reset_asix_rx_fixup_info() to allow all the asix_rx_fixup_info structure members to be consistently reset to initial conditions. Call reset_asix_rx_fixup_info() upon each detectable error condition so that the next URB is processed from a known state. Otherwise, there is a risk that some members of the asix_rx_fixup_info structure may be incorrect after an error occurred so potentially leading to a malfunction. 3. asix: Fix small memory leak in ax88772_unbind() This patch creates asix_rx_fixup_common_free() to allow the rx->ax_skb to be freed when necessary. asix_rx_fixup_common_free() is called from ax88772_unbind() before the parent private data structure is freed. Without this patch, there is a risk of a small netdev buffer memory leak each time ax88772_unbind() is called during the reception of an Ethernet frame that spans across 2 URBs. Testing ======= The patches have been sanity tested on a 64-bit Linux laptop running kernel 4.13-rc2 with the 3 patches applied on top. The ASIX USB to Adaptor used for testing was (output of lsusb): ID 0b95:772b ASIX Electronics Corp. AX88772B Test #1 ------- The test ran a flood ping test script which slowly incremented the ICMP Echo Request's payload from 0 to 5000 octets. This eventually causes IPv4 fragmentation to occur which causes Ethernet frames to be sent very close to each other so increases the probability that an Ethernet frame will span 2 URBs. The test showed that all pings were successful. The test took about 15 minutes to complete. Test #2 ------- A script was run on the laptop to periodically run ifdown and ifup every second so that the ASIX USB to Adaptor was up for 1 second and down for 1 second. From a Linux PC connected to the laptop, the following ping command was used ping -f -s 5000 <ip address of laptop> The large ICMP payload causes IPv4 fragmentation resulting in multiple Ethernet frames per original IP packet. Kernel debug within the ASIX driver was enabled to see whether any ASIX errors were generated. The test was run for about 24 hours and no ASIX errors were seen. Patches ======= The 3 patches have been rebased off the net-next repo master branch with HEAD fbbeefd net: fec: Allow reception of frames bigger than 1522 bytes ==================== Signed-off-by: David S. Miller <[email protected]>
2017-08-07asix: Fix small memory leak in ax88772_unbind()Dean Jenkins3-0/+17
When Ethernet frames span mulitple URBs, the netdev buffer memory pointed to by the asix_rx_fixup_info structure remains allocated during the time gap between the 2 executions of asix_rx_fixup_internal(). This means that if ax88772_unbind() is called within this time gap to free the memory of the parent private data structure then a memory leak of the part filled netdev buffer memory will occur. Therefore, create a new function asix_rx_fixup_common_free() to free the memory of the netdev buffer and add a call to asix_rx_fixup_common_free() from inside ax88772_unbind(). Consequently when an unbind occurs part way through receiving an Ethernet frame, the netdev buffer memory that is holding part of the received Ethernet frame will now be freed. Signed-off-by: Dean Jenkins <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-08-07asix: Ensure asix_rx_fixup_info members are all resetDean Jenkins1-9/+25
There is a risk that the members of the structure asix_rx_fixup_info become unsynchronised leading to the possibility of a malfunction. For example, rx->split_head was not being set to false after an error was detected so potentially could cause a malformed 32-bit Data header word to be formed. Therefore add function reset_asix_rx_fixup_info() to reset all the members of asix_rx_fixup_info so that future processing will start with known initial conditions. Also, if (skb->len != offset) becomes true then call reset_asix_rx_fixup_info() so that the processing of the next URB starts with known initial conditions. Without the call, the check does nothing which potentially could lead to a malfunction when the next URB is processed. In addition, for robustness, call reset_asix_rx_fixup_info() before every error path's "return 0". This ensures that the next URB is processed from known initial conditions. Signed-off-by: Dean Jenkins <[email protected]> Signed-off-by: David S. Miller <[email protected]>