aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-01-04can: vxcan: improve handling of missing peer name attributeOliver Hartkopp1-1/+1
Picking up the patch from Serhey Popovych (commit 191cdb3822e5df6b3c8, "veth: Be more robust on network device creation when no attributes"). When the peer name attribute is not provided the former implementation tries to register the given device name twice ... which leads to -EEXIST. If only one device name is given apply an automatic generated and valid name for the peer. Cc: Serhey Popovych <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2018-01-03Merge branch '40GbE' of ↵David S. Miller3-17/+72
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-01-03 This series contains fixes for i40e and i40evf. Amritha removes the UDP support for big buffer cloud filters since it is not supported and having UDP enabled is a bug. Alex fixes a bug in the __i40e_chk_linearize() which did not take into account large (16K or larger) fragments that are split over 2 descriptors, which could result in a transmit hang. Jake fixes an issue where a devices own MAC address could be removed from the unicast address list, so force a check on every address sync to ensure removal does not happen. Jiri Pirko fixes the return value when a filter configuration is not supported, do not return "invalid" but return "not supported" so that the core can react correctly. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-01-033c59x: fix missing dma_mapping_error check and bad ring refill logicNeil Horman1-52/+38
A few spots in 3c59x missed calls to dma_mapping_error checks, casuing WARN_ONS to trigger. Clean those up. While we're at it, refactor the refill code a bit so that if skb allocation or dma mapping fails, we recycle the existing buffer. This prevents holes in the rx ring, and makes for much simpler logic Note: This is compile only tested. Ted, if you could run this and confirm that it continues to work properly, I would appreciate it, as I currently don't have access to this hardware Signed-off-by: Neil Horman <[email protected]> CC: Steffen Klassert <[email protected]> CC: "David S. Miller" <[email protected]> Reported-by: [email protected] Signed-off-by: David S. Miller <[email protected]>
2018-01-03Merge branch 'ena-fixes'David S. Miller1-15/+30
Netanel Belgazal says: ==================== bug fixes for ENA Ethernet driver Changes from V1: Revome incorrect "ena: invoke netif_carrier_off() only after netdev registered" patch This patchset contains 2 bug fixes: * handle rare race condition during MSI-X initialization * fix error processing in ena_down() ==================== Signed-off-by: David S. Miller <[email protected]>
2018-01-03net: ena: fix error handling in ena_down() sequenceNetanel Belgazal1-2/+17
ENA admin command queue errors are not handled as part of ena_down(). As a result, in case of error admin queue transitions to non-running state and aborts all subsequent commands including those coming from ena_up(). Reset scheduled by the driver from the timer service context would not proceed due to sharing rtnl with ena_up()/ena_down() Signed-off-by: Netanel Belgazal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03net: ena: unmask MSI-X only after device initialization is completedNetanel Belgazal1-13/+13
Under certain conditions MSI-X interrupt might arrive right after it was unmasked in ena_up(). There is a chance it would be processed by the driver before device ENA_FLAG_DEV_UP flag is set. In such a case the interrupt is ignored. ENA device operates in auto-masked mode, therefore ignoring interrupt leaves it masked for good. Moving unmask of interrupt to be the last step in ena_up(). Signed-off-by: Netanel Belgazal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03cxgb4: Fix FW flash errorsArjun Vynipadath2-10/+8
commit 96ac18f14a5a ("cxgb4: Add support for new flash parts") removed initialization of adapter->params.sf_fw_start causing issues while flashing firmware to card. We no longer need sf_fw_start in adapter->params as we already have macros defined for FW flash addresses. Fixes: 96ac18f14a5a ("cxgb4: Add support for new flash parts") Signed-off-by: Arjun Vynipadath <[email protected]> Signed-off-by: Casey Leedom <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03i40e: flower: Fix return value for unsupported offloadJiri Pirko1-1/+1
When filter configuration is not supported, drivers should return -EOPNOTSUPP so the core can react correctly. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Jiri Pirko <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-01-03i40e: don't remove netdev->dev_addr when syncing uc listJacob Keller1-1/+16
In some circumstances, such as with bridging, it is possible that the stack will add a devices own MAC address to its unicast address list. If, later, the stack deletes this address, then the i40e driver will receive a request to remove this address. The driver stores its current MAC address as part of the MAC/VLAN hash array, since it is convenient and matches exactly how the hardware expects to be told which traffic to receive. This causes a problem, since for more devices, the MAC address is stored separately, and requests to delete a unicast address should not have the ability to remove the filter for the MAC address. Fix this by forcing a check on every address sync to ensure we do not remove the device address. There is a very narrow possibility of a race between .set_mac and .set_rx_mode, if we don't change netdev->dev_addr before updating our internal MAC list in .set_mac. This might be possible if .set_rx_mode is going to remove MAC "XYZ" from the list, at the same time as .set_mac changes our dev_addr to MAC "XYZ", we might possibly queue a delete, then an add in .set_mac, then queue a delete in .set_rx_mode's dev_uc_sync and then update netdev->dev_addr. We can avoid this by moving the copy into dev_addr prior to the changes to the MAC filter list. A similar race on the other side does not cause problems, as if we're changing our MAC form A to B, and we race with .set_rx_mode, it could queue a delete from A, we'd update our address, and allow the delete. This seems like a race, but in reality we're about to queue a delete of A anyways, so it would not cause any issues. A race in the initialization code is unlikely because the netdevice has not yet been fully initialized and the stack should not be adding or removing addresses yet. Note that we don't (yet) need similar code for the VF driver because it does not make use of __dev_uc_sync and __dev_mc_sync, but instead roles its own method for handling updates to the MAC/VLAN list, which already has code to protect against removal of the hardware address. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-01-03i40e/i40evf: Account for frags split over multiple descriptors in check ↵Alexander Duyck2-6/+46
linearize The original code for __i40e_chk_linearize didn't take into account the fact that if a fragment is 16K in size or larger it has to be split over 2 descriptors and the smaller of those 2 descriptors will be on the trailing edge of the transmit. As a result we can get into situations where we didn't catch requests that could result in a Tx hang. This patch takes care of that by subtracting the length of all but the trailing edge of the stale fragment before we test for sum. By doing this we can guarantee that we have all cases covered, including the case of a fragment that spans multiple descriptors. We don't need to worry about checking the inner portions of this since 12K is the maximum aligned DMA size and that is larger than any MSS will ever be since the MTU limit for jumbos is something on the order of 9K. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-01-03Merge branch 'fec-clean-up-in-the-cases-of-probe-error'David S. Miller1-0/+5
Fugang Duan says: ==================== net: fec: clean up in the cases of probe error The simple patches just clean up in the cases of probe error like restore dev_id and handle the defer probe when regulator is still not ready. v2: * Fabio Estevam's comment to suggest split v1 to separate patches. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-01-03net: fec: defer probe if regulator is not readyFugang Duan1-0/+4
Defer probe if regulator is not ready. E.g. some regulator is fixed regulator controlled by i2c expander gpio, the i2c device may be probed after the driver, then it should handle the case of defer probe error. Signed-off-by: Fugang Duan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03net: fec: restore dev_id in the cases of probe errorFugang Duan1-0/+1
The static variable dev_id always plus one before netdev registerred. It should restore the dev_id value in the cases of probe error. Signed-off-by: Fugang Duan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03i40e: Remove UDP support for big bufferAmritha Nambiar1-9/+9
Since UDP based filters are not supported via big buffer cloud filters, remove UDP support. Also change a few return types to indicate unsupported vs invalid configuration. Signed-off-by: Amritha Nambiar <[email protected]> Acked-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-01-03vxlan: trivial indenting fix.William Tu1-1/+1
Fix indentation of reserved_flags2 field in vxlanhdr_gpe. Fixes: e1e5314de08b ("vxlan: implement GPE") Signed-off-by: William Tu <[email protected]> Acked-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03sctp: fix error path in sctp_stream_initMarcelo Ricardo Leitner1-12/+10
syzbot noticed a NULL pointer dereference panic in sctp_stream_free() which was caused by an incomplete error handling in sctp_stream_init(). By not clearing stream->outcnt, it made a for() in sctp_stream_free() think that it had elements to free, but not, leading to the panic. As suggested by Xin Long, this patch also simplifies the error path by moving it to the only if() that uses it. See-also: https://www.spinics.net/lists/netdev/msg473756.html See-also: https://www.spinics.net/lists/netdev/msg465024.html Reported-by: syzbot <[email protected]> Fixes: f952be79cebd ("sctp: introduce struct sctp_stream_out_ext") Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Reviewed-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03Merge branch '1GbE' of ↵David S. Miller3-9/+32
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-01-02 This series contains fixes for e1000 and e1000e. Tushar Dave adds a check to the driver so that it won't attempt to disable a device that is already disabled for e1000. Benjamin Poirier provides a fix to e1000e, where a previous commit that Benjamin submitted changed the meaning of the return value for "check_for_link" for copper media and not all the instances were properly updated. Benjamin fixes the remaining instances that needed the change. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-01-03RDS: Heap OOB write in rds_message_alloc_sgs()Mohamed Ghannam1-0/+3
When args->nr_local is 0, nr_pages gets also 0 due some size calculation via rds_rm_size(), which is later used to allocate pages for DMA, this bug produces a heap Out-Of-Bound write access to a specific memory region. Signed-off-by: Mohamed Ghannam <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-03uapi libc compat: add fallback for unsupported libcsFelix Janda1-1/+54
libc-compat.h aims to prevent symbol collisions between uapi and libc headers for each supported libc. This requires continuous coordination between them. The goal of this commit is to improve the situation for libcs (such as musl) which are not yet supported and/or do not wish to be explicitly supported, while not affecting supported libcs. More precisely, with this commit, unsupported libcs can request the suppression of any specific uapi definition by defining the correspondings _UAPI_DEF_* macro as 0. This can fix symbol collisions for them, as long as the libc headers are included before the uapi headers. Inclusion in the other order is outside the scope of this commit. All infrastructure in order to enable this fallback for unsupported libcs is already in place, except that libc-compat.h unconditionally defines all _UAPI_DEF_* macros to 1 for all unsupported libcs so that any previous definitions are ignored. In order to fix this, this commit merely makes these definitions conditional. This commit together with the musl libc commit http://git.musl-libc.org/cgit/musl/commit/?id=04983f2272382af92eb8f8838964ff944fbb8258 fixes for example the following compiler errors when <linux/in6.h> is included after musl's <netinet/in.h>: ./linux/in6.h:32:8: error: redefinition of 'struct in6_addr' ./linux/in6.h:49:8: error: redefinition of 'struct sockaddr_in6' ./linux/in6.h:59:8: error: redefinition of 'struct ipv6_mreq' The comments referencing glibc are still correct, but this file is not only used for glibc any more. Signed-off-by: Felix Janda <[email protected]> Reviewed-by: Hauke Mehrtens <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02tipc: fix problems with multipoint-to-point flow controlJon Maloy1-4/+18
In commit 04d7b574b245 ("tipc: add multipoint-to-point flow control") we introduced a protocol for preventing buffer overflow when many group members try to simultaneously send messages to the same receiving member. Stress test of this mechanism has revealed a couple of related bugs: - When the receiving member receives an advertisement REMIT message from one of the senders, it will sometimes prematurely activate a pending member and send it the remitted advertisement, although the upper limit for active senders has been reached. This leads to accumulation of illegal advertisements, and eventually to messages being dropped because of receive buffer overflow. - When the receiving member leaves REMITTED state while a received message is being read, we miss to look at the pending queue, to activate the oldest pending peer. This leads to some pending senders being starved out, and never getting the opportunity to profit from the remitted advertisement. We fix the former in the function tipc_group_proto_rcv() by returning directly from the function once it becomes clear that the remitting peer cannot leave REMITTED state at that point. We fix the latter in the function tipc_group_update_rcv_win() by looking up and activate the longest pending peer when it becomes clear that the remitting peer now can leave REMITTED state. Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02ethtool: do not print warning for applications using legacy APIStephen Hemminger1-13/+2
In kernel log ths message appears on every boot: "warning: `NetworkChangeNo' uses legacy ethtool link settings API, link modes are only partially reported" When ethtool link settings API changed, it started complaining about usages of old API. Ironically, the original patch was from google but the application using the legacy API is chrome. Linux ABI is fixed as much as possible. The kernel must not break it and should not complain about applications using legacy API's. This patch just removes the warning since using legacy API's in Linux is perfectly acceptable. Fixes: 3f1ac7a700d0 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API") Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David Decotigny <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02NET: usb: qmi_wwan: add support for YUGA CLM920-NC5 PID 0x9625SZ Lin (林上智)1-0/+1
This patch adds support for PID 0x9625 of YUGA CLM920-NC5. YUGA CLM920-NC5 needs to enable QMI_WWAN_QUIRK_DTR before QMI operation. qmicli -d /dev/cdc-wdm0 -p --dms-get-revision [/dev/cdc-wdm0] Device revision retrieved: Revision: 'CLM920_NC5-V1 1 [Oct 23 2016 19:00:00]' Signed-off-by: SZ Lin (林上智) <[email protected]> Acked-by: Bjørn Mork <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02e1000e: Fix e1000_check_for_copper_link_ich8lan return value.Benjamin Poirier1-3/+8
e1000e_check_for_copper_link() and e1000_check_for_copper_link_ich8lan() are the two functions that may be assigned to mac.ops.check_for_link when phy.media_type == e1000_media_type_copper. Commit 19110cfbb34d ("e1000e: Separate signaling for link check/link up") changed the meaning of the return value of check_for_link for copper media but only adjusted the first function. This patch adjusts the second function likewise. Reported-by: Christian Hesse <[email protected]> Reported-by: Gabriel C <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=198047 Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up") Signed-off-by: Benjamin Poirier <[email protected]> Tested-by: Aaron Brown <[email protected]> Tested-by: Christian Hesse <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-01-02e1000: fix disabling already-disabled warningTushar Dave2-6/+24
This patch adds check so that driver does not disable already disabled device. [ 44.637743] advantechwdt: Unexpected close, not stopping watchdog! [ 44.997548] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input6 [ 45.013419] e1000 0000:00:03.0: disabling already-disabled device [ 45.013447] ------------[ cut here ]------------ [ 45.014868] WARNING: CPU: 1 PID: 71 at drivers/pci/pci.c:1641 pci_disable_device+0xa1/0x105: pci_disable_device at drivers/pci/pci.c:1640 [ 45.016171] CPU: 1 PID: 71 Comm: rcu_perf_shutdo Not tainted 4.14.0-01330-g3c07399 #1 [ 45.017197] task: ffff88011bee9e40 task.stack: ffffc90000860000 [ 45.017987] RIP: 0010:pci_disable_device+0xa1/0x105: pci_disable_device at drivers/pci/pci.c:1640 [ 45.018603] RSP: 0000:ffffc90000863e30 EFLAGS: 00010286 [ 45.019282] RAX: 0000000000000035 RBX: ffff88013a230008 RCX: 0000000000000000 [ 45.020182] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000203 [ 45.021084] RBP: ffff88013a3f31e8 R08: 0000000000000001 R09: 0000000000000000 [ 45.021986] R10: ffffffff827ec29c R11: 0000000000000002 R12: 0000000000000001 [ 45.022946] R13: ffff88013a230008 R14: ffff880117802b20 R15: ffffc90000863e8f [ 45.023842] FS: 0000000000000000(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000 [ 45.024863] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.025583] CR2: ffffc900006d4000 CR3: 000000000220f000 CR4: 00000000000006a0 [ 45.026478] Call Trace: [ 45.026811] __e1000_shutdown+0x1d4/0x1e2: __e1000_shutdown at drivers/net/ethernet/intel/e1000/e1000_main.c:5162 [ 45.027344] ? rcu_perf_cleanup+0x2a1/0x2a1: rcu_perf_shutdown at kernel/rcu/rcuperf.c:627 [ 45.027883] e1000_shutdown+0x14/0x3a: e1000_shutdown at drivers/net/ethernet/intel/e1000/e1000_main.c:5235 [ 45.028351] device_shutdown+0x110/0x1aa: device_shutdown at drivers/base/core.c:2807 [ 45.028858] kernel_power_off+0x31/0x64: kernel_power_off at kernel/reboot.c:260 [ 45.029343] rcu_perf_shutdown+0x9b/0xa7: rcu_perf_shutdown at kernel/rcu/rcuperf.c:637 [ 45.029852] ? __wake_up_common_lock+0xa2/0xa2: autoremove_wake_function at kernel/sched/wait.c:376 [ 45.030414] kthread+0x126/0x12e: kthread at kernel/kthread.c:233 [ 45.030834] ? __kthread_bind_mask+0x8e/0x8e: kthread at kernel/kthread.c:190 [ 45.031399] ? ret_from_fork+0x1f/0x30: ret_from_fork at arch/x86/entry/entry_64.S:443 [ 45.031883] ? kernel_init+0xa/0xf5: kernel_init at init/main.c:997 [ 45.032325] ret_from_fork+0x1f/0x30: ret_from_fork at arch/x86/entry/entry_64.S:443 [ 45.032777] Code: 00 48 85 ed 75 07 48 8b ab a8 00 00 00 48 8d bb 98 00 00 00 e8 aa d1 11 00 48 89 ea 48 89 c6 48 c7 c7 d8 e4 0b 82 e8 55 7d da ff <0f> ff b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 f0 b1 61 82 [ 45.035222] ---[ end trace c257137b1b1976ef ]--- [ 45.037838] ACPI: Preparing to enter system sleep state S5 Signed-off-by: Tushar Dave <[email protected]> Tested-by: Fengguang Wu <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-01-02sfp: fix sfp-bus oops when removing socket/upstreamRussell King1-2/+4
When we remove a socket or upstream, and the other side isn't registered, we dereference a NULL pointer, causing a kernel oops. Fix this. Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages") Signed-off-by: Russell King <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02phylink: ensure we report link down when LOS assertedRussell King1-2/+1
Although we disable the netdev carrier, we fail to report in the kernel log that the link went down. Fix this. Fixes: 9525ae83959b ("phylink: add phylink infrastructure") Signed-off-by: Russell King <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02macvlan: Fix one possible double freeGao Feng1-1/+6
Because the macvlan_uninit would free the macvlan port, so there is one double free case in macvlan_common_newlink. When the macvlan port is just created, then register_netdevice or netdev_upper_dev_link failed and they would invoke macvlan_uninit. Then it would reach the macvlan_port_destroy which triggers the double free. Signed-off-by: Gao Feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02net/sched: Fix update of lastuse in act modules implementing stats_updateRoi Dayan2-2/+2
We need to update lastuse to to the most updated value between what is already set and the new value. If HW matching fails, i.e. because of an issue, the stats are not updated but it could be that software did match and updated lastuse. Fixes: 5712bf9c5c30 ("net/sched: act_mirred: Use passed lastuse argument") Fixes: 9fea47d93bcc ("net/sched: act_gact: Update statistics when offloaded to hardware") Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02mlxsw: spectrum: Relax sanity checks during enslavementIdo Schimmel3-2/+17
Since commit 25cc72a33835 ("mlxsw: spectrum: Forbid linking to devices that have uppers") the driver forbids enslavement to netdevs that already have uppers of their own, as this can result in various ordering problems. This requirement proved to be too strict for some users who need to be able to enslave ports to a bridge that already has uppers. In this case, we can allow the enslavement if the bridge is already known to us, as any configuration performed on top of the bridge was already reflected to the device. Fixes: 25cc72a33835 ("mlxsw: spectrum: Forbid linking to devices that have uppers") Signed-off-by: Ido Schimmel <[email protected]> Reported-by: Alexander Petrovskiy <[email protected]> Tested-by: Alexander Petrovskiy <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02mlxsw: spectrum_router: Fix NULL pointer derefIdo Schimmel1-1/+1
When we remove the neighbour associated with a nexthop we should always refuse to write the nexthop to the adjacency table. Regardless if it is already present in the table or not. Otherwise, we risk dereferencing the NULL pointer that was set instead of the neighbour. Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing") Signed-off-by: Ido Schimmel <[email protected]> Reported-by: Alexander Petrovskiy <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02ip6_tunnel: allow ip6gre dev mtu to be set below 1280Xin Long1-3/+3
Commit 582442d6d5bc ("ipv6: Allow the MTU of ipip6 tunnel to be set below 1280") fixed a mtu setting issue. It works for ipip6 tunnel. But ip6gre dev updates the mtu also with ip6_tnl_change_mtu. Since the inner packet over ip6gre can be ipv4 and it's mtu should also be allowed to set below 1280, the same issue also exists on ip6gre. This patch is to fix it by simply changing to check if parms.proto is IPPROTO_IPV6 in ip6_tnl_change_mtu instead, to make ip6gre to go to 'else' branch. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02geneve: update skb dst pmtu on tx pathXin Long1-0/+14
Commit a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") has fixed a performance issue caused by the change of lower dev's mtu for vxlan. The same thing needs to be done for geneve as well. Note that geneve cannot adjust it's mtu according to lower dev's mtu when creating it. The performance is very low later when netperfing over it without fixing the mtu manually. This patch could also avoid this issue. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02ip6_tunnel: disable dst caching if tunnel is dual-stackEli Cooper1-4/+5
When an ip6_tunnel is in mode 'any', where the transport layer protocol can be either 4 or 41, dst_cache must be disabled. This is because xfrm policies might apply to only one of the two protocols. Caching dst would cause xfrm policies for one protocol incorrectly used for the other. Signed-off-by: Eli Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-01-02Revert "net: core: dev_get_valid_name is now the same as dev_alloc_name_ns"David S. Miller1-1/+13
This reverts commit 87c320e51519a83c496ab7bfb4e96c8f9c001e89. Changing the error return code in some situations turns out to be harmful in practice. In particular Michael Ellerman reports that DHCP fails on his powerpc machines, and this revert gets things working again. Johannes Berg agrees that this revert is the best course of action for now. Fixes: 029b6d140550 ("Revert "net: core: maybe return -EEXIST in __dev_alloc_name"") Reported-by: Michael Ellerman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-12-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds30-121/+298
Pull networking fixes from David Miller: 1) IPv6 gre tunnels end up with different default features enabled depending upon whether netlink or ioctls are used to bring them up. Fix from Alexey Kodanev. 2) Fix read past end of user control message in RDS< from Avinash Repaka. 3) Missing RCU barrier in mini qdisc code, from Cong Wang. 4) Missing policy put when reusing per-cpu route entries, from Florian Westphal. 5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G. Piccoli. 6) Run nested transport mode IPSEC packets via tasklet, from Herbert Xu. 7) Fix handling poll() for stream sockets in tipc, from Parthasarathy Bhuvaragan. 8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert. 9) Another zerocopy ubuf handling fix, from Willem de Bruijn. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) strparser: Call sock_owned_by_user_nocheck sock: Add sock_owned_by_user_nocheck skbuff: in skb_copy_ubufs unclone before releasing zerocopy tipc: fix hanging poll() for stream sockets sctp: Replace use of sockets_allocated with specified macro. bnx2x: Improve reliability in case of nested PCI errors tg3: Enable PHY reset in MTU change path for 5720 tg3: Add workaround to restrict 5762 MRRS to 2048 tg3: Update copyright net: fec: unmap the xmit buffer that are not transferred by DMA tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path tipc: error path leak fixes in tipc_enable_bearer() RDS: Check cmsg_len before dereferencing CMSG_DATA tcp: Avoid preprocessor directives in tracepoint macro args tipc: fix memory leak of group member when peer node is lost net: sched: fix possible null pointer deref in tcf_block_put tipc: base group replicast ack counter on number of actual receivers net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap() net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround ip6_gre: fix device features for ioctl setup ...
2017-12-28Merge tag 'drm-fixes-for-v4.15-rc6' of ↵Linus Torvalds3-4/+5
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "nouveau and i915 regression fixes" * tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux: drm/nouveau: fix race when adding delayed work items i915: Reject CCS modifiers for pipe C on Geminilake drm/i915/gvt: Fix pipe A enable as default for vgpu
2017-12-28Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fix from Stephen Boyd: "One more fix for the runtime PM clk patches. We're calling a runtime PM API that may schedule from somewhere that we can't do that. We change to the async version of pm_runtime_put() to fix it" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: use atomic runtime pm api in clk_core_is_enabled
2017-12-28Merge tag 'led_fixes_for_4.15-rc6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fix from Jacek Anaszewski: "A single LED fix for brightness setting when delay_off is 0" * tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: led: core: Fix brightness setting when setting delay_off=0
2017-12-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds22-105/+208
Pull rdma fixes from Jason Gunthorpe: "This is the next batch of for-rc patches from RDMA. It includes the fix for the ipoib regression I mentioned last time, and the result of a fairly major debugging effort to get iser working reliably on cxgb4 hardware - it turns out the cxgb4 driver was not handling QP error flushing properly causing iser to fail. - cxgb4 fix for an iser testing failure as debugged by Steve and Sagi. The problem was a driver bug in the handling of shutting down a QP. - Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on error unwind and a use after free bug - Improper congestion counter values on mlx5 when link aggregation is enabled - ipoib lockdep regression introduced in this merge window - hfi1 regression supporting the device in a VM introduced in a recent patch - Typo that breaks future uAPI compatibility in the verbs core - More SELinux related oops fixing - Fix an oops during error unwind in mlx5" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/mlx5: Fix mlx5_ib_alloc_mr error flow IB/core: Verify that QP is security enabled in create and destroy IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp() IB/mlx5: Serialize access to the VMA list IB/hfi: Only read capability registers if the capability exists IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush IB/mlx5: Fix congestion counters in LAG mode RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path iw_cxgb4: when flushing, complete all wrs in a chain iw_cxgb4: reflect the original WR opcode in drain cqes iw_cxgb4: Only validate the MSN for successful completions
2017-12-28Merge branch 'strparser-Fix-lockdep-issue'David S. Miller2-1/+6
Tom Herbert says: ==================== strparser: Fix lockdep issue When sock_owned_by_user returns true in strparser. Fix is to add and call sock_owned_by_user_nocheck since the check for owned by user is not an error condition in this case. ==================== Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <[email protected]> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <[email protected]>
2017-12-28strparser: Call sock_owned_by_user_nocheckTom Herbert1-1/+1
strparser wants to check socket ownership without producing any warnings. As indicated by the comment in the code, it is permissible for owned_by_user to return true. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <[email protected]> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-12-28sock: Add sock_owned_by_user_nocheckTom Herbert1-0/+5
This allows checking socket lock ownership with producing lockdep warnings. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-12-28skbuff: in skb_copy_ubufs unclone before releasing zerocopyWillem de Bruijn1-3/+3
skb_copy_ubufs must unclone before it is safe to modify its skb_shared_info with skb_zcopy_clear. Commit b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") ensures that all skbs release their zerocopy state, even those without frags. But I forgot an edge case where such an skb arrives that is cloned. The stack does not build such packets. Vhost/tun skbs have their frags orphaned before cloning. TCP skbs only attach zerocopy state when a frag is added. But if TCP packets can be trimmed or linearized, this might occur. Tracing the code I found no instance so far (e.g., skb_linearize ends up calling skb_zcopy_clear if !skb->data_len). Still, it is non-obvious that no path exists. And it is fragile to rely on this. Fixes: b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-12-28tipc: fix hanging poll() for stream socketsParthasarathy Bhuvaragan1-1/+1
In commit 42b531de17d2f6 ("tipc: Fix missing connection request handling"), we replaced unconditional wakeup() with condtional wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND. This breaks the applications which do a connect followed by poll with POLLOUT flag. These applications are not woken when the connection is ESTABLISHED and hence sleep forever. In this commit, we fix it by including the POLLOUT event for sockets in TIPC_CONNECTING state. Fixes: 42b531de17d2f6 ("tipc: Fix missing connection request handling") Acked-by: Jon Maloy <[email protected]> Signed-off-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-12-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller3-4/+8
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-28 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Two small fixes for bpftool. Fix otherwise broken output if any of the system calls failed when listing maps in json format and instead of bailing out, skip maps or progs that disappeared between fetching next id and getting an fd for that id, both from Jakub. 2) Small fix in BPF selftests to respect LLC passed from command line when testing for -mcpu=probe presence, from Quentin. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-12-27IB/mlx5: Fix mlx5_ib_alloc_mr error flowNitzan Carmi1-0/+1
ibmr.device is being set only after ib_alloc_mr() is (successfully) complete. Therefore, in case mlx5_core_create_mkey() return with error, the error flow calls mlx5_free_priv_descs() which uses ibmr.device (which doesn't exist yet), causing a NULL dereference oops. To fix this, the IB device should be set in the mr struct earlier stage (e.g. prior to calling mlx5_core_create_mkey()). Fixes: 8a187ee52b04 ("IB/mlx5: Support the new memory registration API") Signed-off-by: Max Gurtovoy <[email protected]> Signed-off-by: Nitzan Carmi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2017-12-27IB/core: Verify that QP is security enabled in create and destroyMoni Shoua2-1/+5
The XRC target QP create flow sets up qp_sec only if there is an IB link with LSM security enabled. However, several other related uAPI entry points blindly follow the qp_sec NULL pointer, resulting in a possible oops. Check for NULL before using qp_sec. Cc: <[email protected]> # v4.12 Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs") Reviewed-by: Daniel Jurgens <[email protected]> Signed-off-by: Moni Shoua <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2017-12-27IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()Moni Shoua1-2/+2
If the input command length is larger than the kernel supports an error should be returned in case the unsupported bytes are not cleared, instead of the other way aroudn. This matches what all other callers of ib_is_udata_cleared do and will avoid user ABI problems in the future. Cc: <[email protected]> # v4.10 Fixes: 189aba99e700 ("IB/uverbs: Extend modify_qp and support packet pacing") Reviewed-by: Yishai Hadas <[email protected]> Signed-off-by: Moni Shoua <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2017-12-27IB/mlx5: Serialize access to the VMA listMajd Dibbiny2-0/+12
User-space applications can do mmap and munmap directly at any time. Since the VMA list is not protected with a mutex, concurrent accesses to the VMA list from the mmap and munmap can cause data corruption. Add a mutex around the list. Cc: <[email protected]> # v4.7 Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API") Reviewed-by: Yishai Hadas <[email protected]> Signed-off-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2017-12-27Merge tag 'trace-v4.15-rc4' of ↵Linus Torvalds2-10/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "While doing tests on tracing over the network, I found that the packets were getting corrupted. In the process I found three bugs. One was the culprit, but the other two scared me. After deeper investigation, they were not as major as I thought they were, due to a signed compared to an unsigned that prevented a negative number from doing actual harm. The two bigger bugs: - Mask the ring buffer data page length. There are data flags at the high bits of the length field. These were not cleared via the length function, and the length could return a negative number. (Although the number returned was unsigned, but was assigned to a signed number) Luckily, this value was compared to PAGE_SIZE which is unsigned and kept it from entering the path that could have caused damage. - Check the page usage before reusing the ring buffer reader page. TCP increments the page ref when passing the page off to the network. The page is passed back to the ring buffer for use on free. But the page could still be in use by the TCP stack. Minor bugs: - Related to the first bug. No need to clear out the unused ring buffer data before sending to user space. It is now done by the ring buffer code itself. - Reset pointers after free on error path. There were some cases in the error path that pointers were freed but not set to NULL, and could have them freed again, having a pointer freed twice" * tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix possible double free on failure of allocating trace buffer tracing: Fix crash when it fails to alloc ring buffer ring-buffer: Do no reuse reader page if still in use tracing: Remove extra zeroing out of the ring buffer page ring-buffer: Mask out the info bits when returning buffer page length