aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/macvlan.c
AgeCommit message (Collapse)AuthorFilesLines
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.changelinkMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.newlinkMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-22macvlan: Let passthru macvlan correctly restore lower mac addressVlad Yasevich1-3/+44
Passthru macvlans directly change the mac address of the lower level device. That's OK, but after the macvlan is deleted, the lower device is left with changed address and one needs to reboot to bring back the origina HW addresses. This scenario is actually quite common with passthru macvtap devices. This patch attempts to solve this, by storing the mac address of the lower device in macvlan_port structure and keeping track of it through the changes. After this patch, any changes to the lower device mac address done trough the macvlan device, will be reverted back. Any changs done directly to the lower device mac address will be kept. Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-22macvlan: convert port passthru to flags.Vlad Yasevich1-13/+24
Convert the port passthru boolean into flags with accesor functions. Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-22macvlan: Fix passthru macvlan mac address inheritanceVlad Yasevich1-1/+2
When a lower device of the passthru macvlan changes it's address, passthru macvlan is supposed to change it's own address as well. However, that doesn't happen correctly because the check in macvlan_addr_busy() will catch the fact that the lower level (port) mac address is the same as the address we are trying to assign to the macvlan, and return an error. As a reasult, the address of the passthru macvlan device is never changed. The same thing happens when the user attempts to change the mac address of the passthru macvlan. The simple solution appers to be to not check against the lower device in case of passthru macvlan device, since the 2 addresses are _supposed_ to be the same. Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-22macvlan: Do not return error when setting the same mac addressVlad Yasevich1-0/+4
The user currently gets an EBUSY error when attempting to set the mac address on a macvlan device to the same value. This should really be a no-op as nothing changes. Catch the condition and return early. Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <[email protected]>
2017-06-14macvlan: propagate the mac address change status for lowerdevZhang Shengju1-4/+2
The macvlan dev should propagate the return value of mac address change for lower device in the passthru mode, instead of always return 0. Signed-off-by: Zhang Shengju <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-07net: Fix inconsistent teardown and release of private netdev state.David S. Miller1-1/+1
Network devices can allocate reasources and private memory using netdev_ops->ndo_init(). However, the release of these resources can occur in one of two different places. Either netdev_ops->ndo_uninit() or netdev->destructor(). The decision of which operation frees the resources depends upon whether it is necessary for all netdev refs to be released before it is safe to perform the freeing. netdev_ops->ndo_uninit() presumably can occur right after the NETDEV_UNREGISTER notifier completes and the unicast and multicast address lists are flushed. netdev->destructor(), on the other hand, does not run until the netdev references all go away. Further complicating the situation is that netdev->destructor() almost universally does also a free_netdev(). This creates a problem for the logic in register_netdevice(). Because all callers of register_netdevice() manage the freeing of the netdev, and invoke free_netdev(dev) if register_netdevice() fails. If netdev_ops->ndo_init() succeeds, but something else fails inside of register_netdevice(), it does call ndo_ops->ndo_uninit(). But it is not able to invoke netdev->destructor(). This is because netdev->destructor() will do a free_netdev() and then the caller of register_netdevice() will do the same. However, this means that the resources that would normally be released by netdev->destructor() will not be. Over the years drivers have added local hacks to deal with this, by invoking their destructor parts by hand when register_netdevice() fails. Many drivers do not try to deal with this, and instead we have leaks. Let's close this hole by formalizing the distinction between what private things need to be freed up by netdev->destructor() and whether the driver needs unregister_netdevice() to perform the free_netdev(). netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Now, register_netdevice() can sanely release all resources after ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit() and netdev->priv_destructor(). And at the end of unregister_netdevice(), we invoke netdev->priv_destructor() and optionally call free_netdev(). Signed-off-by: David S. Miller <[email protected]>
2017-05-15macvlan: Fix performance issues with vlan tagged packetsVlad Yasevich1-2/+5
Macvlan always turns on offload features that have sofware fallback (NETIF_GSO_SOFTWARE). This allows much higher guest-guest communications over macvtap. However, macvtap does not turn on these features for vlan tagged traffic. As a result, depending on the HW that mactap is configured on, the performance of guest-guest communication over a vlan is very inconsistent. If the HW supports TSO/UFO over vlans, then the performance will be fine. If not, the the performance will suffer greatly since the VM may continue using TSO/UFO, and will force the host segment the traffic and possibly overlow the macvtap queue. This patch adds the always on offloads to vlan_features. This makes sure that any vlan tagged traffic between 2 guest will not be segmented needlessly. Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-25macvlan: Fix device ref leak when purging bc_queueHerbert Xu1-1/+10
When a parent macvlan device is destroyed we end up purging its broadcast queue without dropping the device reference count on the packet source device. This causes the source device to linger. This patch drops that reference count. Fixes: 260916dfb48c ("macvlan: Fix potential use-after free for...") Reported-by: Joe Ghalam <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-02-11tap: Abstract type of virtual interface from tap implementationSainath Grandhi1-1/+1
macvlan object is re-structured to hold tap related elements in a separate entity, tap_dev. Upon NETDEV_REGISTER device_event, tap_dev is registered with idr and fetched again on tap_open. Few of the tap functions are modified to accepted tap_dev as argument. tap_dev object includes callbacks to be used by underlying virtual interface to take care of tx and rx accounting. Signed-off-by: Sainath Grandhi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-01-20macvlan: use netdev_is_rx_handler_busy instead of checking specific typeMahesh Bandewar1-1/+1
netdev_is_rx_handler_busy() check is a superset of netif_is_ipvlan_port() check and hence should be preferred. Signed-off-by: Mahesh Bandewar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-01-08net: make ndo_get_stats64 a void functionstephen hemminger1-3/+2
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-12-07driver: macvlan: Remove the rcu member of macvlan_portGao Feng1-2/+1
When free macvlan_port in macvlan_port_destroy, it is safe to free directly because netdev_rx_handler_unregister could enforce one grace period. So it is unnecessary to use kfree_rcu for macvlan_port. Signed-off-by: Gao Feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-11-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+2
udplite conflict is resolved by taking what 'net-next' did which removed the backlog receive method assignment, since it is no longer necessary. Two entries were added to the non-priv ethtool operations switch statement, one in 'net' and one in 'net-next, so simple overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2016-11-23driver: macvlan: Check if need rollback multicast setting in macvlan_openGao Feng1-1/+2
When dev_set_promiscuity failed in macvlan_open, it always invokes dev_set_allmulti without checking if necessary. Now check the IFF_ALLMULTI flag firstly before rollback the multicast setting in the error handler. Signed-off-by: Gao Feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-11-21driver: macvlan: Remove duplicated IFF_UP condition check in ↵Gao Feng1-2/+1
macvlan_forward_source The function macvlan_forward_source_one has already checked the flag IFF_UP, so needn't check it outside in macvlan_forward_source too. Signed-off-by: Gao Feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-11-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-9/+22
Several cases of bug fixes in 'net' overlapping other changes in 'net-next-. Signed-off-by: David S. Miller <[email protected]>
2016-11-14driver: macvlan: Replace integer number with bool valueGao Feng1-5/+5
The return value of function macvlan_addr_busy is used as bool value, so use bool value instead of integer number "1" and "0". Signed-off-by: Gao Feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-11-09driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.Gao Feng1-9/+22
When there is no existing macvlan port in lowdev, one new macvlan port would be created. But it doesn't be destoried when something failed later. It casues some memleak. Now add one flag to indicate if new macvlan port is created. Signed-off-by: Gao Feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-10-20net: use core MTU range checking in core net infraJarod Wilson1-1/+7
geneve: - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu - This one isn't quite as straight-forward as others, could use some closer inspection and testing macvlan: - set min/max_mtu tun: - set min/max_mtu, remove tun_net_change_mtu vxlan: - Merge __vxlan_change_mtu back into vxlan_change_mtu - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function - This one is also not as straight-forward and could use closer inspection and testing from vxlan folks bridge: - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function openvswitch: - set min/max_mtu, remove internal_dev_change_mtu - note: max_mtu wasn't checked previously, it's been set to 65535, which is the largest possible size supported sch_teql: - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535) macsec: - min_mtu = 0, max_mtu = 65535 macvlan: - min_mtu = 0, max_mtu = 65535 ntb_netdev: - min_mtu = 0, max_mtu = 65535 veth: - min_mtu = 68, max_mtu = 65535 8021q: - min_mtu = 0, max_mtu = 65535 CC: [email protected] CC: Nicolas Dichtel <[email protected]> CC: Hannes Frederic Sowa <[email protected]> CC: Tom Herbert <[email protected]> CC: Daniel Borkmann <[email protected]> CC: Alexander Duyck <[email protected]> CC: Paolo Abeni <[email protected]> CC: Jiri Benc <[email protected]> CC: WANG Cong <[email protected]> CC: Roopa Prabhu <[email protected]> CC: Pravin B Shelar <[email protected]> CC: Sabrina Dubroca <[email protected]> CC: Patrick McHardy <[email protected]> CC: Stephen Hemminger <[email protected]> CC: Pravin Shelar <[email protected]> CC: Maxim Krasnyansky <[email protected]> Signed-off-by: Jarod Wilson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-13net: remove type_check from dev_get_nest_level()Sabrina Dubroca1-1/+1
The idea for type_check in dev_get_nest_level() was to count the number of nested devices of the same type (currently, only macvlan or vlan devices). This prevented the false positive lockdep warning on configurations such as: eth0 <--- macvlan0 <--- vlan0 <--- macvlan1 However, this doesn't prevent a warning on a configuration such as: eth0 <--- macvlan0 <--- vlan0 eth1 <--- vlan1 <--- macvlan1 In this case, all the locks end up with a nesting subclass of 1, so lockdep thinks that there is still a deadlock: - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then take (vlan_netdev_xmit_lock_key, 1) - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then take (macvlan_netdev_addr_lock_key, 1) By removing the linktype check in dev_get_nest_level() and always incrementing the nesting depth, lockdep considers this configuration valid. Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-06-09net: macvlan: call netdev_lockdep_set_classes()Eric Dumazet1-10/+1
In case a qdisc is used on a macvlan device, we need to use different lockdep classes to avoid false positives. Use the new netdev_lockdep_set_classes() generic helper. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-06-01macvlan: Avoid unnecessary multicast cloningHerbert Xu1-6/+34
Currently we always queue a multicast packet for further processing, even if none of the macvlan devices are subscribed to the address. This patch optimises this by adding a global multicast filter for a macvlan_port. Note that this patch doesn't handle the broadcast addresses of the individual macvlan devices correctly, if they are not all identical to vlan->lowerdev. However, this is already broken because there is no mechanism in place to update the individual multicast filters when you change the broadcast address. If someone cares enough they should fix this by collecting all broadcast addresses for a macvlan as we do for multicast and unicast. Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-06-01macvlan: Fix potential use-after free for broadcastsHerbert Xu1-2/+8
When we postpone a broadcast packet we save the source port in the skb if it is local. However, the source port can disappear before we get a chance to process the packet. This patch fixes this by holding a ref count on the netdev. It also delays the skb->cb modification until after we allocate the new skb as you should not modify shared skbs. Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26macvlan: fix failure during registration v3Francesco Ruggeri1-6/+4
If macvlan_common_newlink fails in register_netdevice after macvlan_init then it decrements port->count twice, first in macvlan_uninit (from register_netdevice or rollback_registered) and then again in macvlan_common_newlink. A similar problem may exist in the ipvlan driver. This patch consolidates modifications to port->count into macvlan_init and macvlan_uninit (thanks to Eric Biederman for suggesting this approach). v3: remove macvtap specific bits. Signed-off-by: Francesco Ruggeri <[email protected]> Acked-by: "Eric W. Biederman" <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-17vlan: propagate gso_max_segsEric Dumazet1-0/+2
vlan drivers lack proper propagation of gso_max_segs from lower device. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-02-25net: macvlan: use __ethtool_get_ksettingsDavid Decotigny1-4/+4
Signed-off-by: David Decotigny <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-02-17macvlan: convert to use IFF_NO_QUEUEZhang Shengju1-1/+1
Use IFF_NO_QUEUE to indicate that a device can run without a qdisc. Signed-off-by: Zhang Shengju <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-01-29macvlan: make operstate and carrier more accurateNikolay Aleksandrov1-0/+2
Currently when a macvlan is being initialized and the lower device is netif_carrier_ok(), the macvlan device doesn't run through rfc2863_policy() and is left with UNKNOWN operstate. Fix it by adding an unconditional linkwatch event for the new macvlan device. Similar fix is already used by the 8021q device (see register_vlan_dev()). Also fix the inconsistent state when the lower device has been down and its carrier was changed (when a device is down NETDEV_CHANGE doesn't get generated). The second issue can be seen f.e. when we have a macvlan on top of a 8021q device which has been down and its real device has been changing carrier states, after setting the 8021q device up, the macvlan device will have the same carrier state as it was before even though the 8021q can now have a different state. Example for case 1: 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 $ ip l add l eth2 macvl0 type macvlan $ ip l set macvl0 up $ ip l sh macvl0 72: macvl0@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether f6:0b:54:0a:9d:a3 brd ff:ff:ff:ff:ff:ff Example for case 2 (order is important): Prestate: eth2 UP/CARRIER, vlan1 down, vlan1-macvlan down $ ip l set vlan1-macvlan up $ ip l sh vlan1-macvlan 71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff [ eth2 loses CARRIER before vlan1 has been UP-ed ] $ ip l sh eth2 4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff $ ip l sh vlan1-macvlan 71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff $ ip l set vlan1 up $ ip l sh vlan1 70: vlan1@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff $ ip l sh vlan1-macvlan 71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff vlan1-macvlan is still UP, still has carrier and is still in the same operstate as before. After the patch in case 1 macvl0 has state UP as it should and in case 2 vlan1-macvlan has state LOWERLAYERDOWN again as it should. Note that while the lower macvlan device is down their carrier and thus operstate can go out of sync but that will be fixed once the lower device goes up again. This behaviour seems to have been present since beginning of git history. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-15net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUMTom Herbert1-1/+1
These netif flags are unnecessary convolutions. It is more straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM, and NETIF_F_IPV6_CSUM directly. This patch also: - Cleans up can_checksum_protocol - Simplifies netdev_intersect_features Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-15net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASKTom Herbert1-1/+1
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the set of features for offloading all checksums. This is a mask of the checksum offload related features bits. It is incorrect to set both NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for features of a device. This patch: - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where NETIF_F_ALL_CSUM is being used as a mask). - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-11-17macvlan: fix leak in macvlan_handle_frameSabrina Dubroca1-0/+2
Reset pskb in macvlan_handle_frame in case skb_share_check returned a clone. Fixes: 8a4eb5734e8d ("net: introduce rx_handler results and logic around that") Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-10-12ipv4: Pass struct net into ip_defrag and ip_check_defragEric W. Biederman1-1/+1
The function ip_defrag is called on both the input and the output paths of the networking stack. In particular conntrack when it is tracking outbound packets from the local machine calls ip_defrag. So add a struct net parameter and stop making ip_defrag guess which network namespace it needs to defragment packets in. Signed-off-by: "Eric W. Biederman" <[email protected]> Acked-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-03macvlan: Don't segment multiple tagged packets on macvlan deviceToshiaki Makita1-0/+1
Macvlan/macvtap devices don't need to segment multiple tagged packets since the lower devices can segment them. Signed-off-by: Toshiaki Makita <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-05-04macvlan: Propagate promiscuity setting to lower devices.Vlad Yasevich1-0/+15
When a macvlan device is placed in promiscuous mode, it currently just sets it's multicast mask to permissive, but doesn't change the state of the lower device. As a result, not all multicast traffic can be received on such device. Additionally, none of a vlan traffic can be received on such device as well. This patch propagates the promiscuous mode setting to lower device so that lower device may receive all packets that macvlan may be interested in. Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-04-02macvlan: implement ndo_get_iflinkNicolas Dichtel1-1/+8
Don't use dev->iflink anymore. CC: Patrick McHardy <[email protected]> Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-03-02net: Kill dev_rebuild_headerEric W. Biederman1-1/+0
Now that there are no more users kill dev_rebuild_header and all of it's implementations. This is long overdue. Signed-off-by: "Eric W. Biederman" <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-23macvlan: advertise link netns via netlinkNicolas Dichtel1-0/+6
Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-12-09macvlan: play well with ipvlan deviceMahesh Bandewar1-0/+3
If device is already used as an ipvlan port then refuse to use it as a macvlan port at early stage of port creation. thost1:~# ip link add link eth0 ipvl0 type ipvlan thost1:~# echo $? 0 thost1:~# ip link add link eth0 mvl0 type macvlan RTNETLINK answers: Device or resource busy thost1:~# echo $? 2 thost1:~# Signed-off-by: Mahesh Bandewar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-12-09macvlan: allow setting LRO independently of lower deviceMichal Kubeček1-4/+5
Since commit fbe168ba91f7 ("net: generic dev_disable_lro() stacked device handling"), dev_disable_lro() zeroes NETIF_F_LRO feature flag first for a macvlan device and then for its lower device. As an attempt to set NETIF_F_LRO to zero is ignored, dev_disable_lro() issues a warning and taints kernel. Allowing NETIF_F_LRO to be set independently of the lower device consists of three parts: - add the flag to hw_features to allow toggling it - allow setting it to 0 even if lower device has the flag set - add the flag to MACVLAN_FEATURES to restore copying from lower device on macvlan creation Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-12-02net: make vid as a parameter for ndo_fdb_add/ndo_fdb_delJiri Pirko1-2/+2
Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple u16 vid to drivers from there. Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Andy Gospodarek <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-11-29macvlan: delay the header check for dodgy packets into lower deviceJason Wang1-2/+3
We do header check twice for a dodgy packet. One is done before macvlan_start_xmit(), another is done before lower device's ndo_start_xmit(). The first one seems redundant so this patch tries to delay header check until a packet reaches its lower device (or macvtap) through always enabling NETIF_F_GSO_ROBUST for macvlan device. Cc: Patrick McHardy <[email protected]> Signed-off-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-25macvlan: fix a race on port dismantle and possible skb leaksEric Dumazet1-2/+8
We need to cancel the work queue after rcu grace period, otherwise it can be rescheduled by incoming packets. We need to purge queue if some skbs are still in it. We can use __skb_queue_head_init() variant in macvlan_process_broadcast() Signed-off-by: Eric Dumazet <[email protected]> Fixes: 412ca1550cbec ("macvlan: Move broadcasts into a work queue") Cc: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-10macvlan: optimize the receive path[email protected]1-5/+10
The netif_rx() call on the fast path of macvlan_handle_frame() appears to be there to ensure that we properly throttle incoming packets. However, it would appear as though the proper throttling is already in place for all possible ingress paths, and that the call is redundant. If packets are arriving from the physical NIC, we've already throttled them by this point. Otherwise, if they are coming via macvlan_queue_xmit(), it calls either 'dev_forward_skb()', which ends up calling netif_rx_internal(), or else in the broadcast case, we are throttling via macvlan_broadcast_enqueue(). The test results below are from off the box to an lxc instance running macvlan. Once the tranactions/sec stop increasing, the cpu idle time has gone to 0. Results are from a quad core Intel E3-1270 [email protected] box with bnx2x 10G card. for i in {10,100,200,300,400,500}; do super_netperf $i -H $ip -t TCP_RR; done Average of 5 runs. trans/sec trans/sec (3.17-rc7-net-next) (3.17-rc7-net-next + this patch) ---------- ---------- 208101 211534 (+1.6%) 839493 850162 (+1.3%) 845071 844053 (-.12%) 816330 819623 (+.4%) 778700 789938 (+1.4%) 735984 754408 (+2.5%) Signed-off-by: Jason Baron <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-10macvlan: pass 'bool' type to macvlan_count_rx()[email protected]1-3/+3
Pass last argument to macvlan_count_rx() as the correct bool type. Signed-off-by: Jason Baron <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-07net: better IFF_XMIT_DST_RELEASE supportEric Dumazet1-1/+2
Testing xmit_more support with netperf and connected UDP sockets, I found strange dst refcount false sharing. Current handling of IFF_XMIT_DST_RELEASE is not optimal. Dropping dst in validate_xmit_skb() is certainly too late in case packet was queued by cpu X but dequeued by cpu Y The logical point to take care of drop/force is in __dev_queue_xmit() before even taking qdisc lock. As Julian Anastasov pointed out, need for skb_dst() might come from some packet schedulers or classifiers. This patch adds new helper to cleanly express needs of various drivers or qdiscs/classifiers. Drivers that need skb_dst() in their ndo_start_xmit() should call following helper in their setup instead of the prior : dev->priv_flags &= ~IFF_XMIT_DST_RELEASE; -> netif_keep_dst(dev); Instead of using a single bit, we use two bits, one being eventually rebuilt in bonding/team drivers. The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being rebuilt in bonding/team. Eventually, we could add something smarter later. Signed-off-by: Eric Dumazet <[email protected]> Cc: Julian Anastasov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-09-29macvlan: add source modeMichael Braun1-3/+301
This patch adds a new mode of operation to macvlan, called "source". It allows one to set a list of allowed mac address, which is used to match against source mac address from received frames on underlying interface. This enables creating mac based VLAN associations, instead of standard port or tag based. The feature is useful to deploy 802.1x mac based behavior, where drivers of underlying interfaces doesn't allows that. Configuration is done through the netlink interface using e.g.: ip link add link eth0 name macvlan0 type macvlan mode source ip link add link eth0 name macvlan1 type macvlan mode source ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11 ip link set link dev macvlan0 type macvlan macaddr add 00:22:22:22:22:22 ip link set link dev macvlan0 type macvlan macaddr add 00:33:33:33:33:33 ip link set link dev macvlan1 type macvlan macaddr add 00:33:33:33:33:33 ip link set link dev macvlan1 type macvlan macaddr add 00:44:44:44:44:44 This allows clients with MAC addresses 00:11:11:11:11:11, 00:22:22:22:22:22 to be part of only VLAN associated with macvlan0 interface. Clients with MAC addresses 00:44:44:44:44:44 with only VLAN associated with macvlan1 interface. And client with MAC address 00:33:33:33:33:33 to be associated with both VLANs. Based on work of Stefan Gula <[email protected]> v8: last version of Stefan Gula for Kernel 3.2.1 v9: rework onto linux-next 2014-03-12 by Michael Braun add MACADDR_SET command, enable to configure mac for source mode while creating interface v10: - reduce indention level - rename source_list to source_entry - use aligned 64bit ether address - use hash_64 instead of addr[5] v11: - rebase for 3.14 / linux-next 20.04.2014 v12 - rebase for linux-next 2014-09-25 Signed-off-by: Michael Braun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-09-22macvlan: allow to enqueue broadcast pkt on virtual deviceNicolas Dichtel1-1/+2
Since commit 412ca1550cbe ("macvlan: Move broadcasts into a work queue"), the driver uses tx_queue_len of the master device as the limit of packets enqueuing. Problem is that virtual drivers have this value set to 0, thus all broadcast packets were rejected. Because tx_queue_len was arbitrarily chosen, I replace it with a static limit of 1000 (also arbitrarily chosen). CC: Herbert Xu <[email protected]> Reported-by: Thibaut Collet <[email protected]> Suggested-by: Thibaut Collet <[email protected]> Tested-by: Thibaut Collet <[email protected]> Signed-off-by: Nicolas Dichtel <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>