Age  Commit message  Author  Files  Lines
2019-05-05Merge branch 'ipv4-Move-location-of-pcpu-route-cache-and-exceptions'David S. Miller3-67/+64
David Ahern says:
====================
ipv4: Move location of pcpu route cache and exceptions

This series moves IPv4 pcpu cached routes from fib_nh to fib_nh_common to make the caches available for IPv6 nexthops (fib6_nh) with IPv4 routes. This allows a fib6_nh struct to be used with both IPv4 and IPv6 routes.

v4 - fixed memleak if encap_type is not set as noticed by Ido
v3 - dropped ipv6 patches for now. Will resubmit those once the existing refcnt problem is fixed
v2 - reverted patch 2 to use ifdef CONFIG_IP_ROUTE_CLASSID instead of IS_ENABLED(CONFIG_IP_ROUTE_CLASSID) to fix compile issues reported by kbuild test robot
====================
Signed-off-by: David S. Miller <[email protected]>
2019-05-05ipv4: Move exception bucket to nh_commonDavid Ahern3-31/+24
Similar to the cached routes, make IPv4 exceptions accessible when using an IPv6 nexthop struct with IPv4 routes. Simplify the exception functions by passing in fib_nh_common since that is all it needs, and then cleanup the call sites that have extraneous fib_nh conversions. As with the cached routes this is a change in location only, from fib_nh up to fib_nh_common; no functional change intended. Signed-off-by: David Ahern <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05ipv4: Pass fib_nh_common to rt_cache_routeDavid Ahern1-10/+10
Now that the cached routes are in fib_nh_common, pass it to rt_cache_route and simplify its callers. For rt_set_nexthop, the tclassid becomes the last user of fib_nh so move the container_of under the #ifdef CONFIG_IP_ROUTE_CLASSID. Signed-off-by: David Ahern <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
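The pattern described above, where helpers take only the common part and a conversion back to the outer struct is kept for the one remaining IPv4-only field, can be sketched in userspace C. This is an illustration only: the struct layouts, field names (`cached_route`, `tclassid`) and function names are simplified and hypothetical, not the kernel definitions, and `container_of` is re-created locally.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified, hypothetical layouts; not the kernel definitions. */
struct nh_common_sketch {
	void *cached_route;              /* stands in for the moved route cache */
};

struct nh_v4_sketch {
	struct nh_common_sketch common;  /* embedded common part */
	unsigned int tclassid;           /* IPv4-only field */
};

/* Needs only the common part, so any embedding nexthop type can call it. */
void cache_route(struct nh_common_sketch *nhc, void *rt)
{
	nhc->cached_route = rt;
}

/* Needs an IPv4-only field, so it converts back to the outer struct. */
unsigned int get_tclassid(struct nh_common_sketch *nhc)
{
	struct nh_v4_sketch *nh = container_of(nhc, struct nh_v4_sketch, common);

	return nh->tclassid;
}
```

Only get_tclassid() still depends on the outer type, which is why the conversion can sit under the config-specific #ifdef in the real code.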
2019-05-05ipv4: Move cached routes to fib_nh_commonDavid Ahern3-28/+32
While the cached routes, nh_pcpu_rth_output and nh_rth_input, are IPv4 specific, a later patch wants to make them accessible for IPv6 nexthops with IPv4 routes using a fib6_nh. Move the cached routes from fib_nh to fib_nh_common and update references. Initialization of the cached entries is moved to fib_nh_common_init, and free is moved to fib_nh_common_release. Change in location only, from fib_nh up to fib_nh_common; no functional change intended. Signed-off-by: David Ahern <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05libbpf: add libbpf_util.h to header install.William Tu1-0/+1
The libbpf_util.h is used by xsk.h, so add it to the install headers. Reported-by: Ben Pfaff <[email protected]> Signed-off-by: William Tu <[email protected]> Acked-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-05tools/bpf: fix perf build error with uClibc (seen on ARC)Vineet Gupta1-0/+2
When building perf for ARC recently, there was a build failure due to lack of __NR_bpf.

| Auto-detecting system features:
|
| ... get_cpuid: [ OFF ]
| ... bpf: [ on ]
|
| # error __NR_bpf not defined. libbpf does not support your arch.
    ^~~~~
| bpf.c: In function 'sys_bpf':
| bpf.c:66:17: error: '__NR_bpf' undeclared (first use in this function)
|   return syscall(__NR_bpf, cmd, attr, size);
|                  ^~~~~~~~
|                  sys_bpf

Signed-off-by: Vineet Gupta <[email protected]> Acked-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-04bpftool: exclude bash-completion/bpftool from .gitignore patternMasahiro Yamada1-1/+1
tools/bpf/bpftool/.gitignore has the "bpftool" pattern, which is intended to ignore the following build artifact: tools/bpf/bpftool/bpftool However, the .gitignore entry is effective not only for the current directory, but also for any sub-directories. So, from the point of .gitignore grammar, the following check-in file is also considered to be ignored: tools/bpf/bpftool/bash-completion/bpftool As the manual gitignore(5) says "Files already tracked by Git are not affected", this is not a problem as far as Git is concerned. However, Git is not the only program that parses .gitignore because .gitignore is useful to distinguish build artifacts from source files. For example, tar(1) supports the --exclude-vcs-ignore option. As of writing, this option does not work perfectly, but it intends to create a tarball excluding files specified by .gitignore. So, I believe it is better to fix this issue. You can fix it by prefixing the pattern with a slash; the leading slash means the specified pattern is relative to the current directory. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Quentin Monnet <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-04Merge branch 'af_xdp-fixes'Alexei Starovoitov1-92/+100
Björn Töpel says: ==================== William found two bugs, when doing socket teardown within the same process. The first issue was an invalid munmap call, and the second one was an invalid XSKMAP cleanup. Both resulted in that the process kept references to the socket, which was not correctly cleaned up. When a new socket was created, the bind() call would fail, since the old socket was still lingering, refusing to give up the queue on the netdev. More details can be found in the individual commits. Thanks, Björn ==================== Reviewed-by: Jonathan Lemon <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-04libbpf: proper XSKMAP cleanupBjörn Töpel1-55/+60
The bpf_map_update_elem() function, when used on an XSKMAP, will fail if the value passed is not a valid AF_XDP socket. Therefore, this function cannot be used to clear the XSKMAP. Instead, the bpf_map_delete_elem() function should be used for that. This patch also simplifies the code by breaking up xsk_update_bpf_maps() into three smaller functions. Reported-by: William Tu <[email protected]> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Björn Töpel <[email protected]> Tested-by: William Tu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-04libbpf: fix invalid munmap callBjörn Töpel1-37/+40
When unmapping the AF_XDP memory regions used for the rings, an invalid address was passed to the munmap() calls. Instead of passing the beginning of the memory region, the descriptor region was passed to munmap(). When the userspace application tried to tear down an AF_XDP socket, the operation failed and the application would still have a reference to the socket it wished to get rid of. Reported-by: William Tu <[email protected]> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Björn Töpel <[email protected]> Tested-by: William Tu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-04selftests/bpf: set RLIMIT_MEMLOCK properly for test_libbpf_open.cYonghong Song1-0/+2
Test test_libbpf.sh failed on my development server with the following failure:

  -bash-4.4$ sudo ./test_libbpf.sh
  [0] libbpf: Error in bpf_object__probe_name():Operation not permitted(1). Couldn't load basic 'r0 = 0' BPF program.
  test_libbpf: failed at file test_l4lb.o
  selftests: test_libbpf [FAILED]
  -bash-4.4$

The reason is that my machine has 64KB of locked memory by default, which is not enough for this program to get locked memory. Similar to other bpf selftests, let us increase RLIMIT_MEMLOCK to infinity, which fixed the issue. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-05-04bpf: Use PTR_ERR_OR_ZERO in bpf_fd_sk_storage_update_elem()YueHaibing1-1/+1
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Signed-off-by: YueHaibing <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
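To illustrate why the two forms are equivalent, here is a minimal userspace sketch. The helpers below are local re-creations of the kernel's error-pointer helpers written for this example (they are not the kernel header), and `update_open_coded()` / `update_with_helper()` are hypothetical function names.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Open-coded form of the kind the commit replaces. */
int update_open_coded(void *result)
{
	if (IS_ERR(result))
		return (int)PTR_ERR(result);
	return 0;
}

/* Equivalent one-liner using the helper. */
int update_with_helper(void *result)
{
	return PTR_ERR_OR_ZERO(result);
}
```

Both functions return 0 for a valid pointer and the negative errno for an error pointer, so the change is purely a simplification.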
2019-05-04i40e: Memory leak in i40e_config_iwarp_qvlistMartyna Szapar1-8/+15
Added freeing the old allocation of vf->qvlist_info in function i40e_config_iwarp_qvlist before overwriting it with the new allocation. Signed-off-by: Martyna Szapar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: Fix of memory leak and integer truncation in i40e_virtchnl.cMartyna Szapar1-6/+10
Fixed possible memory leak in i40e_vc_add_cloud_filter function: cfilter is being allocated and in some error conditions the function returns without freeing the memory. Fix of integer truncation from u16 (type of queue_id value) to u8 when calling i40e_vc_isvalid_queue_id function. Signed-off-by: Martyna Szapar <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
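The truncation problem can be shown with a small sketch. The queue count and function names here are assumptions made up for the example and are not taken from the i40e driver; the point is only that a validator with an 8-bit parameter sees a different value than the 16-bit one the caller holds.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_QUEUES 4  /* hypothetical queue count for the sketch */

/* Narrow parameter: a u16 argument is silently truncated to 8 bits. */
bool queue_id_valid_u8(uint8_t qid)
{
	return qid < NUM_QUEUES;
}

/* Wide parameter: the full 16-bit value is checked. */
bool queue_id_valid_u16(uint16_t qid)
{
	return qid < NUM_QUEUES;
}
```

With a queue id of 256 held in a uint16_t, the narrow validator receives 0 and accepts it, while the wide validator rejects it.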
2019-05-04i40e: Use struct_size() in kzalloc()Gustavo A. R. Silva1-4/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example:

	struct foo {
		int stuff;
		struct boo entry[];
	};

	size = sizeof(struct foo) + count * sizeof(struct boo);
	instance = kzalloc(size, GFP_KERNEL)

Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper:

	instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL)

Notice that, in this case, variable size is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: "Gustavo A. R. Silva" <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
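As a rough userspace analogue of what the helper protects against, the sketch below computes the same size with an explicit overflow check and saturates to SIZE_MAX, as the kernel helper does on overflow. `foo_size()` and `foo_alloc()` are hypothetical names for this example, `struct boo` is given an arbitrary member, and calloc() stands in for kzalloc().

```c
#include <stdint.h>
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	struct boo entry[];   /* flexible array member */
};

/*
 * Hypothetical userspace analogue of struct_size(): returns SIZE_MAX if
 * the multiplication or the addition would overflow.
 */
size_t foo_size(size_t count)
{
	if (count > (SIZE_MAX - sizeof(struct foo)) / sizeof(struct boo))
		return SIZE_MAX;
	return sizeof(struct foo) + count * sizeof(struct boo);
}

/* Allocates a zeroed struct foo with room for count entries. */
struct foo *foo_alloc(size_t count)
{
	size_t bytes = foo_size(count);

	if (bytes == SIZE_MAX)
		return NULL;
	return calloc(1, bytes);
}
```

The open-coded expression would silently wrap for a very large count and allocate too little memory; the checked version refuses the allocation instead.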
2019-05-04i40e: Revert ShadowRAM checksum calculation changeMaciej Paczkowski1-25/+3
The reason for this revert is an unexpected issue found in the NVM Update tool during NVM image downgrade. The implementation is no longer needed since the QV tools are already aware of the new FW double ShadowRAM dump mechanism. This patch reverts the ShadowRAM checksum calculation change introduced in commit 9d12f0c4e436 ("i40e: Revert ShadowRAM checksum calculation change") Signed-off-by: Maciej Paczkowski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: missing input validation on VF message handling by the PFMartyna Szapar2-14/+46
This patch adds missing input validation on VF message handling by the PF to the functions with opcodes:

	VIRTCHNL_OP_CONFIG_VSI_QUEUES = 6,
	VIRTCHNL_OP_CONFIG_IRQ_MAP = 7,
	VIRTCHNL_OP_DISABLE_QUEUES = 9,
	VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE = 14

Signed-off-by: Martyna Szapar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: Add support for X710 B/P & SFP+ cardsAleksandr Loktionov6-2/+74
New device ids are created to support X710 backplane and SFP+ cards. This patch adds in i40e driver support for 2.5GbaseT and 5GbaseT speed. It's implemented by checking I40E_CAP_PHY_TYPE_2_5GBASE_T, I40E_CAP_PHY_TYPE_5GBASE_T bits from f/w and setting corresponding bits in ethtool link ksettings supported and advertising masks. Signed-off-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Alice Michael <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: Wrong truncation from u16 to u8Grzegorz Siwik1-1/+1
This patch fixes a wrong truncation from u16 to u8 during validation. The u8 parameter in the method declaration was changed to u32, and the arguments were changed to u32 as well. Signed-off-by: Grzegorz Siwik <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: add num_vectors checker in iwarp handlerSergey Nemov1-0/+10
Field num_vectors from struct virtchnl_iwarp_qvlist_info should not be larger than num_msix_vectors_vf in the hw struct. The iwarp uses the same set of vectors as the LAN VF driver. Signed-off-by: Sergey Nemov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: Fix the typo in adding 40GE KR4 modeGrzegorz Siwik1-2/+2
This patch fixes the typo in I40E_CAP_PHY_TYPE mode link code. It was fixed by changing 40000baseLR4_Full to 40000baseKR4_Full Signed-off-by: Grzegorz Siwik <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: Setting VF to VLAN 0 requires restartGrzegorz Siwik1-2/+2
This patch fixes a bug where changing VLAN to 0 was not set until VF restart. Now we are setting pvid info to 0 when we have to change VLAN to 0. Without this change when VF VLAN was changed to 0 nothing happened until VF restart. For changing to VLAN different than 0 it worked correctly. Signed-off-by: Grzegorz Siwik <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: add new pci id for X710/XXV710 N3000 cardsAleksandr Loktionov3-0/+6
New device ids are created to support X710/XXV710 N3000 cards. Signed-off-by: Aleksandr Loktionov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04i40e: VF's promiscuous attribute is not keptGrzegorz Siwik1-0/+23
This patch fixes a bug where the promiscuous mode was not being kept when the VF switched to a new VLAN. Promiscuous mode is now configured twice when we switch VLAN. Without this change, when we change the VF VLAN we still receive all the packets from the previous VLAN and only unicast from the new VLAN. Signed-off-by: Grzegorz Siwik <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Disable sniffing VF traffic on PFMichal Swiatkowski1-22/+2
Delete the code that adds a default Tx rule on the PF. With this rule the PF can see Tx VF traffic that should go outside. For traffic from a VF to another VF, the default Tx rule on the PF doesn't apply because it has lower priority than the VF mac rule. With this change, on a PF in promisc mode we can see only Rx traffic that doesn't match any other rule (mac etc.). We can't see Tx traffic from other VSI. Signed-off-by: Michal Swiatkowski <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Use more efficient structuresJesse Brandeburg1-2/+2
Move a bunch of members around to make more efficient use of memory, eliminating holes where possible. None of these members are hot path so cache line alignment is not very important here. Signed-off-by: Jesse Brandeburg <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
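A small sketch of what eliminating holes means in practice. The two structs below are invented for illustration and are unrelated to the actual ice structures; they hold the same members in a different order, and the compiler inserts less padding in the second one.

```c
#include <stddef.h>
#include <stdint.h>

/* Members ordered so that alignment forces padding between them. */
struct with_holes {
	uint8_t  flag_a;
	uint64_t counter;
	uint8_t  flag_b;
	uint32_t id;
};

/* Same members, largest first, so small members share one slot. */
struct reordered {
	uint64_t counter;
	uint32_t id;
	uint8_t  flag_a;
	uint8_t  flag_b;
};

/* Bytes saved by the reordering on the current ABI. */
size_t bytes_saved(void)
{
	return sizeof(struct with_holes) - sizeof(struct reordered);
}
```

The exact sizes depend on the ABI; on a typical LP64 target such as x86-64 the first layout occupies 24 bytes and the second 16.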
2019-05-04ice: Use bitfields where possibleJesse Brandeburg1-5/+5
The driver was converted to not use bool, but it was neglected that the bools should have been converted to bit fields as bit fields in software structures are ok, as long as they use the correct kinds of unsigned types. This avoids wasting lots of storage space to store single bit values. One of the change hunks moves a variable lport out of a group of "combinable" bit fields because all bits of the u8 lport are valid and the variable can be packed in the struct in struct holes. Signed-off-by: Jesse Brandeburg <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
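The storage argument can be sketched as follows. The flag names are hypothetical and only `lport` is taken from the message above; the sketch also assumes the compiler accepts bit-fields of type uint8_t, which GCC and Clang do but ISO C leaves implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* One full byte per single-bit value. */
struct flags_as_bytes {
	uint8_t link_up;
	uint8_t promisc;
	uint8_t trusted;
	uint8_t spoofchk;
	uint8_t lport;           /* all 8 bits are meaningful */
};

/* Single-bit values combined into one byte. */
struct flags_as_bits {
	uint8_t link_up:1;
	uint8_t promisc:1;
	uint8_t trusted:1;
	uint8_t spoofchk:1;
	uint8_t lport;           /* kept outside the bit-field group */
};

/* Setting a flag looks the same to the caller in both layouts. */
void set_trusted(struct flags_as_bits *f, int on)
{
	f->trusted = on ? 1 : 0;
}
```

Because every bit of lport is valid, it stays a plain u8 next to the group of combinable one-bit fields rather than inside it.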
2019-05-04ice: Add function to program ethertype based filter rule on VSIsAkeem G Abodunrin4-0/+120
This patch adds function to program VSI with ethertype based filter rule, so that all flow control frames would be disallowed from being transmitted to the client, in order to prevent malicious VSI, especially VF from sending out PAUSE or PFC frames, and then control other VSIs traffic. Signed-off-by: Akeem G Abodunrin <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Separate if conditions for ice_set_features()Tony Nguyen1-2/+6
Set features can have multiple features turned on|off in a single call. Grouping these all in an if/else means after one condition is met, other conditions/features will not be evaluated. Break the if/else statements by feature to ensure all features will be handled properly. Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
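The difference between the two control-flow shapes can be sketched as below. The feature bits and function names are hypothetical and do not correspond to the kernel's NETIF_F_* flags or to the ice driver code; each function returns the set of features it actually processed.

```c
#include <stdint.h>

/* Hypothetical feature bits for the sketch. */
#define FEAT_RXHASH   (1u << 0)
#define FEAT_VLAN_RX  (1u << 1)
#define FEAT_VLAN_TX  (1u << 2)

/* Chained form: the first matching branch wins, later features are skipped. */
uint32_t handle_chained(uint32_t changed)
{
	uint32_t handled = 0;

	if (changed & FEAT_RXHASH)
		handled |= FEAT_RXHASH;
	else if (changed & FEAT_VLAN_RX)
		handled |= FEAT_VLAN_RX;
	else if (changed & FEAT_VLAN_TX)
		handled |= FEAT_VLAN_TX;
	return handled;
}

/* Separate form: every changed feature is evaluated. */
uint32_t handle_separate(uint32_t changed)
{
	uint32_t handled = 0;

	if (changed & FEAT_RXHASH)
		handled |= FEAT_RXHASH;
	if (changed & FEAT_VLAN_RX)
		handled |= FEAT_VLAN_RX;
	if (changed & FEAT_VLAN_TX)
		handled |= FEAT_VLAN_TX;
	return handled;
}
```

When two features change in one call, the chained version processes only the first of them, while the separate version processes both.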
2019-05-04ice: Remove __always_unused attributeTony Nguyen1-1/+1
The variable netdev is being used in this function; remove the __always_unused attribute from it. Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Suppress false-positive style issues reported by static analyzerBruce Allan1-0/+1
A recent version of cppcheck falsely reports- Variable ip.hdr is assigned a value that is never used. ip is a union so the pointer ip.hdr is actually used when referenced as ip.v4 and ip.v6. Silence these false reports when using cppcheck with the --inline-suppr command-line option. Signed-off-by: Bruce Allan <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Refactor getting/setting coalesceBrett Creeley1-59/+93
Currently if the driver has an uneven amount of Rx/Tx queues setting the coalesce settings through ethtool will result in an error. This is happening because in the setting coalesce flow we are reporting an error if either Rx or Tx fails. Also, the flow for setting/getting per_q_coalesce and setting/getting coalesce settings for the entire device is different. Fix these issues by adding one function, ice_set_q_coalesce(), and another, ice_get_q_coalesce(), that both getting/setting per_q and entire device coalesce can use. This makes handling the error cases generic between the two flows and simplifies __ice_set_coalesce() and __ice_get_coalesce(). Also, add a header comment to __ice_set_coalesce(). Signed-off-by: Brett Creeley <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Always free/allocate q_vectorsBrett Creeley1-20/+14
Currently when probing/removing the driver we allocate/deallocate each vsi->q_vectors array in ice_vsi_alloc_arrays() and ice_vsi_free_arrays() respectively. However, we don't do this during the reset and VSI rebuild flow. This is inconsistent and unnecessary to have a difference between the two flows. This patch makes the change to always allocate/deallocate the vsi->q_vectors array regardless of the driver flow we are in. Also, update the comment for ice_vsi_free_arrays() to be more descriptive. Signed-off-by: Brett Creeley <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Do not unnecessarily initialize local variableBruce Allan1-1/+1
The local variable speed does not need to be initialized and can cause some static analysis tools to complain the initial assigned value is never used. Signed-off-by: Bruce Allan <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Add more validation in ice_vc_cfg_irq_map_msgMichal Swiatkowski4-28/+36
Add a few checks to validate the msg from the iavf driver. Test if we have got enough q_vectors allocated in the VSI connected with the VF. Add masks for itr_indx and msix_indx to avoid writing to reserved fields of QINT. Clear q_vector->num_ring_rx/tx; without it we can increment this value every time we send an irq map msg from the VF, so after the second call this value will be incorrect. Decrement num_vectors from the msg, because the last vector in the iavf msg is the misc vector (we don't set a map for it). Signed-off-by: Michal Swiatkowski <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Don't remove VLAN filters that were never programmedAkeem G Abodunrin2-2/+16
In the case of non-trusted VFs, it is possible that far fewer VLAN filters are programmed than the VF originally requested, which makes the number of VLAN elements tracked by the VF differ from the actual VLAN tags. This patch makes sure that we are not attempting to remove a VLAN filter that does not exist. Signed-off-by: Akeem G Abodunrin <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Preserve VLAN Rx stripping settingsTony Nguyen1-0/+4
When Tx insertion is set, we are not accounting for the state of Rx stripping. This causes Rx stripping to be enabled any time Tx insertion is changed, even when it's supposed to be disabled. Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Fix for allowing too many MDD events on VFMichal Swiatkowski1-7/+8
Disable VF if any malicious device driver (MDD) event is detected by hardware. Track vf->num_mdd_events for information about VF MDD events. Signed-off-by: Michal Swiatkowski <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04ice: Use pf instead of vsi-backJesse Brandeburg1-30/+30
Many times in our functions we have a local variable pf, which is equivalent to vsi->back. Just use pf consistently instead of vsi->back where available. Signed-off-by: Jesse Brandeburg <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-05-04net: openvswitch: return an error instead of doing BUG_ON()Eelco Chaudron1-2/+5
For all other error cases in queue_userspace_packet() the error is returned, so it makes sense to do the same for these two error cases. Reported-by: Davide Caratti <[email protected]> Signed-off-by: Eelco Chaudron <[email protected]> Acked-by: Flavio Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04r8169: remove rtl_write_exgmac_batchHeiner Kallweit1-22/+4
rtl_write_exgmac_batch is used in only one place, so we can remove it. Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04Merge branch 'netlink-strict-attribute-checking-follow-up'David S. Miller3-12/+41
Michal Kubecek says: ==================== netlink: strict attribute checking follow-up Three follow-up patches for recent strict netlink validation series. Patch 1 fixes dump handling for genetlink families which validate and parse messages themselves (e.g. because they need different policies for different commands). Patch 2 sets bad_attr in extack in one place where this was omitted. Patch 3 adds new NL_VALIDATE_NESTED flags for strict validation to enable checking that NLA_F_NESTED value in received messages matches expectations and includes this flag in NL_VALIDATE_STRICT. This would change userspace visible behavior but the previous switching to NL_VALIDATE_STRICT for new code is still only in net-next at the moment. v2: change error messages to mention NLA_F_NESTED explicitly ==================== Signed-off-by: David S. Miller <[email protected]>
2019-05-04netlink: add validation of NLA_F_NESTED flagMichal Kubecek2-1/+25
Add new validation flag NL_VALIDATE_NESTED which adds three consistency checks of NLA_F_NESTED_FLAG: - the flag is set on attributes with NLA_NESTED{,_ARRAY} policy - the flag is not set on attributes with other policies except NLA_UNSPEC - the flag is set on attribute passed to nla_parse_nested() Signed-off-by: Michal Kubecek <[email protected]> v2: change error messages to mention NLA_F_NESTED explicitly Reviewed-by: Johannes Berg <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
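The first two consistency checks can be sketched as a standalone predicate. This is not the kernel's validation code: the policy enum and `nested_flag_ok()` are hypothetical names for the example, and only the NLA_F_NESTED bit value is taken from the uapi netlink header.

```c
#include <stdbool.h>
#include <stdint.h>

#define NLA_F_NESTED (1 << 15)   /* bit value as in the uapi netlink header */

/* Hypothetical, reduced set of policy types for the sketch. */
enum policy_sketch {
	POL_UNSPEC,
	POL_U32,
	POL_NESTED,
	POL_NESTED_ARRAY,
};

/* Returns true if the NLA_F_NESTED bit in nla_type matches the policy. */
bool nested_flag_ok(uint16_t nla_type, enum policy_sketch pol)
{
	bool nested = (nla_type & NLA_F_NESTED) != 0;

	switch (pol) {
	case POL_NESTED:
	case POL_NESTED_ARRAY:
		return nested;       /* flag must be set */
	case POL_UNSPEC:
		return true;         /* no expectation either way */
	default:
		return !nested;      /* flag must not be set */
	}
}
```

The third check, on the attribute passed to nla_parse_nested(), amounts to requiring the same bit on the container attribute before its payload is parsed.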
2019-05-04netlink: set bad attribute also on maxtype checkMichal Kubecek1-1/+2
The check that attribute type is within 0...maxtype range in __nla_validate_parse() sets only error message but not bad_attr in extack. Set also bad_attr to tell userspace which attribute failed validation. Signed-off-by: Michal Kubecek <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04genetlink: do not validate dump requests if there is no policyMichal Kubecek1-10/+14
Unlike do requests, dump genetlink requests now perform strict validation by default even if the genetlink family does not set policy and maxtype because it does validation and parsing on its own (e.g. because it wants to allow different message format for different commands). While the null policy will be ignored, maxtype (which would be zero) is still checked so that any attribute will fail validation. The solution is to only call __nla_validate() from genl_family_rcv_msg() if family->maxtype is set. Fixes: ef6243acb478 ("genetlink: optionally validate strictly/dumps") Signed-off-by: Michal Kubecek <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04Merge branch 'mlxsw-Firmware-version-update'David S. Miller2-13/+41
Ido Schimmel says:
====================
mlxsw: Firmware version update

This patchset updates mlxsw to use a new firmware version and adds support for split into two ports on Spectrum-2 based systems.

Patch #1 updates the firmware version to 13.2000.1122
Patch #2 queries new resources from the firmware.
Patch #3 makes use of these resources in order to support split into two ports on Spectrum-2 based systems.

The need for these resources is explained by Shalom:

When splitting a port, different local ports need to be mapped on different systems. For example:

SN3700 (local_ports_in_2x=2):
  * Without split:
      front panel 1 --> local port 1
      front panel 2 --> local port 5
  * Split to 2:
      front panel 1s0 --> local port 1
      front panel 1s1 --> local port 3
      front panel 2 --> local port 5

SN3800 (local_ports_in_2x=1):
  * Without split:
      front panel 1 --> local port 1
      front panel 2 --> local port 3
  * Split to 2:
      front panel 1s0 --> local port 1
      front panel 1s1 --> local port 2
      front panel 2 --> local port 3

The local_ports_in_{1x, 2x} resources provide the offsets from the base local ports according to which the new local ports can be calculated.
====================
Signed-off-by: David S. Miller <[email protected]>
2019-05-04mlxsw: spectrum: split base on local_ports_in_{1x, 2x} resourcesShalom Toledo1-11/+35
When splitting a port, different local ports need to be mapped on different systems. For example:

SN3700 (local_ports_in_2x=2):
  * Without split:
      front panel 1 --> local port 1
      front panel 2 --> local port 5
  * Split to 2:
      front panel 1s0 --> local port 1
      front panel 1s1 --> local port 3
      front panel 2 --> local port 5

SN3800 (local_ports_in_2x=1):
  * Without split:
      front panel 1 --> local port 1
      front panel 2 --> local port 3
  * Split to 2:
      front panel 1s0 --> local port 1
      front panel 1s1 --> local port 2
      front panel 2 --> local port 3

The local_ports_in_{1x, 2x} resources provide the offsets from the base local ports according to which the new local ports can be calculated.

Signed-off-by: Shalom Toledo <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
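The two examples above are consistent with a simple calculation in which split port i lands at the base local port plus i times the per-system offset. The sketch below encodes that reading; the formula is inferred from the listed mappings rather than taken from the driver source, and `split_local_port()` is a hypothetical name.

```c
/*
 * Local port for the i-th sub-port of a split, assuming the new ports
 * are spaced 'offset' apart starting at the base local port.
 */
unsigned int split_local_port(unsigned int base_port,
			      unsigned int offset,
			      unsigned int index)
{
	return base_port + index * offset;
}
```

With an offset of 2 (the SN3700 value of local_ports_in_2x), base port 1 yields local ports 1 and 3; with an offset of 1 (SN3800) it yields 1 and 2, matching the front panel 1s0 and 1s1 entries.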
2019-05-04mlxsw: resources: Add local_ports_in_{1x, 2x}Shalom Toledo1-0/+4
Since the number of local ports in 4x changed between SPC and SPC-2, firmware exposes new resources that the driver can query. Signed-off-by: Shalom Toledo <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04mlxsw: Bump firmware version to 13.2000.1122Ido Schimmel1-2/+2
The new version supports two features that are required by upcoming changes in the driver: * Querying of new resources allowing port split into two ports on Spectrum-2 systems * Querying of number of gearboxes on supported systems such as SN3800 Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04tipc: fix missing Name entries due to half-failoverTuong Lien3-7/+84
A TIPC link can temporarily fall into a "half-established" state in which only one of the link endpoints is ESTABLISHED and starts to send traffic, PROTOCOL messages, whereas the other link endpoint is not up (e.g. immediately when the endpoint receives ACTIVATE_MSG, the network interface goes down...). This is a normal situation and will be settled because the link endpoint will eventually be brought down after the link tolerance time.

However, the situation becomes worse when the second link is established before the first link endpoint goes down. For example:

1. Both links <1A-2A>, <1B-2B> down
2. Link endpoint 2A up, but 1A still down (e.g. due to network disturbance, wrong session, etc.)
3. Link <1B-2B> up
4. Link endpoint 2A down (e.g. due to link tolerance timeout)
5. Node B starts failover onto link <1B-2B>

==> Node A never starts link failover.

When the "half-failover" situation happens, two consequences have been observed:

a) Peer link/node gets stuck in FAILINGOVER state;
b) Traffic or user messages that the peer node is trying to fail over onto the second link can be partially or completely dropped by this node.

Consequence a) was actually solved by commit c140eb166d68 ("tipc: fix failover problem"), but that commit didn't cover b). This is because the tunnel link endpoint has never been prepared for a failover, so 'l->drop_point' (and the other data...) is not set correctly. When a TUNNEL_MSG from the peer node arrives on the link, depending on the inner message's seqno and the current 'l->drop_point' value, the message can be dropped (treated as a duplicate message) or processed. At this early stage, the traffic messages from the peer are likely to be NAME_DISTRIBUTORs, which means some name table entries will be missed on the node forever!

The commit resolves the issue by starting the FAILOVER process on this node as well. Another benefit of this solution is that we ensure the link will not be re-established until the failover ends.
Acked-by: Jon Maloy <[email protected]> Signed-off-by: Tuong Lien <[email protected]> Signed-off-by: David S. Miller <[email protected]>