Age | Commit message (Collapse) | Author | Files | Lines |
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-05-17
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Provide a new BPF helper for doing a FIB and neighbor lookup
in the kernel tables from an XDP or tc BPF program. The helper
provides a fast-path for forwarding packets. The API supports
IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are
implemented in this initial work, from David (Ahern).
2) Just a tiny diff but huge feature enabled for nfp driver by
extending the BPF offload beyond a pure host processing offload.
Offloaded XDP programs are allowed to set the RX queue index and
thus opening the door for defining a fully programmable RSS/n-tuple
filter replacement. Once BPF decided on a queue already, the device
data-path will skip the conventional RSS processing completely,
from Jakub.
3) The original sockmap implementation was array based similar to
devmap. However unlike devmap where an ifindex has a 1:1 mapping
into the map there are use cases with sockets that need to be
referenced using longer keys. Hence, sockhash map is added reusing
as much of the sockmap code as possible, from John.
4) Introduce BTF ID. The ID is allocatd through an IDR similar as
with BPF maps and progs. It also makes BTF accessible to user
space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data
through BPF_OBJ_GET_INFO_BY_FD, from Martin.
5) Enable BPF stackmap with build_id also in NMI context. Due to the
up_read() of current->mm->mmap_sem build_id cannot be parsed.
This work defers the up_read() via a per-cpu irq_work so that
at least limited support can be enabled, from Song.
6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND
JIT conversion as well as implementation of an optimized 32/64 bit
immediate load in the arm64 JIT that allows to reduce the number of
emitted instructions; in case of tested real-world programs they
were shrinking by three percent, from Daniel.
7) Add ifindex parameter to the libbpf loader in order to enable
BPF offload support. Right now only iproute2 can load offloaded
BPF and this will also enable libbpf for direct integration into
other applications, from David (Beckett).
8) Convert the plain text documentation under Documentation/bpf/ into
RST format since this is the appropriate standard the kernel is
moving to for all documentation. Also add an overview README.rst,
from Jesper.
9) Add __printf verification attribute to the bpf_verifier_vlog()
helper. Though it uses va_list we can still allow gcc to check
the format string, from Mathieu.
10) Fix a bash reference in the BPF selftest's Makefile. The '|& ...'
is a bash 4.0+ feature which is not guaranteed to be available
when calling out to shell, therefore use a more portable variant,
from Joe.
11) Fix a 64 bit division in xdp_umem_reg() by using div_u64()
instead of relying on the gcc built-in, from Björn.
12) Fix a sock hashmap kmalloc warning reported by syzbot when an
overly large key size is used in hashmap then causing overflows
in htab->elem_size. Reject bogus attr->key_size early in the
sock_hash_alloc(), from Yonghong.
13) Ensure in BPF selftests when urandom_read is being linked that
--build-id is always enabled so that test_stacktrace_build_id[_nmi]
won't be failing, from Alexei.
14) Add bitsperlong.h as well as errno.h uapi headers into the tools
header infrastructure which point to one of the arch specific
uapi headers. This was needed in order to fix a build error on
some systems for the BPF selftests, from Sirio.
15) Allow for short options to be used in the xdp_monitor BPF sample
code. And also a bpf.h tools uapi header sync in order to fix a
selftest build failure. Both from Prashant.
16) More formally clarify the meaning of ID in the direct packet access
section of the BPF documentation, from Wang.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to
free it.
Fixes: fed9ce22bf8ae ("net/mlx5: E-Switch, Add API to create vport rx rules")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to
free it.
Fixes: 9efa75254593d ("net/mlx5_core: Introduce access functions to query vport RoCE fields")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
For T6, clip table is separated from main TCAM. So, update LE-TCAM
collection logic to collect clip table TCAM as well. IPv6 takes
4 entries in clip table TCAM compared to 2 entries in main TCAM.
Also, in case of errors, keep LE-TCAM collected so far and set the
status to partial dump.
Signed-off-by: Rahul Lakkireddy <[email protected]>
Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Stress on qedi/qedr load unload lead to list_del corruption.
This is due to ll2 connection terminate freeing resources without
verifying that no more ll2 processing will occur.
This patch unregisters the ll2 status block before terminating
the connection to assure this race does not occur.
Fixes: 1d6cff4fca4366 ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ll2 flows of flushing the txq/rxq need to be synchronized with the
regular fp processing. Caused list corruption during load/unload stress
tests.
Fixes: 0a7fb11c23c0f ("qed: Add Light L2 support")
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Driver should free all pending isles once it gets a FLUSH cqe from FW.
Part of iSCSI out of order flow.
Fixes: 1d6cff4fca4366 ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Stress on qedi/qedr load unload lead to list_del corruption.
This is due to ll2 connection terminate freeing resources without
verifying that no more ll2 processing will occur.
This patch unregisters the ll2 status block before terminating
the connection to assure this race does not occur.
Fixes: 1d6cff4fca4366 ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ll2 flows of flushing the txq/rxq need to be synchronized with the
regular fp processing. Caused list corruption during load/unload stress
tests.
Fixes: 0a7fb11c23c0f ("qed: Add Light L2 support")
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Driver should free all pending isles once it gets a FLUSH cqe from FW.
Part of iSCSI out of order flow.
Fixes: 1d6cff4fca4366 ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
clk_disable_unprepare() already checks that the clock pointer is valid.
No need to test it before calling it.
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Tobias Klauser <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
clk_disable_unprepare() already checks that the clock pointer is valid.
No need to test it before calling it.
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The configuration settings for RBTX4927 were accidentally removed,
leading to a silently broken network interface.
Re-add the missing settings to fix this.
Fixes: 8eb97ff5a4ec941d ("net: 8390: remove m32r specific bits")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch introduces ops structure for sgmii, This by ensures that
we do not need dummy functions in case of emulation platforms.
Signed-off-by: Hemanth Puranik <[email protected]>
Acked-by: Timur Tabi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The command packet size is already checked once in
rmnet_map_deaggregate() for the header, packet and trailer size, so
this additional check is not needed.
Signed-off-by: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add ethtool private stats handler to debug the handling of packets
with checksum offload header / trailer. This allows to keep track of
the number of packets for which hardware computes the checksum and
counts and reasons where checksum computation was skipped in hardware
and was done in the network stack.
Signed-off-by: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Packets in transmit path could potentially be dropped if there were
errors while adding the MAP header or the checksum header.
Increment the tx_drops stats in these cases.
Additionally, refactor the code to free the packet and increment
the tx_drops stat under a single label.
Signed-off-by: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
A number of drivers have the following pattern:
if (np)
of_mdiobus_register()
else
mdiobus_register()
which the implementation of of_mdiobus_register() now takes care of.
Remove that pattern in drivers that strictly adhere to it.
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Reviewed-by: Fugang Duan <[email protected]>
Reviewed-by: Antoine Tenart <[email protected]>
Reviewed-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This fixes klockworks warnings: Pointer 'dev' returned from call to
function 'bus_find_device' at line 179 may be NULL and will be dereferenced
at line 181.
cpsw-phy-sel.c:179: 'dev' is assigned the return value from function 'bus_find_device'.
bus.c:342: 'bus_find_device' explicitly returns a NULL value.
cpsw-phy-sel.c:181: 'dev' is dereferenced by passing argument 1 to function 'dev_get_drvdata'.
device.h:1024: 'dev' is passed to function 'dev_get_drvdata'.
device.h:1026: 'dev' is explicitly dereferenced.
Signed-off-by: Grygorii Strashko <[email protected]>
[[email protected]: add an error message, fix return path]
Signed-off-by: Sekhar Nori <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-05-14
Misc update for mlx5e netdevice driver
From Gal Pressman:
- Remove MLX5E_TEST_BIT macros and use test_bit instead
- Use __set_bit when possible
From Eran Ben Elisha:
- Improve debug print on initial RX posting timeout
From Or Gerlitz:
- Support offloaded TC flows with no matches on headers
- mlx5e TC cleanups
Trivial cleanups From Roi, Tariq and Saeed:
- Use bool as return type for mlx5e_xdp_handle
- Use u8 instead of int for LRO number of segments
- Skip redundant checks when providing NUD lastuse feedback
- Remove redundant vport context vlan update
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
We call pcim_iomap in hclge_pci_init, pcim_iounmap should be called
in error handle of hclge_init_ae_dev.
We call pcim_iomap in hclge_pci_init, but do not call pcim_iounmap in
hclge_pci_uninit. When we remove the hclge.ko and insert it again, a
problem that pci can not map will happen. pcim_iounmap need to be called
in hclge_pci_uninit.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
As HNS3 driver will enable SRIOV default and enable all VFs the
HW support, if PF and VF driver compiled to kernel, VF driver
will work on host default, it is not right.
This patch adds support for hns3_driver.sriov_configure to support
user configs the VF_num, and do not enable sriov default.
Signed-off-by: Peng Li <[email protected]>
Suggested-by: Zhou Wang <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When hclge_ae_start is called, hdev->hw.mac.link may be set
to one after up/down multi-times, which does not correspond to
the link state of netdev when the netdev is up.
This fixes it by setting hdev->hw.mac.link to zero when
hclge_ae_start is called.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When sriov is enabled, the Qset and tc mapping is not longer one
to one relation.
This patch fixes it by mapping all pf and vf's Qset to tc.
Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
A client includes many client instance. Just like ae_algo, Initializing
client instance failed does not represent registering client failed.
The action of registering client just is adding client to the client
list and the result always is true. This patch changes the return
value of hnae3_register_client form a variable value to a fixed value,
makes the function always return ok.
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ae_algo is used by many ae_devs. It is not only belong to just a
ae_dev. Initializing ae_dev failed does not represent registering ae_algo
failed. Because the action of registering ae_algo just is adding ae_algo
to the ae_algo list and it is always is true, it make no sense to define
return type as int.
This patch changes the return type of hnae3_register_ae_algo from int to
void.
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If hclge.ko has not been inserted, the value of ret always is zero
in hnae3_register_ae_dev. If hclge.ko has been inserted, the value
of ret is zero or non zero. Different execution ways have different
results. It is confusing.
The ae_dev which is initialized failed can be reinitialized when we
remove hclge.ko and insert it again. For the case initializing client
instance, it is just like the case initializing ae_dev. The main function
of hnae3_register_ae_dev is adding the ae_dev to ad_dev list. Because
adding ae_dev is always ok, we does not need to return any in this
function.
This patch changes the return type of hnae3_register_ae_dev from int
to void.
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If the client instance is initializd failed, we do not need to uninit it.
This patch adds a state check to check init state of client instance.
Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
ae_dev failed
When initializing ae_dev failed during loading hclge.ko, the drvdata will
be set to null. When removing hns3.ko, we get a null ae_dev. It causes the
null pointer problem.
This patch removes pci_set_drvdata from error handle of hclge_init_ae_dev
to fix the bug, since pci_set_drvdata has been called in hns3_remove.
Also, we do not need to uninit the ae_dev which is not initialized. And
it may be the one which is initialized failed.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When hnae3_unregister_ae_algo is called by PF, pci_disable_sriov is
called. And then, hns3_remove is called by VF. We get deadlocked in
this case.
Since VF pci device is dependent on PF pci device, When PF pci device
is removed, VF pci device must be removed. Also, To solve the deadlock
problem, VF pci device should be removed before PF pci device is removed.
This patch moves pci_enable/disable_sriov from hclge to hns3 to solve
the deadlock problem.
Also, we do not need to return EPROBE_DEFER in hnae3_register_ae_dev,
because SRIOV is no longer enabled in the context calling
hnae3_register_ae_dev. Mutex_trylock can be replaced with mutex_lock.
Fixes: 424eb834a9be ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a driver for Microsemi Ocelot Ethernet switch support.
This makes two modules:
mscc_ocelot_common handles all the common features that doesn't depend on
how the switch is integrated in the SoC. Currently, it handles offloading
bridging to the hardware. ocelot_io.c handles register accesses. This is
unfortunately needed because the register layout is packed and then depends
on the number of ports available on the switch. The register definition
files are automatically generated.
ocelot_board handles the switch integration on the SoC and on the board.
Frame injection and extraction to/from the CPU port is currently done using
register accesses which is quite slow. DMA is possible but the port is not
able to absorb the whole switch bandwidth.
Signed-off-by: Alexandre Belloni <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Adds support for matching flows based on tunnel VNI value.
Introduces fw APIs for allocating/removing MPS entries related
to encapsulation. And uses the same while adding/deleting filters
for offloading flows based on tunnel VNI match.
Signed-off-by: Kumar Sanghvi <[email protected]>
Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Earlier code of doing bitwise AND with field width bits was wrong.
Instead, simplify code to calculate ntuple_mask based on supplied
fields and then compare with mask configured in hw - which is the
correct and simpler way to validate ntuple mask.
Fixes: 3eb8b62d5a26 ("cxgb4: add support to create hash-filters via tc-flower offload")
Signed-off-by: Kumar Sanghvi <[email protected]>
Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
MLX5E_TEST_BIT macro is the same as the already existent test_bit,
remove it and replace all usages.
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Replace (mask & bit) check with test_bit.
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Make the code more clear by replacing the existing code with __set_bit.
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Report all channels which got timeout on posting the minimal number of
RX WQEs and not only the first one. Avoid busy wait on every channel,
when one of the RQs check got timeout, poll once for the remaining RQs.
In addition, add channel index to log when failed to get min RX WQEs
This info is needed in order to debug in case of dysfunctional channel.
Signed-off-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
For example:
tc filter add dev ens2f0_0 parent ffff: flower skip_sw action drop
Note that for eswitch flows, we still always match on the source port.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Introduce levels of matching on headers of offloaded flows
(none, L2, L3, L4) that follow the inline mode levels.
This is pre-step for us to offload flows without any
matches on headers.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Set the initial value to none instead of L2, and set to L2
where the previous initial value was assumed. Make sure to
parse L2 matches before L3 matches and L3 before L4.
This is a pre-step to get the match level for more purposes
other than the validating the needed vs. actual inline level.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Use local actions variable while parsing the actions of offloaded TC flow.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Reaching here, means we didn't err anywhere, so lets just
return success.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Jianbo Liu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
This is not needed as the attributes are zeroed out on allocation.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Jianbo Liu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Clean warning/check complaints made by checkpatch on en_{tc,rep}.c
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Jianbo Liu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The firmware DMAC_47_16 header re-write token was defined twice,
clean it up.
Signed-off-by: Or Gerlitz <[email protected]>
Reported-by: Jianbo Liu <[email protected]>
Reviewed-by: Jianbo Liu <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Function returns boolean values, use bool instead of int.
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Range of LRO number of segments fits in u8.
Also, bring initialization and declaration together to
save code.
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
It's redundant to continue the loop if we found one flow whose lastuse value
being newer than the last one we reported, since this is enough for us to
trigger a NUD update (neigh_event_send()).
Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Paul Blakey <[email protected]>
Reviewed-by: Or Gerlitz <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
In delete vlan flow an extra call to mlx5e_vport_context_update_vlans
was added by mistake, remove it.
Fixes: 86d722ad2c3b ("net/mlx5: Use flow steering infrastructure for mlx5_en")
Signed-off-by: Saeed Mahameed <[email protected]>
Reviewed-by: Gal Pressman <[email protected]>
|
|
We no longer require a check for cxgb4 to be MASTER
when configuring SRIOV, It was required when we had
module parameter to instantiate vf.
Signed-off-by: Arjun Vynipadath <[email protected]>
Signed-off-by: Casey Leedom <[email protected]>
Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|