path: root/include/linux
Age | Commit message | Author | Files | Lines
2022-05-11 | net: warn if transport header was not set | Eric Dumazet | 1 | -0/+2
Make sure skb_transport_header() and skb_transport_offset() uses are not fooled if the transport header has not been set. This change will likely expose existing bugs in linux networking stacks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
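The check described above can be illustrated with a small userspace sketch. It assumes that an unset header offset is marked with an all-ones sentinel; `struct toy_skb`, `TOY_HDR_UNSET` and the `toy_*` functions are hypothetical stand-ins, not the kernel's definitions, and returning NULL in the unset case is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct sk_buff: only what this sketch needs. */
struct toy_skb {
	unsigned char *head;       /* start of the buffer */
	uint16_t transport_header; /* offset from head, or TOY_HDR_UNSET */
};

/* Assumed sentinel: all bits set means "never initialised". */
#define TOY_HDR_UNSET ((uint16_t)~0U)

static int toy_warn_count; /* counts how often the warning fired */

static bool toy_transport_header_was_set(const struct toy_skb *skb)
{
	return skb->transport_header != TOY_HDR_UNSET;
}

/* Warn (and, in this sketch only, return NULL) if the offset was never set. */
static unsigned char *toy_transport_header(const struct toy_skb *skb)
{
	assert(skb != NULL);
	if (!toy_transport_header_was_set(skb)) {
		toy_warn_count++;
		fprintf(stderr, "warning: transport header was not set\n");
		return NULL;
	}
	return skb->head + skb->transport_header;
}
```

A caller that forgot to set the offset now triggers a visible warning instead of silently computing a pointer from a garbage offset, which is how such a change can expose latent bugs.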
2022-05-11 | net: add include/net/net_debug.h | Eric Dumazet | 1 | -140/+1
Remove from include/linux/netdevice.h helpers that send debug/info/warnings to syslog. We plan adding more helpers in following patches. v2: added two includes, and 'struct net_device' forward declaration to avoid compile errors (kernel bots) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-11 | Merge tag 'mlx5-updates-2022-05-09' of ↵ | David S. Miller | 1 | -1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-05-09

1) Gavin Li adds an exit route from waiting for FW init on device boot and increases the FW init timeout on the health recovery flow.
2) Support 4-port HCAs in LAG mode.

Mark Bloch says:

================
This series adds support for 4-port HCAs to the mlx5 drivers. Starting with ConnectX-7, HCAs with 4 ports are possible. As most driver parts aren't affected by such a configuration, most driver code is unchanged. Specifically, the only affected areas are:
- Lag
- Devcom
- Merged E-Switch
- Single FDB E-Switch

Lag was chosen to be converted first, creating a hardware LAG when all 4 ports are added to the same bond device. Devcom, merged E-Switch and single FDB E-Switch are marked as supporting only 2-port HCAs, and future patches will add support for 4-port HCAs.

To activate the hardware lag, a user can execute:

    ip link add bond0 type bond
    ip link set bond0 type bond miimon 100 mode 2
    ip link set eth2 master bond0
    ip link set eth3 master bond0
    ip link set eth4 master bond0
    ip link set eth5 master bond0

where eth2, eth3, eth4 and eth5 are the PFs of the same HCA.
================
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-10 | skbuff: render the checksum comment to documentation | Jakub Kicinski | 1 | -95/+124
Long time ago Tom added a giant comment to skbuff.h explaining checksums. Now that we have a place in Documentation for skbuff docs we should render it. Sprinkle some markup while at it. Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 | skbuff: rewrite the doc for data-only skbs | Jakub Kicinski | 1 | -12/+30
The comment about shinfo->dataref split is really unhelpful, at least to me. Rewrite it and render it to skb documentation. Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 | skbuff: add a basic intro doc | Jakub Kicinski | 1 | -0/+40
Add basic skb documentation. It's mostly an intro to the subsequent patches - it would look strange if we documented advanced topics without covering the basics in any way. Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 | ptp: Support late timestamp determination | Gerhard Engleder | 2 | -3/+32
If a physical clock supports a free running cycle counter, then timestamps shall be based on this time too. For TX it is known in advance before the transmission if a timestamp based on the free running cycle counter is needed. For RX it is impossible to know which timestamp is needed before the packet is received and assigned to a socket. Support late timestamp determination by a network device. Therefore, an address/cookie is stored within the new netdev_data field of struct skb_shared_hwtstamps. This address/cookie is provided to a new network device function called ndo_get_tstamp(), which returns a timestamp based on the normal/adjustable time or based on the free running cycle counter. If function is not supported, then timestamp handling is not changed. This mechanism is intended for RX, but TX use is also possible. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 | ptp: Pass hwtstamp to ptp_convert_timestamp() | Gerhard Engleder | 1 | -4/+3
ptp_convert_timestamp() converts only the timestamp hwtstamp, which is a field of the argument with the type struct skb_shared_hwtstamps *. So a pointer to the hwtstamp field of this structure is sufficient. Rework ptp_convert_timestamp() to use an argument of type ktime_t *. This allows to add additional timestamp manipulation stages before the call of ptp_convert_timestamp(). Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 | ptp: Request cycles for TX timestamp | Gerhard Engleder | 1 | -1/+6
The free running cycle counter of physical clocks called cycles shall be used for hardware timestamps to enable synchronisation. Introduce new flag SKBTX_HW_TSTAMP_USE_CYCLES, which signals driver to provide a TX timestamp based on cycles if cycles are supported. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 | ptp: Add cycles support for virtual clocks | Gerhard Engleder | 1 | -0/+31
ptp vclocks require a free running time for their timecounter. Currently only a physical clock forced to free running is supported. If vclocks are used, then the physical clock cannot be synchronized anymore. The synchronized time is not available in hardware in this case. As a result, timed transmission with TAPRIO hardware support is not possible anymore. If hardware would support a free running time additionally to the physical clock, then the physical clock does not need to be forced to free running. Thus, the physical clocks can still be synchronized while vclocks are in use. The physical clock could be used to synchronize the time domain of the TSN network and trigger TAPRIO. In parallel vclocks can be used to synchronize other time domains. Introduce support for a free running cycle counter called cycles to physical clocks. Rework ptp vclocks to use this free running cycle counter. Default implementation is based on time of physical clock. Thus, behavior of ptp vclocks based on physical clocks without free running cycle counter is identical to previous behavior. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-09 | net/mlx5: Lag, add debugfs to query hardware lag state | Mark Bloch | 1 | -0/+1
Lag state has become very complicated with many modes, flags, types and port selection methods, and future work will add additional features. Add a debugfs to query the current lag state. A new directory named "lag" will be created under the mlx5 debugfs directory. As the driver has debugfs per PCI function, the location will be: <debugfs>/mlx5/<BDF>/lag

For example: /sys/kernel/debug/mlx5/0000:08:00.0/lag

The following files are exposed:
- state: Returns "active" or "disabled". If "active" it means hardware lag is active.
- members: Returns the BDFs of all the members of the lag object.
- type: Returns the type of the lag currently configured. Valid only if hardware lag is active.
  * "roce" - Members are bare metal PFs.
  * "switchdev" - Members are in switchdev mode.
  * "multipath" - ECMP offloads.
- port_sel_mode: Returns the egress port selection method. Valid only if hardware lag is active.
  * "queue_affinity" - Egress port is selected by the QP/SQ affinity.
  * "hash" - Egress port is selected by a hash done on each packet. Controlled by: xmit_hash_policy of the bond device.
- flags: Returns flags that are specific per lag @type. Valid only if hardware lag is active.
  * "shared_fdb" - "on" or "off"; if "on", a single FDB is used.
- mapping: Returns the mapping which is used to select the egress port. Valid only if hardware lag is active. If @port_sel_mode is "hash", returns the active egress ports. The hash result will select only active ports. If @port_sel_mode is "queue_affinity", returns the mapping between the configured port affinity of the QP/SQ and the actual egress port. For example:
  * 1:1 - Mapping means if the configured affinity is port 1, traffic will egress via port 1.
  * 1:2 - Mapping means if the configured affinity is port 1, traffic will egress via port 2. This can happen if port 1 is down, or in active/backup mode when port 1 is the backup.

Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 | net/mlx5: Support devices with more than 2 ports | Mark Bloch | 1 | -1/+1
Increase the define MLX5_MAX_PORTS to 4 as the driver is ready to support NICs with 4 ports. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 | net/mlx5: Lag, expose number of lag ports | Mark Bloch | 1 | -0/+1
Downstream patches will add support for hardware lag with more than 2 ports. Add a way for users to query the number of lag ports. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 | net/mlx5: Add exit route when waiting for FW | Gavin Li | 1 | -0/+1
Currently, removing a device needs to get the driver interface lock before doing any cleanup. If the driver is waiting in a loop for FW init, there is no way to cancel the wait, instead the device cleanup waits for the loop to conclude and release the lock. To allow immediate response to remove device commands, check the TEARDOWN flag while waiting for FW init, and exit the loop if it has been set. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 | net: phy: export genphy_c45_baset1_read_status() | Oleksij Rempel | 1 | -0/+1
Export genphy_c45_baset1_read_status() to make it reusable by PHY drivers. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 | net: phy: introduce genphy_c45_pma_baset1_read_master_slave() | Oleksij Rempel | 1 | -0/+1
Move baset1 specific part of genphy_c45_read_pma() code to separate function to make it reusable by PHY drivers. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 | net: phy: introduce genphy_c45_pma_baset1_setup_master_slave() | Oleksij Rempel | 1 | -0/+1
Move baset1 specific part of genphy_c45_pma_setup_forced() code to separate function to make it reusable by PHY drivers. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 | rtnetlink: add extack support in fdb del handlers | Alaa Mohamed | 1 | -1/+1
Add extack support to .ndo_fdb_del in netdevice.h and all related methods. Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 | net: skb: introduce skb_data_area_size() | Ricardo Martinez | 1 | -0/+5
Helper to calculate the linear data space in the skb. Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
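As a rough illustration of what "linear data space" can mean, the sketch below measures the distance from the current data pointer to the end of the linear buffer. That interpretation is an assumption, since the one-line message above does not spell it out; `struct toy_buf` and `toy_data_area_size()` are hypothetical names, not the kernel helper itself.

```c
#include <assert.h>

/* Hypothetical buffer descriptor modelled loosely on an skb's pointers. */
struct toy_buf {
	unsigned char *head; /* start of the allocated linear buffer */
	unsigned char *data; /* start of the payload currently in use */
	unsigned char *end;  /* one past the last byte of the linear buffer */
};

/* Bytes available in the linear area, counted from the data pointer. */
static unsigned int toy_data_area_size(const struct toy_buf *b)
{
	assert(b->data >= b->head && b->data <= b->end);
	return (unsigned int)(b->end - b->data);
}
```

With a 128-byte buffer and 16 bytes of headroom reserved in front of `data`, this sketch reports 112 bytes.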
2022-05-09 | list: Add list_next_entry_circular() and list_prev_entry_circular() | Ricardo Martinez | 1 | -0/+26
Add macros to get the next or previous entries and wraparound if needed. For example, calling list_next_entry_circular() on the last element should return the first element in the list. Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
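The wraparound behaviour can be sketched in plain C with a minimal doubly linked list that uses a sentinel head. This is an illustrative userspace version written as ordinary functions for one element type; `struct list_node`, `struct item` and the `item_*` helpers are hypothetical names and not the kernel macros.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list with a sentinel head node. */
struct list_node {
	struct list_node *next, *prev;
};

struct item {
	int id;
	struct list_node link;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Recover the containing item from its embedded link. */
static struct item *item_of(struct list_node *n)
{
	return (struct item *)((char *)n - offsetof(struct item, link));
}

/* Next entry; skips the sentinel so the last entry wraps to the first. */
static struct item *item_next_circular(struct item *pos, struct list_node *head)
{
	struct list_node *n = pos->link.next;

	assert(head->next != head); /* list must not be empty */
	if (n == head)
		n = head->next;
	return item_of(n);
}

/* Previous entry; skips the sentinel so the first entry wraps to the last. */
static struct item *item_prev_circular(struct item *pos, struct list_node *head)
{
	struct list_node *p = pos->link.prev;

	assert(head->next != head); /* list must not be empty */
	if (p == head)
		p = head->prev;
	return item_of(p);
}
```

The key point is that the sentinel head is never returned as an entry: stepping past either end lands on the opposite end of the list.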
2022-05-06 | net: move netif_set_gso_max helpers | Jakub Kicinski | 1 | -21/+0
These are now internal to the core, no need to expose them. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-06 | net: don't allow user space to lift the device limits | Jakub Kicinski | 1 | -0/+9
Up until commit 46e6b992c250 ("rtnetlink: allow GSO maximums to be set on device creation") the gso_max_segs and gso_max_size of a device were not controlled from user space. The quoted commit added the ability to control them because of the following setup: netns A | netns B veth<->veth eth0 If eth0 has TSO limitations and user wants to efficiently forward traffic between eth0 and the veths they should copy the TSO limitations of eth0 onto the veths. This would happen automatically for macvlans or ipvlan but veth users are not so lucky (given the loose coupling). Unfortunately the commit in question allowed users to also override the limits on real HW devices. It may be useful to control the max GSO size and someone may be using that ability (not that I know of any user), so create a separate set of knobs to reliably record the TSO limitations. Validate the user requests. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-06 | net: add netif_inherit_tso_max() | Jakub Kicinski | 1 | -0/+3
To make later patches smaller create a helper for inheriting the TSO limitations of a lower device. The TSO in the name is not an accident, subsequent patches will replace GSO with TSO in more names. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-05 | net: Make msg_zerocopy_alloc static | David Ahern | 1 | -1/+0
msg_zerocopy_alloc is only used by msg_zerocopy_realloc; remove the export and make static in skbuff.c Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/r/20220504170947.18773-1-dsahern@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 3 | -2/+3
tools/testing/selftests/net/forwarding/Makefile f62c5acc800e ("selftests/net/forwarding: add missing tests to Makefile") 50fe062c806e ("selftests: forwarding: new test, verify host mdb entries") https://lore.kernel.org/all/20220502111539.0b7e4621@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05 | Merge tag 'net-5.18-rc6' of ↵ | Linus Torvalds | 1 | -0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:

"Including fixes from can, rxrpc and wireguard.

Previous releases - regressions:
- igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter()
- mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter()
- rds: acquire netns refcount on TCP sockets
- rxrpc: enable IPv6 checksums on transport socket
- nic: hinic: fix bug of wq out of bound access
- nic: thunder: don't use pci_irq_vector() in atomic context
- nic: bnxt_en: fix possible bnxt_open() failure caused by wrong RFS flag
- nic: mlx5e:
  - lag, fix use-after-free in fib event handler
  - fix deadlock in sync reset flow

Previous releases - always broken:
- tcp: fix insufficient TCP source port randomness
- can: grcan: grcan_close(): fix deadlock
- nfc: reorder destructive operations in to avoid bugs

Misc:
- wireguard: improve selftests reliability"

* tag 'net-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
  NFC: netlink: fix sleep in atomic bug when firmware download timeout
  selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer
  tcp: drop the hash_32() part from the index calculation
  tcp: increase source port perturb table to 2^16
  tcp: dynamically allocate the perturb table used by source ports
  tcp: add small random increments to the source port
  tcp: resalt the secret every 10 seconds
  tcp: use different parts of the port_offset for index and offset
  secure_seq: use the 64 bits of the siphash for port offset calculation
  wireguard: selftests: set panic_on_warn=1 from cmdline
  wireguard: selftests: bump package deps
  wireguard: selftests: restore support for ccache
  wireguard: selftests: use newer toolchains to fill out architectures
  wireguard: selftests: limit parallelism to $(nproc) tests at once
  wireguard: selftests: make routing loop test non-fatal
  net/mlx5: Fix matching on inner TTC
  net/mlx5: Avoid double clear or set of sync reset requested
  net/mlx5: Fix deadlock in sync reset flow
  net/mlx5e: Fix trust state reset in reload
  net/mlx5e: Avoid checking offload capability in post_parse action
  ...
2022-05-03 | net/mlx5: Remove not-supported ICV length | Leon Romanovsky | 1 | -2/+0
mlx5 doesn't allow to configure any AEAD ICV length other than 128, so remove the logic that configures other unsupported values. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03 | net/mlx5: Merge various control path IPsec headers into one file | Leon Romanovsky | 1 | -145/+0
The mlx5 IPsec code has logical separation between code that operates with XFRM objects (ipsec.c), HW objects (ipsec_offload.c), flow steering logic (ipsec_fs.c) and data path (ipsec_rxtx.c). Such separation makes sense for C-files, but isn't needed at all for H-files as they are included in batch anyway. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03 | net/mlx5: Remove useless validity check | Leon Romanovsky | 1 | -10/+0
All callers build xfrm attributes with help of mlx5e_ipsec_build_accel_xfrm_attrs() function that ensure validity of attributes. There is no need to recheck them again. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03 | net/mlx5: Store IPsec ESN update work in XFRM state | Leon Romanovsky | 1 | -4/+6
mlx5 IPsec code updated ESN through workqueue with allocation calls in the data path, which can be saved easily if the work is created during XFRM state initialization routine. The locking used later in the work didn't protect from anything because change of HW context is possible during XFRM state add or delete only, which can cancel work and make sure that it is not running. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03 | netdev: reshuffle netif_napi_add() APIs to allow dropping weight | Jakub Kicinski | 1 | -17/+33
Most drivers should not have to worry about selecting the right weight for their NAPI instances and pass NAPI_POLL_WEIGHT. It'd be best if we didn't require the argument at all and selected the default internally. This change prepares the ground for such reshuffling, allowing for a smooth transition. The following API should remain after the next release cycle: netif_napi_add() netif_napi_add_weight() netif_napi_add_tx() netif_napi_add_tx_weight() Where the _weight() variants take an explicit weight argument. I opted for a _weight() suffix rather than a __ prefix, because we use __ in places to mean that caller needs to also issue a synchronize_net() call. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20220502232703.396351-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
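The naming scheme described above (a plain variant that picks the default, plus a `_weight()` variant that takes an explicit value) can be sketched as follows. This is a toy userspace model, not the kernel API: `struct toy_napi` and the `toy_*` functions are hypothetical, and only the value 64 is taken from the kernel's NAPI_POLL_WEIGHT.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POLL_WEIGHT 64 /* same value as the kernel's NAPI_POLL_WEIGHT */

/* Hypothetical, heavily reduced stand-in for a NAPI instance. */
struct toy_napi {
	int (*poll)(struct toy_napi *napi, int budget);
	int weight;
};

/* Explicit-weight variant for the few callers that need a non-default value. */
static void toy_napi_add_weight(struct toy_napi *napi,
				int (*poll)(struct toy_napi *, int),
				int weight)
{
	assert(poll != NULL && weight > 0);
	napi->poll = poll;
	napi->weight = weight;
}

/* Default variant: the caller no longer passes a weight at all. */
static void toy_napi_add(struct toy_napi *napi,
			 int (*poll)(struct toy_napi *, int))
{
	toy_napi_add_weight(napi, poll, TOY_POLL_WEIGHT);
}

/* Trivial poll callback, present only to exercise the sketch. */
static int toy_poll_noop(struct toy_napi *napi, int budget)
{
	(void)napi;
	(void)budget;
	return 0;
}
```

Keeping the default inside the wrapper means most callers cannot pick a wrong weight by accident, while the suffixed variant stays available for the exceptions.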
2022-05-03 | Merge tag 'mlx5-updates-2022-05-02' of ↵ | Paolo Abeni | 2 | -10/+18
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-05-02

1) Trivial misc updates to the mlx5 driver.
2) From Mark Bloch: flow steering, general steering refactoring/cleaning.

An issue with the flow steering deletion flow (when creating a rule without dests) turned out to be easy to fix, but during the fix some issues with the flow steering creation/deletion flows were found. The following patch series tries to fix long-standing issues with the flow steering code and hopefully prevent silly future bugs.

A) Fix an issue where a proper dest type wasn't assigned.
B) Refactor and fix dests enum values, refactor the deletion function and do proper bookkeeping of dests.
C) Change mlx5_del_flow_rules() to delete rules when there are no more rules associated with an FTE.
D) Don't call a hard-coded deletion function but use the node's defined one.
E) Add a WARN_ON() to catch future bugs when an FTE with dests is deleted.

* tag 'mlx5-updates-2022-05-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: fs, an FTE should have no dests when deleted
  net/mlx5: fs, call the deletion function of the node
  net/mlx5: fs, delete the FTE when there are no rules attached to it
  net/mlx5: fs, do proper bookkeeping for forward destinations
  net/mlx5: fs, add unused destination type
  net/mlx5: fs, jump to exit point and don't fall through
  net/mlx5: fs, refactor software deletion rule
  net/mlx5: fs, split software and IFC flow destination definitions
  net/mlx5e: TC, set proper dest type
  net/mlx5e: Remove unused mlx5e_dcbnl_build_rep_netdev function
  net/mlx5e: Drop error CQE handling from the XSK RX handler
  net/mlx5: Print initializing field in case of timeout
  net/mlx5: Delete redundant default assignment of runtime devlink params
  net/mlx5: Remove useless kfree
  net/mlx5: use kvfree() for kvzalloc() in mlx5_ct_fs_smfs_matcher_create
====================

Link: https://lore.kernel.org/r/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 | net: sysctl: introduce sysctl SYSCTL_THREE | Tonghao Zhang | 1 | -4/+5
This patch introduces SYSCTL_THREE.

KUnit:
[00:10:14] ================ sysctl_test (10 subtests) =================
[00:10:14] [PASSED] sysctl_test_api_dointvec_null_tbl_data
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_maxlen_unset
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_len_is_zero
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_read_but_position_set
[00:10:14] [PASSED] sysctl_test_dointvec_read_happy_single_positive
[00:10:14] [PASSED] sysctl_test_dointvec_read_happy_single_negative
[00:10:14] [PASSED] sysctl_test_dointvec_write_happy_single_positive
[00:10:14] [PASSED] sysctl_test_dointvec_write_happy_single_negative
[00:10:14] [PASSED] sysctl_test_api_dointvec_write_single_less_int_min
[00:10:14] [PASSED] sysctl_test_api_dointvec_write_single_greater_int_max
[00:10:14] =================== [PASSED] sysctl_test ===================

./run_kselftest.sh -c sysctl
...
ok 1 selftests: sysctl: sysctl.sh

Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-02 | net/mlx5: fs, add unused destination type | Mark Bloch | 1 | -0/+1
When the caller doesn't pass a destination fs_core will create a unused rule just so a context can be returned. This unused rule is zeroed out and its type is 0 which can be mixed up with MLX5_FLOW_DESTINATION_TYPE_VPORT. Create a dedicated type to differentiate between the two named MLX5_FLOW_DESTINATION_TYPE_NONE. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-02 | net/mlx5: fs, split software and IFC flow destination definitions | Mark Bloch | 2 | -10/+17
Separate flow destinations between software and IFC. Flow destination type passed by callers was used as the input in firmware commands and over the years software only types were added which resulted in mixing between the two. Create an IFC enum that contains only the flow destinations defined when talking to the firmware. Now that there is a proper software only enum for flow destinations the hardcoded values can be removed as the values are no longer used in firmware commands. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-02 | Stefan Schmidt says: | Jakub Kicinski | 1 | -8/+73
==================== pull-request: ieee802154-next 2022-05-01 Miquel Raynal landed two patch series bundled in this pull request. The first series reworks the symbol duration handling to better accommodate the needs of the various phy layers in ieee802154. In the second series Miquel improves the error handling from drivers up to mac802154. This streamlines the error handling throughout the ieee/mac802154 stack in preparation for sync TX to be introduced for MLME frames. ==================== Link: https://lore.kernel.org/r/20220501194614.1198325-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-01 | Merge tag 'x86_urgent_for_v5.18_rc5' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - A fix to disable PCI/MSI[-X] masking for XEN_HVM guests as that is solely controlled by the hypervisor - A build fix to make the function prototype (__warn()) as visible as the definition itself - A bunch of objtool annotation fixes which have accumulated over time - An ORC unwinder fix to handle bad input gracefully - Well, we thought the microcode gets loaded in time in order to restore the microcode-emulated MSRs but we thought wrong. So there's a fix for that to have the ordering done properly - Add new Intel model numbers - A spelling fix * tag 'x86_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pci/xen: Disable PCI/MSI[-X] masking for XEN_HVM guests bug: Have __warn() prototype defined unconditionally x86/Kconfig: fix the spelling of 'becoming' in X86_KERNEL_IBT config objtool: Use offstr() to print address of missing ENDBR objtool: Print data address for "!ENDBR" data warnings x86/xen: Add ANNOTATE_NOENDBR to startup_xen() x86/uaccess: Add ENDBR to __put_user_nocheck*() x86/retpoline: Add ANNOTATE_NOENDBR for retpolines x86/static_call: Add ANNOTATE_NOENDBR to static call trampoline objtool: Enable unreachable warnings for CLANG LTO x86,objtool: Explicitly mark idtentry_body()s tail REACHABLE x86,objtool: Mark cpu_startup_entry() __noreturn x86,xen,objtool: Add UNWIND hint lib/strn*,objtool: Enforce user_access_begin() rules MAINTAINERS: Add x86 unwinding entry x86/unwind/orc: Recheck address range after stack info was updated x86/cpu: Load microcode during restore_processor_state() x86/cpu: Add new Alderlake and Raptorlake CPU model numbers
2022-05-01 | net: phy: Add 10BASE-T1L support in phy-c45 | Alexandru Tachici | 2 | -0/+73
This patch is needed because the BASE-T1 uses different registers for status, control and advertisement to those already employed in the existing phy-c45 functions. Where required, genphy_c45 functions will now check whether the device supports BASE-T1 and use the specific registers instead: 45.2.7.19 BASE-T1 AN control register, 45.2.7.20 BASE-T1 AN status, 45.2.7.21 BASE-T1 AN advertisement register, 45.2.7.22 BASE-T1 AN LP Base Page ability register, 45.2.1.185 BASE-T1 PMA/PMD control register. Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01 | ethtool: Add 10base-T1L link mode entry | Alexandru Tachici | 1 | -1/+1
Add entry for the 10base-T1L full duplex mode. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01 | net: stmmac: disable Split Header (SPH) for Intel platforms | Tan Tee Min | 1 | -0/+1
Based on the DesignWare Ethernet QoS datasheet, the Split Header (SPH) feature is not supported for IPv4 fragmented packets. This SPH limitation causes ping failures when the packet size exceeds the MTU size. For example, the issue happens once the basic ping packet size is larger than the configured MTU size: the data is lost inside the fragmented packet, replaced by zeros/corrupted values, and the ping fails. So, disable the Split Header for Intel platforms. v2: Add fixes tag in commit message. Fixes: 67afd6d1cfdf("net: stmmac: Add Split Header support and enable it in XGMAC cores") Cc: <stable@vger.kernel.org> # 5.10.x Suggested-by: Ong, Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 | Merge tag 'usb-5.18-rc5' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB driver fixes for 5.18-rc5 for some reported issues and new quirks. They include: - dwc3 driver fixes - xhci driver fixes - typec driver fixes - new usb-serial driver ids - added new USB devices to existing quirk tables - other tiny fixes All of these have been in linux-next for a while with no reported issues" * tag 'usb-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (31 commits) usb: phy: generic: Get the vbus supply usb: dwc3: gadget: Return proper request status usb: dwc3: pci: add support for the Intel Meteor Lake-P usb: dwc3: core: Only handle soft-reset in DCTL usb: gadget: configfs: clear deactivation flag in configfs_composite_unbind() usb: misc: eud: Fix an error handling path in eud_probe() usb: core: Don't hold the device lock while sleeping in do_proc_control() usb: dwc3: Try usb-role-switch first in dwc3_drd_init usb: dwc3: core: Fix tx/rx threshold settings usb: mtu3: fix USB 3.0 dual-role-switch from device to host xhci: Enable runtime PM on second Alderlake controller usb: dwc3: fix backwards compat with rockchip devices dt-bindings: usb: samsung,exynos-usb2: add missing required reg usb: misc: fix improper handling of refcount in uss720_probe() USB: Fix ehci infinite suspend-resume loop issue in zhaoxin usb: typec: tcpm: Fix undefined behavior due to shift overflowing the constant usb: typec: rt1719: Fix build error without CONFIG_POWER_SUPPLY usb: typec: ucsi: Fix role swapping usb: typec: ucsi: Fix reuse of completion structure usb: xhci: tegra:Fix PM usage reference leak of tegra_xusb_unpowergate_partitions ...
2022-04-30 | net: lan966x: Fix compilation error | Horatiu Vultur | 1 | -0/+4
Starting from the blamed commit, the lan966x build fails with the following compilation error: drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c:342:9: error: implicit declaration of function ‘ptp_find_pin_unlocked’ [-Werror=implicit-function-declaration] 342 | pin = ptp_find_pin_unlocked(phc->clock, PTP_PF_EXTTS, 0); The issue is that there is no stub function for ptp_find_pin_unlocked in case CONFIG_PTP_1588_CLOCK is not selected. Therefore add one. Reported-by: kernel test robot <lkp@intel.com> Fixes: f3d8e0a9c28ba0 ("net: lan966x: Add support for PTP_PF_EXTTS") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
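The fix follows a common header pattern: when an optional feature is compiled out, the header provides a static inline stub so callers still build. The sketch below shows that pattern in userspace C. `TOY_PTP_ENABLED`, `toy_find_pin()`, `TOY_PF_EXTTS`, `TOY_MAX_CHAN` and the stub's return value of -1 are all assumptions made for illustration, not the kernel's actual definitions.

```c
#include <assert.h>

#define TOY_PF_EXTTS 1 /* hypothetical pin-function id */
#define TOY_MAX_CHAN 8 /* hypothetical channel bound */

#ifdef TOY_PTP_ENABLED
/* Feature built in: the real implementation lives in another file. */
int toy_find_pin(int func, unsigned int chan);
#else
/* Feature compiled out: a stub keeps callers compiling and reports "no pin". */
static inline int toy_find_pin(int func, unsigned int chan)
{
	(void)func;
	(void)chan;
	return -1;
}
#endif

/* Driver-side caller: builds the same way whether or not the feature exists. */
static int toy_enable_extts(unsigned int chan)
{
	int pin;

	assert(chan < TOY_MAX_CHAN);
	pin = toy_find_pin(TOY_PF_EXTTS, chan);
	return pin < 0 ? -1 : 0;
}
```

Without the `#else` branch, a build with the feature disabled would fail with an implicit-declaration error of the kind quoted in the commit message.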
2022-04-30 | net: inline dev_queue_xmit() | Pavel Begunkov | 1 | -2/+12
Inline dev_queue_xmit() and dev_queue_xmit_accel(); both are small proxy functions that do nothing but redirect the control flow to __dev_queue_xmit(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
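A minimal userspace sketch of the proxy pattern being inlined, assuming the two wrappers differ only in the second argument they forward. Every name prefixed with sketch_ is hypothetical, the struct definitions are placeholders, and the worker's body is a toy that stands in for the real out-of-line function.

```c
#include <stddef.h>

/* Placeholder types; the kernel's definitions are far larger. */
struct sk_buff { int len; };
struct net_device { int ifindex; };

/*
 * Stand-in for the out-of-line worker. Toy behaviour: report whether a
 * subordinate device was passed, so the wrappers can be told apart.
 */
static int sketch_queue_xmit_core(struct sk_buff *skb, struct net_device *sb_dev)
{
    (void)skb;
    return sb_dev ? 1 : 0;
}

/* Header-style inline wrapper: forwards with no subordinate device. */
static inline int sketch_dev_queue_xmit(struct sk_buff *skb)
{
    return sketch_queue_xmit_core(skb, NULL);
}

/* Header-style inline wrapper: forwards the caller's subordinate device. */
static inline int sketch_dev_queue_xmit_accel(struct sk_buff *skb,
                                              struct net_device *sb_dev)
{
    return sketch_queue_xmit_core(skb, sb_dev);
}
```

Because the wrappers are defined static inline in a header, the compiler can replace each call with a direct call to the worker and drop one level of call overhead.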
2022-04-30net: inline skb_zerocopy_iter_dgramPavel Begunkov1-14/+22
skb_zerocopy_iter_dgram() is a small proxy function, inline it. For that, move __zerocopy_sg_from_iter into linux/skbuff.h Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29Merge branch 'tcp-pass-back-data-left-in-socket-after-receive' of ↵Jakub Kicinski1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29tcp: pass back data left in socket after receiveJens Axboe1-1/+5
This is currently done for CMSG_INQ; add the ability to do so via struct msghdr as well, and have CMSG_INQ use that too. If the caller sets msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq. Rearrange struct msghdr a bit so we can add this member while shrinking it at the same time. On a 64-bit build, it was 96 bytes before this change and 88 bytes afterwards. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/650c22ca-cffc-0255-9a05-2413a1e20826@kernel.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
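How a struct can gain a member and still shrink is easier to see with a simplified layout. The sketch below is an assumption-laden model, not the kernel definition: struct sketch_iov_iter is an opaque 40-byte placeholder, the field order is a reconstruction chosen to reproduce the 96 and 88 byte figures on a typical LP64 ABI, and sketch_recv() is a hypothetical receive path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque placeholder for the iterator; the 40-byte size is an assumption. */
struct sketch_iov_iter { uint64_t opaque[5]; };

/* Model of the layout before the change: three padding holes on LP64. */
struct sketch_msghdr_before {
    void *msg_name;
    int msg_namelen;                 /* 4-byte hole follows */
    struct sketch_iov_iter msg_iter;
    void *msg_control;
    bool msg_control_is_user : 1;    /* 7-byte hole follows */
    size_t msg_controllen;
    unsigned int msg_flags;          /* 4-byte hole follows */
    void *msg_iocb;
};

/* Model of the layout after the change: new fields sit in the old holes. */
struct sketch_msghdr_after {
    void *msg_name;
    int msg_namelen;
    int msg_inq;                     /* fills the hole after msg_namelen */
    struct sketch_iov_iter msg_iter;
    void *msg_control;
    bool msg_control_is_user : 1;
    bool msg_get_inq : 1;            /* shares a byte with the flag above */
    unsigned int msg_flags;          /* moved up next to the bitfields */
    size_t msg_controllen;
    void *msg_iocb;
};

/* Hypothetical receive path: report leftover bytes only when asked to. */
static inline void sketch_recv(struct sketch_msghdr_after *msg, int bytes_left)
{
    if (msg->msg_get_inq)
        msg->msg_inq = bytes_left;
}
```

On a typical 64-bit build, sizeof reports 96 for the first struct and 88 for the second, matching the figures in the commit message. Exact numbers depend on the ABI's alignment and bitfield rules.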
2022-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski10-285/+66
include/linux/netdevice.h
net/core/dev.c
  6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats")
  794c24e9921f ("net-core: rx_otherhost_dropped to core_stats")
  https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/

drivers/net/wan/cosa.c
  d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()")
  89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards")
  https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge tag 'net-5.18-rc5' of ↵Linus Torvalds1-12/+9
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf and netfilter. Current release - new code bugs: - bridge: switchdev: check br_vlan_group() return value - use this_cpu_inc() to increment net->core_stats, fix preempt-rt Previous releases - regressions: - eth: stmmac: fix write to sgmii_adapter_base Previous releases - always broken: - netfilter: nf_conntrack_tcp: re-init for syn packets only, resolving issues with TCP fastopen - tcp: md5: fix incorrect tcp_header_len for incoming connections - tcp: fix F-RTO may not work correctly when receiving DSACK - tcp: ensure use of most recently sent skb when filling rate samples - tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT - virtio_net: fix wrong buf address calculation when using xdp - xsk: fix forwarding when combining copy mode with busy poll - xsk: fix possible crash when multiple sockets are created - bpf: lwt: fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook - sctp: null-check asoc strreset_chunk in sctp_generate_reconf_event - wireguard: device: check for metadata_dst with skb_valid_dst() - netfilter: update ip6_route_me_harder to consider L3 domain - gre: make o_seqno start from 0 in native mode - gre: switch o_seqno to atomic to prevent races in collect_md mode Misc: - add Eric Dumazet to networking maintainers - dt: dsa: realtek: remove realtek,rtl8367s string - netfilter: flowtable: Remove the empty file" * tag 'net-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) tcp: fix F-RTO may not work correctly when receiving DSACK Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits" net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASK ixgbe: ensure IPsec VF<->PF compatibility MAINTAINERS: Update BNXT entry with firmware files netfilter: nft_socket: only do sk lookups when indev is available net: fec: add missing of_node_put() in 
fec_enet_init_stop_mode() bnx2x: fix napi API usage sequence tls: Skip tls_append_frag on zero copy size Add Eric Dumazet to networking maintainers netfilter: conntrack: fix udp offload timeout sysctl netfilter: nf_conntrack_tcp: re-init for syn packets only net: dsa: lantiq_gswip: Don't set GSWIP_MII_CFG_RMII_CLK net: Use this_cpu_inc() to increment net->core_stats Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted Bluetooth: hci_event: Fix creating hci_conn object on error status Bluetooth: hci_event: Fix checking for invalid handle on error status ice: fix use-after-free when deinitializing mailbox snapshot ice: wait 5 s for EMP reset after firmware flash ice: Protect vf_state check by cfg_lock in ice_vc_process_vf_msg() ...
2022-04-27Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski6-149/+122
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-04-27 We've added 85 non-merge commits during the last 18 day(s) which contain a total of 163 files changed, 4499 insertions(+), 1521 deletions(-). The main changes are: 1) Teach libbpf to enhance BPF verifier log with human-readable and relevant information about failed CO-RE relocations, from Andrii Nakryiko. 2) Add typed pointer support in BPF maps and enable it for unreferenced pointers (via probe read) and referenced ones that can be passed to in-kernel helpers, from Kumar Kartikeya Dwivedi. 3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel. 4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced the effective prog array before the rcu_read_lock, from Stanislav Fomichev. 5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic for USDT arguments under riscv{32,64}, from Pu Lehui. 6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire. 7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage so it can be shipped under Alpine which is musl-based, from Dominique Martinet. 8) Clean up {sk,task,inode} local storage trace RCU handling as they do not need to use call_rcu_tasks_trace() barrier, from KP Singh. 9) Improve libbpf API documentation and fix error return handling of various API functions, from Grant Seltzer. 10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data length of frags + frag_list may surpass old offset limit, from Liu Jian. 11) Various improvements to prog_tests in area of logging, test execution and by-name subtest selection, from Mykola Lysenko. 12) Simplify map_btf_id generation for all map types by moving this process to build time with help of resolve_btfids infra, from Menglong Dong. 
13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*() helpers; the probing always caused the old helpers to be used, from Runqing Yang. 14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS tracing macros, from Vladimir Isaev. 15) Clean up BPF selftests to remove old & unneeded rlimit code given kernel switched to memcg-based memory accounting a while ago, from Yafang Shao. 16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu. 17) Fix BPF selftests on two occasions to work around regressions caused by latest LLVM to unblock CI until their fixes are worked out, from Yonghong Song. 18) Misc cleanups all over the place, from various others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Add libbpf's log fixup logic selftests libbpf: Fix up verifier log for unguarded failed CO-RE relos libbpf: Simplify bpf_core_parse_spec() signature libbpf: Refactor CO-RE relo human description formatting routine libbpf: Record subprog-resolved CO-RE relocations unconditionally selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests libbpf: Avoid joining .BTF.ext data with BPF programs by section name libbpf: Fix logic for finding matching program for CO-RE relocation libbpf: Drop unhelpful "program too large" guess libbpf: Fix anonymous type check in CO-RE logic bpf: Compute map_btf_id during build time selftests/bpf: Add test for strict BTF type check selftests/bpf: Add verifier tests for kptr selftests/bpf: Add C tests for kptr libbpf: Add kptr type tag macros to bpf_helpers.h bpf: Make BTF type match stricter for release arguments bpf: Teach verifier about kptr_get kfunc helpers bpf: Wire up freeing of referenced kptr bpf: Populate pairs of btf_id and destructor kfunc in btf bpf: Adapt copy_map_value for multiple offset case ... 
==================== Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27hex2bin: make the function hex_to_bin constant-timeMikulas Patocka1-1/+1
The function hex2bin is used to load cryptographic keys into device mapper targets dm-crypt and dm-integrity. It should take constant time independent of the processed data, so that concurrently running unprivileged code can't infer any information about the keys via microarchitectural covert channels. This patch changes the function hex_to_bin so that it contains no branches and no memory accesses. Note that this shouldn't cause performance degradation because the size of the new function is the same as the size of the old function (on x86-64) - and the new function causes no branch misprediction penalties. I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64 i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32 sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64 powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no branches in the generated code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
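One way to decode a hex digit with no branches and no table lookups is sketched below. The name sketch_hex_to_bin() is hypothetical, and the expression is a reconstruction of the technique the message describes; compare it against lib/hexdump.c before relying on it. As in the commit, whether the compiler really emits branch-free code has to be confirmed by inspecting the generated assembly.

```c
/*
 * Decode one hex digit without branching on the input.
 * Returns 0..15 for '0'-'9', 'a'-'f', 'A'-'F', and -1 for anything else.
 *
 * Range test trick: (ch - hi - 1) and (lo - 1 - ch) are both negative only
 * when lo <= ch <= hi. Their bitwise AND is negative exactly in that case,
 * so after casting to unsigned and shifting right by 8 the result is a
 * nonzero mask covering the low bits for in-range input and 0 otherwise
 * (the differences are always smaller than 256 in magnitude).
 */
static inline int sketch_hex_to_bin(unsigned char ch)
{
    unsigned char cu = ch & 0xdf;   /* clear bit 5: folds 'a'..'f' onto 'A'..'F' */

    return -1 +
        ((ch - '0' + 1) & ((unsigned)((ch - '9' - 1) & ('0' - 1 - ch)) >> 8)) +
        ((cu - 'A' + 11) & ((unsigned)((cu - 'F' - 1) & ('A' - 1 - cu)) >> 8));
}
```

Each term contributes value + 1 when its range matches and 0 otherwise, so the leading -1 yields the digit value for valid input and -1 for invalid input. A lookup table would give the same results but makes the memory access pattern depend on the key material, which is what the commit sets out to avoid.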