aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-12-12selftests: forwarding: Add bridge MDB testIdo Schimmel2-0/+1165
Add a selftests that includes the following test cases: 1. Configuration tests. Both valid and invalid configurations are tested across all entry types (e.g., L2, IPv4). 2. Forwarding tests. Both host and port group entries are tested across all entry types. 3. Interaction between user installed MDB entries and IGMP / MLD control packets. Example output: INFO: # Host entries configuration tests TEST: Common host entries configuration tests (IPv4) [ OK ] TEST: Common host entries configuration tests (IPv6) [ OK ] TEST: Common host entries configuration tests (L2) [ OK ] INFO: # Port group entries configuration tests - (*, G) TEST: Common port group entries configuration tests (IPv4 (*, G)) [ OK ] TEST: Common port group entries configuration tests (IPv6 (*, G)) [ OK ] TEST: IPv4 (*, G) port group entries configuration tests [ OK ] TEST: IPv6 (*, G) port group entries configuration tests [ OK ] INFO: # Port group entries configuration tests - (S, G) TEST: Common port group entries configuration tests (IPv4 (S, G)) [ OK ] TEST: Common port group entries configuration tests (IPv6 (S, G)) [ OK ] TEST: IPv4 (S, G) port group entries configuration tests [ OK ] TEST: IPv6 (S, G) port group entries configuration tests [ OK ] INFO: # Port group entries configuration tests - L2 TEST: Common port group entries configuration tests (L2 (*, G)) [ OK ] TEST: L2 (*, G) port group entries configuration tests [ OK ] INFO: # Forwarding tests TEST: IPv4 host entries forwarding tests [ OK ] TEST: IPv6 host entries forwarding tests [ OK ] TEST: L2 host entries forwarding tests [ OK ] TEST: IPv4 port group "exclude" entries forwarding tests [ OK ] TEST: IPv6 port group "exclude" entries forwarding tests [ OK ] TEST: IPv4 port group "include" entries forwarding tests [ OK ] TEST: IPv6 port group "include" entries forwarding tests [ OK ] TEST: L2 port entries forwarding tests [ OK ] INFO: # Control packets tests TEST: IGMPv3 MODE_IS_INCLUE tests [ OK ] TEST: MLDv2 MODE_IS_INCLUDE tests [ OK ] Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12selftests: forwarding: Rename bridge_mdb testIdo Schimmel2-1/+1
The test is only concerned with host MDB entries and not with MDB entries as a whole. Rename the test to reflect that. Subsequent patches will add a more general test that will contain the test cases for host MDB entries and remove the current test. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Support replacement of MDB port group entriesIdo Schimmel2-5/+98
Now that user space can specify additional attributes of port group entries such as filter mode and source list, it makes sense to allow user space to atomically modify these attributes by replacing entries instead of forcing user space to delete the entries and add them back. Replace MDB port group entries when the 'NLM_F_REPLACE' flag is specified in the netlink message header. When a (*, G) entry is replaced, update the following attributes: Source list, state, filter mode, protocol and flags. If the entry is temporary and in EXCLUDE mode, reset the group timer to the group membership interval. If the entry is temporary and in INCLUDE mode, reset the source timers of associated sources to the group membership interval. Examples: # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 permanent source_list 192.0.2.1,192.0.2.2 filter_mode include # bridge -d -s mdb show dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.2 permanent filter_mode include proto static 0.00 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto static 0.00 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode include source_list 192.0.2.2/0.00,192.0.2.1/0.00 proto static 0.00 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 permanent source_list 192.0.2.1,192.0.2.3 filter_mode exclude proto zebra # bridge -d -s mdb show dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.3 permanent filter_mode include proto zebra blocked 0.00 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto zebra blocked 0.00 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude source_list 192.0.2.3/0.00,192.0.2.1/0.00 proto zebra 0.00 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 temp source_list 192.0.2.4,192.0.2.3 filter_mode include proto bgp # bridge -d -s mdb show dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.4 temp filter_mode include proto bgp 0.00 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.3 temp filter_mode include proto bgp 0.00 dev br0 port dummy10 grp 239.1.1.1 temp filter_mode include source_list 192.0.2.4/259.44,192.0.2.3/259.44 proto bgp 0.00 Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Allow user space to specify MDB entry routing protocolIdo Schimmel3-2/+15
Add the 'MDBE_ATTR_RTPORT' attribute to allow user space to specify the routing protocol of the MDB port group entry. Enforce a minimum value of 'RTPROT_STATIC' to prevent user space from using protocol values that should only be set by the kernel (e.g., 'RTPROT_KERNEL'). Maintain backward compatibility by defaulting to 'RTPROT_STATIC'. The protocol is already visible to user space in RTM_NEWMDB responses and notifications via the 'MDBA_MDB_EATTR_RTPROT' attribute. The routing protocol allows a routing daemon to distinguish between entries configured by it and those configured by the administrator. Once MDB flush is supported, the protocol can be used as a criterion according to which the flush is performed. Examples: # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto kernel Error: integer out of range. # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto static # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent proto zebra # bridge mdb add dev br0 port dummy10 grp 239.1.1.2 permanent source_list 198.51.100.1,198.51.100.2 filter_mode include proto 250 # bridge -d mdb show dev br0 port dummy10 grp 239.1.1.2 src 198.51.100.2 permanent filter_mode include proto 250 dev br0 port dummy10 grp 239.1.1.2 src 198.51.100.1 permanent filter_mode include proto 250 dev br0 port dummy10 grp 239.1.1.2 permanent filter_mode include source_list 198.51.100.2/0.00,198.51.100.1/0.00 proto 250 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto zebra dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude proto static Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Allow user space to add (*, G) with a source list and filter modeIdo Schimmel2-0/+150
Add new netlink attributes to the RTM_NEWMDB request that allow user space to add (*, G) with a source list and filter mode. The RTM_NEWMDB message can already dump such entries (created by the kernel) so there is no need to add dump support. However, the message contains a different set of attributes depending if it is a request or a response. The naming and structure of the new attributes try to follow the existing ones used in the response. Request: [ struct nlmsghdr ] [ struct br_port_msg ] [ MDBA_SET_ENTRY ] struct br_mdb_entry [ MDBA_SET_ENTRY_ATTRS ] [ MDBE_ATTR_SOURCE ] struct in_addr / struct in6_addr [ MDBE_ATTR_SRC_LIST ] // new [ MDBE_SRC_LIST_ENTRY ] [ MDBE_SRCATTR_ADDRESS ] struct in_addr / struct in6_addr [ ...] [ MDBE_ATTR_GROUP_MODE ] // new u8 Response: [ struct nlmsghdr ] [ struct br_port_msg ] [ MDBA_MDB ] [ MDBA_MDB_ENTRY ] [ MDBA_MDB_ENTRY_INFO ] struct br_mdb_entry [ MDBA_MDB_EATTR_TIMER ] u32 [ MDBA_MDB_EATTR_SOURCE ] struct in_addr / struct in6_addr [ MDBA_MDB_EATTR_RTPROT ] u8 [ MDBA_MDB_EATTR_SRC_LIST ] [ MDBA_MDB_SRCLIST_ENTRY ] [ MDBA_MDB_SRCATTR_ADDRESS ] struct in_addr / struct in6_addr [ MDBA_MDB_SRCATTR_TIMER ] u8 [...] [ MDBA_MDB_EATTR_GROUP_MODE ] u8 Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Add support for (*, G) with a source list and filter modeIdo Schimmel2-3/+132
In preparation for allowing user space to add (*, G) entries with a source list and associated filter mode, add the necessary plumbing to handle such requests. Extend the MDB configuration structure with a currently empty source array and filter mode that is currently hard coded to EXCLUDE. Add the source entries and the corresponding (S, G) entries before making the new (*, G) port group entry visible to the data path. Handle the creation of each source entry in a similar fashion to how it is created from the data path in response to received Membership Reports: Create the source entry, arm the source timer (if needed), add a corresponding (S, G) forwarding entry and finally mark the source entry as installed (by user space). Add the (S, G) entry by populating an MDB configuration structure and calling br_mdb_add_group_sg() as if a new entry is created by user space, with the sole difference that the 'src_entry' field is set to make sure that the group timer of such entries is never armed. Note that it is not currently possible to add more than 32 source entries to a port group entry. If this proves to be a problem we can either increase 'PG_SRC_ENT_LIMIT' or avoid forcing a limit on entries created by user space. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Avoid arming group timer when (S, G) corresponds to a sourceIdo Schimmel2-1/+2
User space will soon be able to install a (*, G) with a source list, prompting the creation of a (S, G) entry for each source. In this case, the group timer of the (S, G) entry should never be set. Solve this by adding a new field to the MDB configuration structure that denotes whether the (S, G) corresponds to a source or not. The field will be set in a subsequent patch where br_mdb_add_group_sg() is called in order to create a (S, G) entry for each user provided source. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Add a flag for user installed source entriesIdo Schimmel2-1/+3
There are a few places where the bridge driver differentiates between (S, G) entries installed by the kernel (in response to Membership Reports) and those installed by user space. One of them is when deleting an (S, G) entry corresponding to a source entry that is being deleted. While user space cannot currently add a source entry to a (*, G), it can add an (S, G) entry that later corresponds to a source entry created by the reception of a Membership Report. If this source entry is later deleted because its source timer expired or because the (*, G) entry is being deleted, the bridge driver will not delete the corresponding (S, G) entry if it was added by user space as permanent. This is going to be a problem when the ability to install a (*, G) with a source list is exposed to user space. In this case, when user space installs the (*, G) as permanent, then all the (S, G) entries corresponding to its source list will also be installed as permanent. When user space deletes the (*, G), all the source entries will be deleted and the expectation is that the corresponding (S, G) entries will be deleted as well. Solve this by introducing a new source entry flag denoting that the entry was installed by user space. When the entry is deleted, delete the corresponding (S, G) entry even if it was installed by user space as permanent, as the flag tells us that it was installed in response to the source entry being created. The flag will be set in a subsequent patch where source entries are created in response to user requests. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Expose __br_multicast_del_group_src()Ido Schimmel2-3/+9
Expose __br_multicast_del_group_src() which is symmetric to br_multicast_new_group_src() and does not remove the installed {S, G} forwarding entry, unlike br_multicast_del_group_src(). The function will be used in the error path when user space was able to add a new source entry, but failed to install a corresponding forwarding entry. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Expose br_multicast_new_group_src()Ido Schimmel2-1/+4
Currently, new group source entries are only created in response to received Membership Reports. Subsequent patches are going to allow user space to install (*, G) entries with a source list. As a preparatory step, expose br_multicast_new_group_src() so that it could later be invoked from the MDB code (i.e., br_mdb.c) that handles RTM_NEWMDB messages. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Add a centralized error pathIdo Schimmel1-4/+6
Subsequent patches will add memory allocations in br_mdb_config_init() as the MDB configuration structure will include a linked list of source entries. This memory will need to be freed regardless if br_mdb_add() succeeded or failed. As a preparation for this change, add a centralized error path where the memory will be freed. Note that br_mdb_del() already has one error path and therefore does not require any changes. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Place netlink policy before validation functionsIdo Schimmel1-6/+6
Subsequent patches are going to add additional validation functions and netlink policies. Some of these functions will need to perform parsing using nla_parse_nested() and the new policies. In order to keep all the policies next to each other, move the current policy to before the validation functions. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Split (*, G) and (S, G) addition into different functionsIdo Schimmel1-49/+96
When the bridge is using IGMP version 3 or MLD version 2, it handles the addition of (*, G) and (S, G) entries differently. When a new (S, G) port group entry is added, all the (*, G) EXCLUDE ports need to be added to the port group of the new entry. Similarly, when a new (*, G) EXCLUDE port group entry is added, the port needs to be added to the port group of all the matching (S, G) entries. Subsequent patches will create more differences between both entry types. Namely, filter mode and source list can only be specified for (*, G) entries. Given the current and future differences between both entry types, handle the addition of each entry type in a different function, thereby avoiding the creation of one complex function. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12bridge: mcast: Do not derive entry type from its filter modeIdo Schimmel1-6/+3
Currently, the filter mode (i.e., INCLUDE / EXCLUDE) of MDB entries cannot be set from user space. Instead, it is set by the kernel according to the entry type: (*, G) entries are treated as EXCLUDE and (S, G) entries are treated as INCLUDE. This allows the kernel to derive the entry type from its filter mode. Subsequent patches will allow user space to set the filter mode of (*, G) entries, making the current assumption incorrect. As a preparation, remove the current assumption and instead determine the entry type from its key, which is a more direct way. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12qlcnic: Clean up some inconsistent indentingJiapeng Chong1-1/+1
No functional modification involved. drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c:714 qlcnic_validate_ring_count() warn: inconsistent indenting. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3419 Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: dsa: tag_8021q: avoid leaking ctx on dsa_tag_8021q_register() error pathVladimir Oltean1-1/+10
If dsa_tag_8021q_setup() fails, for example due to the inability of the device to install a VLAN, the tag_8021q context of the switch will leak. Make sure it is freed on the error path. Fixes: 328621f6131f ("net: dsa: tag_8021q: absorb dsa_8021q_setup into dsa_tag_8021q_{,un}register") Signed-off-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12i40e: allow toggling loopback mode via ndo_set_features callbackTirthendu Sarkar4-4/+61
Add support for NETIF_F_LOOPBACK. This feature can be set via: $ ethtool -K eth0 loopback <on|off> This sets the MAC Tx->Rx loopback. This feature is used for the xsk selftests, and might have other uses too. Signed-off-by: Tirthendu Sarkar <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Tested-by: Magnus Karlsson <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12i40e: Fix the inability to attach XDP program on downed interfaceBartosz Staszewski1-12/+24
Whenever trying to load XDP prog on downed interface, function i40e_xdp was passing vsi->rx_buf_len field to i40e_xdp_setup() which was equal 0. i40e_open() calls i40e_vsi_configure_rx() which configures that field, but that only happens when interface is up. When it is down, i40e_open() is not being called, thus vsi->rx_buf_len is not set. Solution for this is calculate buffer length in newly created function - i40e_calculate_vsi_rx_buf_len() that return actual buffer length. Buffer length is being calculated based on the same rules applied previously in i40e_vsi_configure_rx() function. Fixes: 613142b0bb88 ("i40e: Log error for oversized MTU on device") Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Signed-off-by: Bartosz Staszewski <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Shwetha Nagaraju <[email protected]> Reviewed-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12Merge tag 'perf-core-2022-12-12' of ↵Linus Torvalds21-1220/+1755
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Thoroughly rewrite the data structures that implement perf task context handling, with the goal of fixing various quirks and unfeatures both in already merged, and in upcoming proposed code. The old data structure is the per task and per cpu perf_event_contexts: task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context ^ | ^ | ^ `---------------------------------' | `--> pmu ---' v ^ perf_event ------' In this new design this is replaced with a single task context and a single CPU context, plus intermediate data-structures: task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context ^ | ^ ^ `---------------------------' | | | | perf_cpu_pmu_context <--. | `----. ^ | | | | | | v v | | ,--> perf_event_pmu_context | | | | | | | v v | perf_event ---> pmu ----------------' [ See commit bd2756811766 for more details. ] This rewrite was developed by Peter Zijlstra and Ravi Bangoria. - Optimize perf_tp_event() - Update the Intel uncore PMU driver, extending it with UPI topology discovery on various hardware models. - Misc fixes & cleanups * tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box() perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map() perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox() perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology() perf/x86/intel/uncore: Make set_mapping() procedure void perf/x86/intel/uncore: Update sysfs-devices-mapping file perf/x86/intel/uncore: Enable UPI topology discovery for Sapphire Rapids perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server perf/x86/intel/uncore: Get UPI NodeID and GroupID perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D perf/x86/intel/uncore: Clear attr_update properly perf/x86/intel/uncore: Introduce UPI topology type perf/x86/intel/uncore: Generalize IIO topology support perf/core: Don't allow grouping events from different hw pmus perf/amd/ibs: Make IBS a core pmu perf: Fix function pointer case perf/x86/amd: Remove the repeated declaration perf: Fix possible memleak in pmu_dev_alloc() ...
2022-12-12Merge branch 'net-add-iff_no_addrconf-to-prevent-ipv6-addrconf'Jakub Kicinski5-11/+22
Xin Long says: ==================== net: add IFF_NO_ADDRCONF to prevent ipv6 addrconf This patchset adds IFF_NO_ADDRCONF flag for dev->priv_flags to prevent ipv6 addrconf, as Jiri Pirko's suggestion. For Bonding it changes to use this flag instead of IFF_SLAVE flag in Patch 1, and for Teaming and Net Failover it sets this flag before calling dev_open() in Patch 2 and 3. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: failover: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconfXin Long1-3/+3
Similar to Bonding and Team, to prevent ipv6 addrconf with IFF_NO_ADDRCONF in slave_dev->priv_flags for slave ports is also needed in net failover. Note that dev_open(slave_dev) is called in .slave_register, which is called after the IFF_NO_ADDRCONF flag is set in failover_slave_register(). Signed-off-by: Xin Long <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconfXin Long1-0/+2
This patch is to use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf for Team port. This flag will be set in team_port_enter(), which is called before dev_open(), and cleared in team_port_leave(), called after dev_close() and the err path in team_port_add(). Signed-off-by: Xin Long <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconfXin Long3-8/+17
Currently, in bonding it reused the IFF_SLAVE flag and checked it in ipv6 addrconf to prevent ipv6 addrconf. However, it is not a proper flag to use for no ipv6 addrconf, for bonding it has to move IFF_SLAVE flag setting ahead of dev_open() in bond_enslave(). Also, IFF_MASTER/SLAVE are historical flags used in bonding and eql, as Jiri mentioned, the new devices like Team, Failover do not use this flag. So as Jiri suggested, this patch adds IFF_NO_ADDRCONF in priv_flags of the device to indicate no ipv6 addconf, and uses it in bonding and moves IFF_SLAVE flag setting back to its original place. Signed-off-by: Xin Long <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12Merge tag 'locking-core-2022-12-12' of ↵Linus Torvalds2-15/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Two changes in this cycle: - a micro-optimization in static_key_slow_inc_cpuslocked() - fix futex death-notification wakeup bug" * tag 'locking-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Resend potentially swallowed owner death notification jump_label: Use atomic_try_cmpxchg() in static_key_slow_inc_cpuslocked()
2022-12-12stmmac: fix potential division by 0Piergiorgio Beruto2-2/+3
When the MAC is connected to a 10 Mb/s PHY and the PTP clock is derived from the MAC reference clock (default), the clk_ptp_rate becomes too small and the calculated sub second increment becomes 0 when computed by the stmmac_config_sub_second_increment() function within stmmac_init_tstamp_counter(). Therefore, the subsequent div_u64 in stmmac_init_tstamp_counter() operation triggers a divide by 0 exception as shown below. [ 95.062067] socfpga-dwmac ff700000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 95.076440] socfpga-dwmac ff700000.ethernet eth0: PHY [stmmac-0:08] driver [NCN26000] (irq=49) [ 95.095964] dwmac1000: Master AXI performs any burst length [ 95.101588] socfpga-dwmac ff700000.ethernet eth0: No Safety Features support found [ 95.109428] Division by zero in kernel. [ 95.113447] CPU: 0 PID: 239 Comm: ifconfig Not tainted 6.1.0-rc7-centurion3-1.0.3.0-01574-gb624218205b7-dirty #77 [ 95.123686] Hardware name: Altera SOCFPGA [ 95.127695] unwind_backtrace from show_stack+0x10/0x14 [ 95.132938] show_stack from dump_stack_lvl+0x40/0x4c [ 95.137992] dump_stack_lvl from Ldiv0+0x8/0x10 [ 95.142527] Ldiv0 from __aeabi_uidivmod+0x8/0x18 [ 95.147232] __aeabi_uidivmod from div_u64_rem+0x1c/0x40 [ 95.152552] div_u64_rem from stmmac_init_tstamp_counter+0xd0/0x164 [ 95.158826] stmmac_init_tstamp_counter from stmmac_hw_setup+0x430/0xf00 [ 95.165533] stmmac_hw_setup from __stmmac_open+0x214/0x2d4 [ 95.171117] __stmmac_open from stmmac_open+0x30/0x44 [ 95.176182] stmmac_open from __dev_open+0x11c/0x134 [ 95.181172] __dev_open from __dev_change_flags+0x168/0x17c [ 95.186750] __dev_change_flags from dev_change_flags+0x14/0x50 [ 95.192662] dev_change_flags from devinet_ioctl+0x2b4/0x604 [ 95.198321] devinet_ioctl from inet_ioctl+0x1ec/0x214 [ 95.203462] inet_ioctl from sock_ioctl+0x14c/0x3c4 [ 95.208354] sock_ioctl from vfs_ioctl+0x20/0x38 [ 95.212984] vfs_ioctl from sys_ioctl+0x250/0x844 [ 95.217691] sys_ioctl from ret_fast_syscall+0x0/0x4c [ 95.222743] Exception stack(0xd0ee1fa8 to 0xd0ee1ff0) [ 95.227790] 1fa0: 00574c4f be9aeca4 00000003 00008914 be9aeca4 be9aec50 [ 95.235945] 1fc0: 00574c4f be9aeca4 0059f078 00000036 be9aee8c be9aef7a 00000015 00000000 [ 95.244096] 1fe0: 005a01f0 be9aec38 004d7484 b6e67d74 Signed-off-by: Piergiorgio Beruto <[email protected]> Fixes: 91a2559c1dc5 ("net: stmmac: Fix sub-second increment") Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/de4c64ccac9084952c56a06a8171d738604c4770.1670678513.git.piergiorgio.beruto@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12octeontx2-af: cn10k: mcs: Fix a resource leak in the probe and remove functionsChristophe JAILLET1-1/+5
In mcs_register_interrupts(), a call to request_irq() is not balanced by a corresponding free_irq(), neither in the error handling path, nor in the remove function. Add the missing calls. Fixes: 6c635f78c474 ("octeontx2-af: cn10k: mcs: Handle MCS block interrupts") Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/r/69f153db5152a141069f990206e7389f961d41ec.1670693669.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12lib: packing: replace bit_reverse() with bitrev8()Uladzislau Koshchanka2-14/+3
Remove bit_reverse() function. Instead use bitrev8() from linux/bitrev.h + bitshift. Reduces code-repetition. Signed-off-by: Uladzislau Koshchanka <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12dt-bindings: net: dsa: hellcreek: Sync DSA maintainersKurt Kanzenbach1-1/+1
The current DSA maintainers are Florian Fainelli, Andrew Lunn and Vladimir Oltean. Update the hellcreek binding accordingly. Signed-off-by: Kurt Kanzenbach <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Acked-by: Rob Herring <[email protected]> Acked-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: tso: inline tso_count_descs()Yunsheng Lin2-9/+7
tso_count_descs() is a small function doing simple calculation, and tso_count_descs() is used in fast path, so inline it to reduce the overhead of calls. Signed-off-by: Yunsheng Lin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: dsa: don't call ptp_classify_raw() if switch doesn't provide RX ↵Vladimir Oltean1-4/+4
timestamping ptp_classify_raw() is not exactly cheap, since it invokes a BPF program for every skb in the receive path. For switches which do not provide ds->ops->port_rxtstamp(), running ptp_classify_raw() provides precisely nothing, so check for the presence of the function pointer first, since that is much cheaper. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12Merge branch 'trace-points-for-mv88e6xxx'Jakub Kicinski5-24/+174
Vladimir Oltean says: ==================== Trace points for mv88e6xxx While testing Hans Schultz' attempt at offloading MAB on mv88e6xxx: https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/ I noticed that he still didn't get rid of the huge log spam caused by ATU and VTU violations, even if we discussed about this: https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/#25091076 It seems unlikely he's going to ever do this, so here is my own stab at converting those messages to trace points. This is IMO an improvement regardless of whether Hans' work with MAB lands or not, especially the VTU violations which were quite annoying to me as well. A small sample of before: $ ./bridge_locked_port.sh lan1 lan2 lan3 lan4 [ 114.465272] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.550508] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 34 callbacks suppressed [ 120.369586] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 120.473658] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 125.535209] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 21 callbacks suppressed [ 125.535243] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 126.174558] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.234055] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 fid 3 portvec 4 spid 2 [ 130.338193] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 fid 3 portvec 4 spid 2 [ 134.626099] mv88e6xxx_g1_atu_prob_irq_thread_fn: 38 callbacks suppressed [ 134.626132] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 fid 3 portvec 4 spid 2 and after: $ trace-cmd record -e mv88e6xxx ./bridge_locked_port.sh lan1 lan2 lan3 lan4 $ trace-cmd report irq/35-moxtet-60 [001] 93.929734: mv88e6xxx_vtu_miss_violation: dev d0032004.mdio-mii:10 spid 9 vid 100 irq/35-moxtet-60 [001] 94.183209: mv88e6xxx_vtu_miss_violation: dev d0032004.mdio-mii:10 spid 9 vid 100 irq/35-moxtet-60 [001] 101.865545: mv88e6xxx_vtu_miss_violation: dev d0032004.mdio-mii:10 spid 9 vid 100 irq/35-moxtet-60 [001] 121.831261: mv88e6xxx_vtu_member_violation: dev d0032004.mdio-mii:10 spid 9 vid 100 irq/35-moxtet-60 [001] 122.371238: mv88e6xxx_vtu_member_violation: dev d0032004.mdio-mii:10 spid 9 vid 100 irq/35-moxtet-60 [001] 148.452932: mv88e6xxx_atu_miss_violation: dev d0032004.mdio-mii:10 spid 2 portvec 0x4 addr 00:01:02:03:04:01 fid 0 v1 at: https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/ ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: dsa: mv88e6xxx: replace VTU violation prints with trace pointsVladimir Oltean2-4/+33
It is possible to trigger these VTU violation messages very easily, it's only necessary to send packets with an unknown VLAN ID to a port that belongs to a VLAN-aware bridge. Do a similar thing as for ATU violation messages, and hide them in the kernel's trace buffer. New usage model: $ trace-cmd list | grep mv88e6xxx mv88e6xxx mv88e6xxx:mv88e6xxx_vtu_miss_violation mv88e6xxx:mv88e6xxx_vtu_member_violation $ trace-cmd report Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: dsa: mv88e6xxx: replace ATU violation prints with trace pointsVladimir Oltean4-9/+86
In applications where the switch ports must perform 802.1X based authentication and are therefore locked, ATU violation interrupts are quite to be expected as part of normal operation. The problem is that they currently spam the kernel log, even if rate limited. Create a series of trace points, all derived from the same event class, which log these violations to the kernel's trace buffer, which is both much faster and much easier to ignore than printing to a serial console. New usage model: $ trace-cmd list | grep mv88e6xxx mv88e6xxx mv88e6xxx:mv88e6xxx_atu_full_violation mv88e6xxx:mv88e6xxx_atu_miss_violation mv88e6xxx:mv88e6xxx_atu_member_violation $ trace-cmd record -e mv88e6xxx sleep 10 Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: dsa: mv88e6xxx: read FID when handling ATU violationsHans J. Schultz1-11/+61
When an ATU violation occurs, the switch uses the ATU FID register to report the FID of the MAC address that incurred the violation. It would be good for the driver to know the FID value for purposes such as logging and CPU-based authentication. Up until now, the driver has been calling the mv88e6xxx_g1_atu_op() function to read ATU violations, but that doesn't do exactly what we want, namely it calls mv88e6xxx_g1_atu_fid_write() with FID 0. (side note, the documentation for the ATU Get/Clear Violation command says that writes to the ATU FID register have no effect before the operation starts, it's only that we disregard the value that this register provides once the operation completes) So mv88e6xxx_g1_atu_fid_write() is not what we want, but rather mv88e6xxx_g1_atu_fid_read(). However, the latter doesn't exist, we need to write it. The remainder of mv88e6xxx_g1_atu_op() except for mv88e6xxx_g1_atu_fid_write() is still needed, namely to send a GET_CLR_VIOLATION command to the ATU. In principle we could have still kept calling mv88e6xxx_g1_atu_op(), but the MDIO writes to the ATU FID register are pointless, but in the interest of doing less CPU work per interrupt, write a new function called mv88e6xxx_g1_read_atu_violation() and call it. The FID will be the port default FID as set by mv88e6xxx_port_set_fid() if the VID from the packet cannot be found in the VTU. Otherwise it is the FID derived from the VTU entry associated with that VID. Signed-off-by: Hans J. Schultz <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12net: dsa: mv88e6xxx: remove ATU age out violation printVladimir Oltean1-6/+0
Currently, the MV88E6XXX_PORT_ASSOC_VECTOR_INT_AGE_OUT bit (interrupt on age out) is not enabled by the driver, and as a result, the print for age out violations is dead code. Remove it until there is some way for this to be triggered. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12Merge tag 'x86_alternatives_for_v6.2' of ↵Linus Torvalds1-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 alternative update from Borislav Petkov: "A single alternatives patching fix for modules: - Have alternatives patch the same sections in modules as in vmlinux" * tag 'x86_alternatives_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternative: Consistently patch SMP locks in vmlinux and modules
2022-12-12Merge tag 'ras_core_for_v6.2' of ↵Linus Torvalds3-17/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Borislav Petkov: - Fix confusing output from /sys/kernel/debug/ras/daemon_active - Add another MCE severity error case to the Intel error severity table to promote UC and AR errors to panic severity and remove the corresponding code condition doing that. - Make sure the thresholding and deferred error interrupts on AMD SMCA systems clear the all registers reporting an error so that there are no multiple errors logged for the same event * tag 'ras_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: RAS: Fix return value from show_trace() x86/mce: Use severity table to handle uncorrected errors in kernel x86/MCE/AMD: Clear DFR errors found in THR handler
2022-12-12Merge tag 'for-net-next-2022-12-12' of ↵Jakub Kicinski41-121/+3117
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add a new VID/PID 0489/e0f2 for MT7922 - Add Realtek RTL8852BE support ID 0x0cb8:0xc559 - Add a new PID/VID 13d3/3549 for RTL8822CU - Add support for broadcom BCM43430A0 & BCM43430A1 - Add CONFIG_BT_HCIBTUSB_POLL_SYNC - Add CONFIG_BT_LE_L2CAP_ECRED - Add support for CYW4373A0 - Add support for RTL8723DS - Add more device IDs for WCN6855 - Add Broadcom BCM4377 family PCIe Bluetooth * tag 'for-net-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (51 commits) Bluetooth: Wait for HCI_OP_WRITE_AUTH_PAYLOAD_TO to complete Bluetooth: ISO: Avoid circular locking dependency Bluetooth: RFCOMM: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: hci_core: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: hci_bcsp: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: hci_h5: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: hci_ll: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: hci_qca: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: btusb: don't call kfree_skb() under spin_lock_irqsave() Bluetooth: btintel: Fix missing free skb in btintel_setup_combined() Bluetooth: hci_conn: Fix crash on hci_create_cis_sync Bluetooth: btintel: Fix existing sparce warnings Bluetooth: btusb: Fix existing sparce warning Bluetooth: btusb: Fix new sparce warnings Bluetooth: btusb: Add a new PID/VID 13d3/3549 for RTL8822CU Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0cb8:0xc559 dt-bindings: net: realtek-bluetooth: Add RTL8723DS Bluetooth: btusb: Add a new VID/PID 0489/e0f2 for MT7922 dt-bindings: bluetooth: broadcom: add BCM43430A0 & BCM43430A1 Bluetooth: hci_bcm4377: Fix missing pci_disable_device() on error in bcm4377_probe() ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12Merge tag 'edac_updates_for_6.2' of ↵Linus Torvalds19-72/+194
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Make ghes_edac a simple module like the rest of the EDAC drivers and drop the forced built-in only configuration by disentangling it from GHES (Jia He) - The usual small cleanups and improvements all over EDAC land * tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper() EDAC/i5400: Fix typo in comment: vaious -> various EDAC/mc_sysfs: Increase legacy channel support to 12 MAINTAINERS: Make Mauro EDAC reviewer MAINTAINERS: Make Manivannan Sadhasivam the maintainer of qcom_edac EDAC/igen6: Return the correct error type when not the MC owner apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg() EDAC: Check for GHES preference in the chipset-specific EDAC drivers EDAC/ghes: Make ghes_edac a proper module EDAC/ghes: Prepare to make ghes_edac a proper module EDAC/ghes: Add a notifier for reporting memory errors efi/cper: Export several helpers for ghes_edac to use EDAC/i5000: Mark as BROKEN
2022-12-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski22-360/+1731
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next 1) Incorrect error check in nft_expr_inner_parse(), from Dan Carpenter. 2) Add DATA_SENT state to SCTP connection tracking helper, from Sriram Yagnaraman. 3) Consolidate nf_confirm for ipv4 and ipv6, from Florian Westphal. 4) Add bitmask support for ipset, from Vishwanath Pai. 5) Handle icmpv6 redirects as RELATED, from Florian Westphal. 6) Add WARN_ON_ONCE() to impossible case in flowtable datapath, from Li Qiong. 7) A large batch of IPVS updates to replace timer-based estimators by kthreads to scale up wrt. CPUs and workload (millions of estimators). Julian Anastasov says: This patchset implements stats estimation in kthread context. It replaces the code that runs on single CPU in timer context every 2 seconds and causing latency splats as shown in reports [1], [2], [3]. The solution targets setups with thousands of IPVS services, destinations and multi-CPU boxes. Spread the estimation on multiple (configured) CPUs and multiple time slots (timer ticks) by using multiple chains organized under RCU rules. When stats are not needed, it is recommended to use run_estimation=0 as already implemented before this change. RCU Locking: - As stats are now RCU-locked, tot_stats, svc and dest which hold estimator structures are now always freed from RCU callback. This ensures RCU grace period after the ip_vs_stop_estimator() call. Kthread data: - every kthread works over its own data structure and all such structures are attached to array. For now we limit kthreads depending on the number of CPUs. - even while there can be a kthread structure, its task may not be running, eg. before first service is added or while the sysctl var is set to an empty cpulist or when run_estimation is set to 0 to disable the estimation. - the allocated kthread context may grow from 1 to 50 allocated structures for timer ticks which saves memory for setups with small number of estimators - a task and its structure may be released if all estimators are unlinked from its chains, leaving the slot in the array empty - every kthread data structure allows limited number of estimators. Kthread 0 is also used to initially calculate the max number of estimators to allow in every chain considering a sub-100 microsecond cond_resched rate. This number can be from 1 to hundreds. - kthread 0 has an additional job of optimizing the adding of estimators: they are first added in temp list (est_temp_list) and later kthread 0 distributes them to other kthreads. The optimization is based on the fact that newly added estimator should be estimated after 2 seconds, so we have the time to offload the adding to chain from controlling process to kthread 0. - to add new estimators we use the last added kthread context (est_add_ktid). The new estimators are linked to the chains just before the estimated one, based on add_row. This ensures their estimation will start after 2 seconds. If estimators are added in bursts, common case if all services and dests are initially configured, we may spread the estimators to more chains and as result, reducing the initial delay below 2 seconds. Many thanks to Jiri Wiesner for his valuable comments and for spending a lot of time reviewing and testing the changes on different platforms with 48-256 CPUs and 1-8 NUMA nodes under different cpufreq governors. The new IPVS estimators do not use workqueue infrastructure because: - The estimation can take long time when using multiple IPVS rules (eg. millions estimator structures) and especially when box has multiple CPUs due to the for_each_possible_cpu usage that expects packets from any CPU. With est_nice sysctl we have more control how to prioritize the estimation kthreads compared to other processes/kthreads that have latency requirements (such as servers). As a benefit, we can see these kthreads in top and decide if we will need some further control to limit their CPU usage (max number of structure to estimate per kthread). - with kthreads we run code that is read-mostly, no write/lock operations to process the estimators in 2-second intervals. - work items are one-shot: as estimators are processed every 2 seconds, they need to be re-added every time. This again loads the timers (add_timer) if we use delayed works, as there are no kthreads to do the timings. [1] Report from Yunhong Jiang: https://lore.kernel.org/netdev/[email protected]/ [2] https://marc.info/?l=linux-virtual-server&m=159679809118027&w=2 [3] Report from Dust: https://archive.linuxvirtualserver.org/html/lvs-devel/2020-12/msg00000.html * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: ipvs: run_estimation should control the kthread tasks ipvs: add est_cpulist and est_nice sysctl vars ipvs: use kthreads for stats estimation ipvs: use u64_stats_t for the per-cpu counters ipvs: use common functions for stats allocation ipvs: add rcu protection to stats netfilter: flowtable: add a 'default' case to flowtable datapath netfilter: conntrack: set icmpv6 redirects as RELATED netfilter: ipset: Add support for new bitmask parameter netfilter: conntrack: merge ipv4+ipv6 confirm functions netfilter: conntrack: add sctp DATA_SENT state netfilter: nft_inner: fix IS_ERR() vs NULL check ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-12Merge tag 'x86_fpu_for_6.2' of ↵Linus Torvalds8-33/+208
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Dave Hansen: "There are two little fixes in here, one to give better XSAVE warnings and another to address some undefined behavior in offsetof(). There is also a collection of patches to fix some issues with ptrace and the protection keys register (PKRU). PKRU is a real oddity because it is exposed in the XSAVE-related ABIs, but it is generally managed without using XSAVE in the kernel. This fix thankfully came with a selftest to ward off future regressions. Summary: - Clarify XSAVE consistency warnings - Fix up ptrace interface to protection keys register (PKRU) - Avoid undefined compiler behavior with TYPE_ALIGN" * tag 'x86_fpu_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU bit is not set x86/fpu: Allow PKRU to be (once again) written by ptrace. x86/fpu: Add a pkru argument to copy_uabi_to_xstate() x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate(). x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate() x86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics
2022-12-12Merge tag 'x86_splitlock_for_6.2' of ↵Linus Torvalds2-10/+76
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 splitlock updates from Dave Hansen: "Add a sysctl to control the split lock misery mode. This enables users to reduce the penalty inflicted on split lock users. There are some proprietary, binary-only games which became entirely unplayable with the old penalty. Anyone opting into the new mode is, of course, more exposed to the DoS nasitness inherent with split locks, but they can play their games again" * tag 'x86_splitlock_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/split_lock: Add sysctl to control the misery mode
2022-12-12Merge tag 'x86_cache_for_6.2' of ↵Linus Torvalds7-31/+25
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache resource control updates from Dave Hansen: "These declare the resource control (rectrl) MSRs a bit more normally and clean up an unnecessary structure member: - Remove unnecessary arch_has_empty_bitmaps structure memory - Move rescrtl MSR defines into msr-index.h, like normal MSRs" * tag 'x86_cache_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Move MSR defines into msr-index.h x86/resctrl: Remove arch_has_empty_bitmaps
2022-12-12Merge tag 'x86_tdx_for_6.2' of ↵Linus Torvalds15-0/+469
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tdx updates from Dave Hansen: "This includes a single chunk of new functionality for TDX guests which allows them to talk to the trusted TDX module software and obtain an attestation report. This report can then be used to prove the trustworthiness of the guest to a third party and get access to things like storage encryption keys" * tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/tdx: Test TDX attestation GetReport support virt: Add TDX guest driver x86/tdx: Add a wrapper to get TDREPORT0 from the TDX Module
2022-12-12Bluetooth: Wait for HCI_OP_WRITE_AUTH_PAYLOAD_TO to completeLuiz Augusto von Dentz1-6/+16
This make sure HCI_OP_WRITE_AUTH_PAYLOAD_TO completes before notifying the encryption change just as is done with HCI_OP_READ_ENC_KEY_SIZE. Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-12-12Bluetooth: ISO: Avoid circular locking dependencyLuiz Augusto von Dentz1-23/+38
This attempts to avoid circular locking dependency between sock_lock and hdev_lock: WARNING: possible circular locking dependency detected 6.0.0-rc7-03728-g18dd8ab0a783 #3 Not tainted ------------------------------------------------------ kworker/u3:2/53 is trying to acquire lock: ffff888000254130 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}, at: iso_conn_del+0xbd/0x1d0 but task is already holding lock: ffffffff9f39a080 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_le_cis_estabilished_evt+0x1b5/0x500 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (hci_cb_list_lock){+.+.}-{3:3}: __mutex_lock+0x10e/0xfe0 hci_le_remote_feat_complete_evt+0x17f/0x320 hci_event_packet+0x39c/0x7d0 hci_rx_work+0x2bf/0x950 process_one_work+0x569/0x980 worker_thread+0x2a3/0x6f0 kthread+0x153/0x180 ret_from_fork+0x22/0x30 -> #1 (&hdev->lock){+.+.}-{3:3}: __mutex_lock+0x10e/0xfe0 iso_connect_cis+0x6f/0x5a0 iso_sock_connect+0x1af/0x710 __sys_connect+0x17e/0x1b0 __x64_sys_connect+0x37/0x50 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x62/0xcc -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}: __lock_acquire+0x1b51/0x33d0 lock_acquire+0x16f/0x3b0 lock_sock_nested+0x32/0x80 iso_conn_del+0xbd/0x1d0 iso_connect_cfm+0x226/0x680 hci_le_cis_estabilished_evt+0x1ed/0x500 hci_event_packet+0x39c/0x7d0 hci_rx_work+0x2bf/0x950 process_one_work+0x569/0x980 worker_thread+0x2a3/0x6f0 kthread+0x153/0x180 ret_from_fork+0x22/0x30 other info that might help us debug this: Chain exists of: sk_lock-AF_BLUETOOTH-BTPROTO_ISO --> &hdev->lock --> hci_cb_list_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hci_cb_list_lock); lock(&hdev->lock); lock(hci_cb_list_lock); lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO); *** DEADLOCK *** 4 locks held by kworker/u3:2/53: #0: ffff8880021d9130 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_one_work+0x4ad/0x980 #1: ffff888002387de0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_one_work+0x4ad/0x980 #2: ffff888001ac0070 (&hdev->lock){+.+.}-{3:3}, at: hci_le_cis_estabilished_evt+0xc3/0x500 #3: ffffffff9f39a080 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_le_cis_estabilished_evt+0x1b5/0x500 Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-12-12Bluetooth: RFCOMM: don't call kfree_skb() under spin_lock_irqsave()Yang Yingliang1-1/+1
It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled. So replace kfree_skb() with dev_kfree_skb_irq() under spin_lock_irqsave(). Fixes: 81be03e026dc ("Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg") Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-12-12Bluetooth: hci_core: don't call kfree_skb() under spin_lock_irqsave()Yang Yingliang1-1/+1
It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled. So replace kfree_skb() with dev_kfree_skb_irq() under spin_lock_irqsave(). Fixes: 9238f36a5a50 ("Bluetooth: Add request cmd_complete and cmd_status functions") Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-12-12Bluetooth: hci_bcsp: don't call kfree_skb() under spin_lock_irqsave()Yang Yingliang1-1/+1
It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled. So replace kfree_skb() with dev_kfree_skb_irq() under spin_lock_irqsave(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-12-12Bluetooth: hci_h5: don't call kfree_skb() under spin_lock_irqsave()Yang Yingliang1-1/+1
It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled. So replace kfree_skb() with dev_kfree_skb_irq() under spin_lock_irqsave(). Fixes: 43eb12d78960 ("Bluetooth: Fix/implement Three-wire reliable packet sending") Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>