blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2020-03-17	net: core: dev.c: fix a documentation warning	Mauro Carvalho Chehab	1	-1/+1
	There's a markup for links which is "foo_". In this kernel-doc comment, we don't want this, but instead want to place a literal reference. So, escape the literal with ``foo``, in order to avoid this warning: ./net/core/dev.c:5195: WARNING: Unknown target name: "page_is". Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	net: phy: sfp-bus.c: get rid of docs warnings	Mauro Carvalho Chehab	1	-14/+18
	The indentation for the returned values is weird, causing those warnings: ./drivers/net/phy/sfp-bus.c:579: WARNING: Unexpected indentation. ./drivers/net/phy/sfp-bus.c:619: WARNING: Unexpected indentation. Use a list and change the indentation for it to be properly parsed by the documentation toolchain. Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	Merge branch 'ENA-driver-bug-fixes'	David S. Miller	1	-8/+19
	Arthur Kiyanovski says: ==================== ENA driver bug fixes ==================== Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	net: ena: fix continuous keep-alive resets	Arthur Kiyanovski	1	-0/+1
	last_keep_alive_jiffies is updated in probe and when a keep-alive event is received. In case the driver times-out on a keep-alive event, it has high chances of continuously timing-out on keep-alive events. This is because when the driver recovers from the keep-alive-timeout reset the value of last_keep_alive_jiffies is very old, and if a keep-alive event is not received before the next timer expires, the value of last_keep_alive_jiffies will cause another keep-alive-timeout reset and so forth in a loop. Solution: Update last_keep_alive_jiffies whenever the device is restored after reset. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Noam Dagan <[email protected]> Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
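The reset loop described above can be modelled in a few lines of user-space C. This is a minimal sketch, not the driver's code: `adapter_sketch`, `keep_alive_expired()` and `restore_after_reset()` are hypothetical names, and plain tick counts stand in for jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the adapter state; times are tick counts. */
struct adapter_sketch {
	unsigned long last_keep_alive; /* tick of the last keep-alive event */
	unsigned long timeout;         /* allowed gap between keep-alive events */
};

/* True when no keep-alive event arrived within the timeout window. */
static bool keep_alive_expired(const struct adapter_sketch *a, unsigned long now)
{
	/* Unsigned subtraction stays correct if the tick counter wraps. */
	return now - a->last_keep_alive > a->timeout;
}

/* The fix in one line: refresh the timestamp when the device is restored. */
static void restore_after_reset(struct adapter_sketch *a, unsigned long now)
{
	a->last_keep_alive = now;
}

static void keep_alive_demo(void)
{
	struct adapter_sketch a = { .last_keep_alive = 0, .timeout = 100 };

	assert(keep_alive_expired(&a, 500));  /* stale timestamp: reset fires */
	restore_after_reset(&a, 500);
	assert(!keep_alive_expired(&a, 550)); /* next timer no longer resets */
}
```

Without the call to `restore_after_reset()`, the second check would still see the old timestamp and report another timeout, which is the loop the commit describes.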
2020-03-17	net: ena: avoid memory access violation by validating req_id properly	Arthur Kiyanovski	1	-4/+11
	Rx req_id is an index in struct ena_eth_io_rx_cdesc_base. The driver should validate that the Rx req_id it received from the device is in range [0, ring_size -1]. Failure to do so could lead to a memory access violation. The validation was mistakenly done when refilling the Rx submission queue and not in Rx completion queue. Fixes: ad974baef2a1 ("net: ena: add support for out of order rx buffers refill") Signed-off-by: Noam Dagan <[email protected]> Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
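The range check itself is small. The following is a sketch under stated assumptions: `rx_ring_sketch` and `rx_req_id_is_valid()` are hypothetical names, and the 16-bit type for `req_id` is assumed rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Rx ring; only its size matters here. */
struct rx_ring_sketch {
	uint16_t ring_size;
};

/* True when req_id is a usable index, i.e. within [0, ring_size - 1]. */
static bool rx_req_id_is_valid(const struct rx_ring_sketch *ring, uint16_t req_id)
{
	return req_id < ring->ring_size;
}

static void req_id_demo(void)
{
	struct rx_ring_sketch ring = { .ring_size = 8 };

	assert(rx_req_id_is_valid(&ring, 0));
	assert(rx_req_id_is_valid(&ring, 7));
	assert(!rx_req_id_is_valid(&ring, 8)); /* one past the end */
}
```

The point of the commit is where such a check runs: on the completion path, before the device-supplied value is used as an index.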
2020-03-17	net: ena: fix request of incorrect number of IRQ vectors	Arthur Kiyanovski	1	-3/+6
	Bug: In short the main issue is caused by the fact that the number of queues is changed using ethtool after ena_probe() has been called and before ena_up() was executed. Here is the full scenario in detail: * ena_probe() is called when the driver is loaded, the driver is not up yet at the end of ena_probe(). * The number of queues is changed -> io_queue_count is changed as well - ena_up() is not called since the "dev_was_up" boolean in ena_update_queue_count() is false. * ena_up() is called by the kernel (it's called asynchronously some time after ena_probe()). ena_setup_io_intr() is called by ena_up() and it uses io_queue_count to get the suitable irq lines for each msix vector. The function ena_request_io_irq() is called right after that and it uses msix_vecs - This value only changes during ena_probe() and ena_restore() - to request the irq vectors. This results in "Failed to request I/O IRQ" error for i > io_queue_count. Numeric example: * After ena_probe() io_queue_count = 8, msix_vecs = 9. * The number of queues changes to 4 -> io_queue_count = 4, msix_vecs = 9. * ena_up() is executed for the first time: ena_setup_io_intr() inits the vectors only up to io_queue_count. ena_request_io_irq() calls request_irq() and fails for i = 5. How to reproduce: simply run the following commands: sudo rmmod ena && sudo insmod ena.ko; sudo ethtool -L eth1 combined 3; Fix: Use ENA_MAX_MSIX_VEC(adapter->num_io_queues + adapter->xdp_num_queues) instead of adapter->msix_vecs. We need to take XDP queues into consideration as they need to have msix vectors assigned to them as well. Note that the XDP cannot be attached before the driver is up and running but in XDP mode the issue might occur when the number of queues changes right after a reset trigger. The ENA_MAX_MSIX_VEC simply adds one to the argument since the first msix vector is reserved for management queue. 
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Sameeh Jubran <[email protected]> Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
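The numeric example in this message can be checked with a short sketch. `MAX_MSIX_VEC_SKETCH` is a stand-in that mirrors the description of ENA_MAX_MSIX_VEC given above (argument plus one); the function name is hypothetical.

```c
#include <assert.h>

/* Stand-in for ENA_MAX_MSIX_VEC: one extra vector for the management queue. */
#define MAX_MSIX_VEC_SKETCH(io_queues) (1 + (io_queues))

/* Number of IRQ vectors to request, derived from the current queue counts. */
static int irq_vectors_to_request(int num_io_queues, int xdp_num_queues)
{
	return MAX_MSIX_VEC_SKETCH(num_io_queues + xdp_num_queues);
}

static void irq_vectors_demo(void)
{
	int stale_msix_vecs = MAX_MSIX_VEC_SKETCH(8); /* value set at probe time */

	assert(stale_msix_vecs == 9);
	/* After the queue count drops to 4, only 5 vectors were set up. */
	assert(irq_vectors_to_request(4, 0) == 5);
	assert(irq_vectors_to_request(4, 0) < stale_msix_vecs);
}
```

Requesting `stale_msix_vecs` IRQs when only five vectors were initialized is what produces the "Failed to request I/O IRQ" error.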
2020-03-17	net: ena: fix incorrect setting of the number of msix vectors	Arthur Kiyanovski	1	-1/+1
	Overview: We don't frequently change the msix vectors throughout the life cycle of the driver. We do so in two functions: ena_probe() and ena_restore(). ena_probe() is only called when the driver is loaded. ena_restore() on the other hand is called during device reset / resume operations. We use num_io_queues for calculating and allocating the number of msix vectors. At ena_probe() this value is equal to max_num_io_queues and thus this is not an issue, however ena_restore() might be called after the number of io queues has changed. A possible bug scenario is as follows: * Change number of queues from 8 to 4. (num_io_queues = 4, max_num_io_queues = 8, msix_vecs = 9,) * Trigger reset occurs -> ena_restore is called. (num_io_queues = 4, max_num_io_queues =8 , msix_vecs = 5) * Change number of queues from 4 to 6. (num_io_queues = 6, max_num_io_queues = 8, msix_vecs = 5) * The driver will reset due to failure of check_for_rx_interrupt_queue() Fix: This can be easily fixed by always using max_num_io_queues to init the msix_vecs, since this number won't change as opposed to num_io_queues. Fixes: 4d19266022ec ("net: ena: multiple queue creation related cleanups") Signed-off-by: Sameeh Jubran <[email protected]> Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
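The bug scenario can be replayed with plain integers. This is only a model of the bookkeeping, with hypothetical helper names, assuming one vector per I/O queue plus one management vector.

```c
#include <assert.h>
#include <stdbool.h>

/* One vector per I/O queue plus one for management (assumed layout). */
static int msix_vecs_for(int io_queues)
{
	return io_queues + 1;
}

/* True when the allocated vectors can serve the requested queue count. */
static bool vectors_suffice(int msix_vecs, int num_io_queues)
{
	return msix_vecs >= msix_vecs_for(num_io_queues);
}

static void msix_restore_demo(void)
{
	int max_num_io_queues = 8;
	int num_io_queues = 4;

	/* Before the fix: restore sizes the vectors from the current count. */
	int buggy = msix_vecs_for(num_io_queues);     /* 5 */
	/* After the fix: restore sizes them from the maximum. */
	int fixed = msix_vecs_for(max_num_io_queues); /* 9 */

	num_io_queues = 6; /* the user grows the queue count after the reset */
	assert(!vectors_suffice(buggy, num_io_queues));
	assert(vectors_suffice(fixed, num_io_queues));
}
```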
2020-03-17	net: phy: mdio-mux-bcm-iproc: check clk_prepare_enable() return value	Rayagonda Kokatanur	1	-1/+6
	Check clk_prepare_enable() return value. Fixes: 2c7230446bc9 ("net: phy: Add pm support to Broadcom iProc mdio mux driver") Signed-off-by: Rayagonda Kokatanur <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	Merge branch 'net-bcmgenet-revisit-MAC-reset'	David S. Miller	3	-40/+16
	Doug Berger says: ==================== net: bcmgenet: revisit MAC reset Commit 3a55402c9387 ("net: bcmgenet: use RGMII loopback for MAC reset") was intended to resolve issues with resetting the UniMAC core within the GENET block by providing better control over the clocks used by the UniMAC core. Unfortunately, it is not compatible with all of the supported system configurations so an alternative method must be applied. This commit set provides such an alternative. The first commit reverts the previous change and the second commit provides the alternative reset sequence that addresses the concerns observed with the previous implementation. This replacement implementation should be applied to the stable branches wherever commit 3a55402c9387 ("net: bcmgenet: use RGMII loopback for MAC reset") has been applied. Unfortunately, reverting that commit may conflict with some restructuring changes introduced by commit 4f8d81b77e66 ("net: bcmgenet: Refactor register access in bcmgenet_mii_config"). The first commit in this set has been manually edited to resolve the conflict on net/master. I would be happy to help stable maintainers with resolving any such conflicts if they occur. However, I do not expect that commit to have been backported to stable branch so hopefully the revert can be applied cleanly. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-03-17	net: bcmgenet: keep MAC in reset until PHY is up	Doug Berger	3	-7/+15
	As noted in commit 28c2d1a7a0bf ("net: bcmgenet: enable loopback during UniMAC sw_reset") the UniMAC must be clocked at least 5 cycles while the sw_reset is asserted to ensure a clean reset. That commit enabled local loopback to provide an Rx clock from the GENET sourced Tx clk. However, when connected in MII mode the Tx clk is sourced by the PHY so if an EPHY is not supplying clocks (e.g. when the link is down) the UniMAC does not receive the necessary clocks. This commit extends the sw_reset window until the PHY reports that the link is up thereby ensuring that the clocks are being provided to the MAC to produce a clean reset. One consequence is that if the system attempts to enter a Wake on LAN suspend state when the PHY link has not been active the MAC may not have had a chance to initialize cleanly. In this case, we remove the sw_reset and enable the WoL reception path as normal with the hope that the PHY will provide the necessary clocks to drive the WoL blocks if the link becomes active after the system has entered suspend. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	Revert "net: bcmgenet: use RGMII loopback for MAC reset"	Doug Berger	2	-34/+2
	This reverts commit 3a55402c93877d291b0a612d25edb03d1b4b93ac. This is not a good solution when connecting to an external switch that may not support the isolation of the TXC signal resulting in output driver contention on the pin. A different solution is necessary. Signed-off-by: Doug Berger <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	Merge branch 'net-mvmdio-avoid-error-message-for-optional-IRQ'	David S. Miller	1	-2/+2
	Chris Packham says: ==================== net: mvmdio: avoid error message for optional IRQ I've gone ahead and sent a revert. This is the same as the original v1 except I've added Andrew's review to the commit message. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-03-17	net: mvmdio: avoid error message for optional IRQ	Chris Packham	1	-1/+1
	Per the dt-binding the interrupt is optional so use platform_get_irq_optional() instead of platform_get_irq(). Since commit 7723f4c5ecdb ("driver core: platform: Add an error message to platform_get_irq*()") platform_get_irq() produces an error message orion-mdio f1072004.mdio: IRQ index 0 not found which is perfectly normal if one hasn't specified the optional property in the device tree. Signed-off-by: Chris Packham <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-17	Revert "net: mvmdio: avoid error message for optional IRQ"	Chris Packham	1	-3/+3
	This reverts commit e1f550dc44a4d535da4e25ada1b0eaf8f3417929. platform_get_irq_optional() will still return -ENXIO when no interrupt is provided, so the additional error handling made the driver prone to fail when no interrupt was specified. Revert the change so we can apply the correct minimal fix. Signed-off-by: Chris Packham <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	net: ip_gre: Accept IFLA_INFO_DATA-less configuration	Petr Machata	1	-0/+2
	The fix referenced below causes a crash when an ERSPAN tunnel is created without passing IFLA_INFO_DATA. Fix by validating passed-in data in the same way as ipgre does. Fixes: e1f8f78ffe98 ("net: ip_gre: Separate ERSPAN newlink / changelink callbacks") Reported-by: [email protected] Signed-off-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	net: mvneta: Fix the case where the last poll did not process all rx	Jisheng Zhang	1	-2/+1
	For the case where the last mvneta_poll did not process all RX packets, we need to xor the pp->cause_rx_tx or port->cause_rx_tx before calculating the rx_queue. Fixes: 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU") Signed-off-by: Jisheng Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	net: vxge: fix wrong __VA_ARGS__ usage	Zheng Wei	2	-8/+8
	printk in macro vxge_debug_ll uses __VA_ARGS__ without "##" prefix, it causes a build error when there is no variable arguments(e.g. only fmt is specified.). Signed-off-by: Zheng Wei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
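The build error is easy to reproduce outside the kernel. In this sketch `LOG_SKETCH` is a hypothetical macro that formats into a buffer instead of calling printk; it is not the vxge_debug_ll macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical logging macro. "##__VA_ARGS__" is a GNU extension (GCC, Clang)
 * that drops the preceding comma when no variadic arguments are passed.
 * With plain "__VA_ARGS__", the fmt-only call below would expand to
 * snprintf(buf, size, "link up", ) and fail to compile.
 */
#define LOG_SKETCH(buf, size, fmt, ...) \
	snprintf(buf, size, fmt, ##__VA_ARGS__)

static void va_args_demo(void)
{
	char buf[32];

	LOG_SKETCH(buf, sizeof(buf), "link up");     /* fmt only */
	assert(strcmp(buf, "link up") == 0);

	LOG_SKETCH(buf, sizeof(buf), "queue %d", 3); /* fmt plus one argument */
	assert(strcmp(buf, "queue 3") == 0);
}
```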
2020-03-16	Merge branch 'QorIQ-DPAA-ARM-RDBs-need-internal-delay-on-RGMII'	David S. Miller	3	-5/+5
	Madalin Bucur says: ==================== QorIQ DPAA ARM RDBs need internal delay on RGMII v2: used phy_interface_mode_is_rgmii() to identify RGMII The QorIQ DPAA 1 based RDB boards require internal delay on both Tx and Rx to be set. The patch set ensures all RGMII modes are treated correctly by the FMan driver and sets the phy-connection-type to "rgmii-id" to restore functionality. Previously Rx internal delay was set by board pull-ups and was left untouched by the PHY driver. Since commit 1b3047b5208a80 ("net: phy: realtek: add support for configuring the RX delay on RTL8211F") the Realtek 8211F PHY driver has control over the RGMII RX delay and it is disabling it for other modes than RGMII_RXID and RGMII_ID. Please note that u-boot in particular performs a fix-up of the PHY connection type and will overwrite the values from the Linux device tree. Another patch set was sent for u-boot and one needs to apply that [1] to the boot loader, to ensure this fix is complete, unless a different bootloader is used. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-03-16	arm64: dts: ls1046ardb: set RGMII interfaces to RGMII_ID mode	Madalin Bucur	1	-2/+2
	The correct setting for the RGMII ports on LS1046ARDB is to enable delay on both Rx and Tx so the interface mode used must be PHY_INTERFACE_MODE_RGMII_ID. Since commit 1b3047b5208a80 ("net: phy: realtek: add support for configuring the RX delay on RTL8211F") the Realtek 8211F PHY driver has control over the RGMII RX delay and it is disabling it for RGMII_TXID. The LS1046ARDB uses two such PHYs in RGMII_ID mode but in the device tree the mode was described as "rgmii". Changing the phy-connection-type to "rgmii-id" to address the issue. Fixes: 3fa395d2c48a ("arm64: dts: add LS1046A DPAA FMan nodes") Signed-off-by: Madalin Bucur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	arm64: dts: ls1043a-rdb: correct RGMII delay mode to rgmii-id	Madalin Bucur	1	-2/+2
	The correct setting for the RGMII ports on LS1043ARDB is to enable delay on both Rx and Tx so the interface mode used must be PHY_INTERFACE_MODE_RGMII_ID. Since commit 1b3047b5208a80 ("net: phy: realtek: add support for configuring the RX delay on RTL8211F") the Realtek 8211F PHY driver has control over the RGMII RX delay and it is disabling it for RGMII_TXID. The LS1043ARDB uses two such PHYs in RGMII_ID mode but in the device tree the mode was described as "rgmii_txid". This issue was not apparent at the time as the PHY driver took the same action for RGMII_TXID and RGMII_ID back then but it became visible (RX no longer working) after the above patch. Changing the phy-connection-type to "rgmii-id" to address the issue. Fixes: bf02f2ffe59c ("arm64: dts: add LS1043A DPAA FMan support") Signed-off-by: Madalin Bucur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	net: fsl/fman: treat all RGMII modes in memac_adjust_link()	Madalin Bucur	1	-1/+1
	Treat all internal delay variants the same as RGMII. Signed-off-by: Madalin Bucur <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	Merge branch 'ethtool-fail-with-error-if-request-has-unknown-flags'	David S. Miller	3	-30/+38
	Michal Kubecek says: ==================== ethtool: fail with error if request has unknown flags Jakub Kicinski pointed out that if unrecognized flags are set in netlink header request, kernel should fail with an error rather than silently ignore them so that we have more freedom in future flags semantics. To help userspace with handling such errors, inform the client which flags are supported by kernel. For that purpose, we need to allow passing cookies as part of extack also in case of error (they can be only passed on success now). ==================== Signed-off-by: David S. Miller <[email protected]>
2020-03-16	ethtool: reject unrecognized request flags	Michal Kubecek	1	-4/+12
	As pointed out by Jakub Kicinski, the ethtool netlink code should respond with an error if request head has flags set which are not recognized by kernel, either as a mistake or because it expects functionality introduced in later kernel versions. To avoid unnecessary roundtrips, use extack cookie to provide the information about supported request flags. Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
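A rough user-space sketch of the idea follows. The flag names, the `cookie` out-parameter standing in for the extack cookie, and the choice of `-EOPNOTSUPP` as the error code are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag set; the real ethtool flag names are not reproduced. */
#define REQ_FLAG_A_SKETCH   (1u << 0)
#define REQ_FLAG_B_SKETCH   (1u << 1)
#define REQ_FLAGS_SUPPORTED (REQ_FLAG_A_SKETCH | REQ_FLAG_B_SKETCH)

/*
 * Returns 0 if every set bit is recognized. Otherwise stores the supported
 * mask in *cookie (modelling the extack cookie) and returns a negative errno.
 * The specific error code is an assumption of this sketch.
 */
static int check_request_flags(uint32_t flags, uint32_t *cookie)
{
	if (flags & ~REQ_FLAGS_SUPPORTED) {
		*cookie = REQ_FLAGS_SUPPORTED;
		return -EOPNOTSUPP;
	}
	return 0;
}

static void request_flags_demo(void)
{
	uint32_t cookie = 0;

	assert(check_request_flags(REQ_FLAG_A_SKETCH, &cookie) == 0);
	assert(check_request_flags(1u << 5, &cookie) == -EOPNOTSUPP);
	assert(cookie == REQ_FLAGS_SUPPORTED); /* client learns what is supported */
}
```

Returning the supported mask with the error lets a client retry with a reduced flag set without an extra query.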
2020-03-16	netlink: add nl_set_extack_cookie_u32()	Michal Kubecek	1	-0/+9
	Similar to existing nl_set_extack_cookie_u64(), add new helper nl_set_extack_cookie_u32() which sets extack cookie to a u32 value. Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	netlink: allow extack cookie also for error messages	Michal Kubecek	1	-26/+17
	Commit ba0dc5f6e0ba ("netlink: allow sending extended ACK with cookie on success") introduced a cookie which can be sent to userspace as part of extended ack message in the form of NLMSGERR_ATTR_COOKIE attribute. Currently the cookie is ignored if error code is non-zero but there is no technical reason for such limitation and it can be useful to provide machine parseable information as part of an error message. Include NLMSGERR_ATTR_COOKIE whenever the cookie has been set, regardless of error code. Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	net_sched: cls_route: remove the right filter from hashtable	Cong Wang	1	-2/+2
	route4_change() allocates a new filter and copies values from the old one. After the new filter is inserted into the hash table, the old filter should be removed and freed, as the final step of the update. However, the current code mistakenly removes the new one. This looks apparently wrong to me, and it causes double "free" and use-after-free too, as reported by syzbot. Reported-and-tested-by: [email protected] Reported-and-tested-by: [email protected] Reported-and-tested-by: [email protected] Fixes: 1109c00547fc ("net: sched: RCU cls_route") Cc: Jamal Hadi Salim <[email protected]> Cc: Jiri Pirko <[email protected]> Cc: John Fastabend <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	Merge branch 'hsr-fix-several-bugs-in-generic-netlink-callback'	David S. Miller	2	-35/+44
	Taehee Yoo says: ==================== hsr: fix several bugs in generic netlink callback This patchset fixes several bugs related to the generic netlink callbacks in the hsr module. 1. The first patch is to add missing rcu_read_lock() in hsr_get_node_{list/status}(). The hsr_get_node_{list/status}() are not protected by RTNL because they are callback functions of generic netlink. But they call __dev_get_by_index() without acquiring RTNL. So, they would use unsafe data. 2. The second patch is to avoid failure of hsr_get_node_list(). hsr_get_node_list() is a callback of generic netlink and it is used to get node information in userspace. But, if there are too many nodes, it fails because of buffer size. So, in this patch, a restart routine is added. 3. The third patch is to set .netnsok flag to true. If .netnsok flag is false, non-init_net namespace is not allowed to operate generic netlink operations. So, currently, non-init_net namespace has no way to get node information because .netnsok is false in the current hsr code. Change log: v1->v2: - Preserve reverse christmas tree variable ordering in the second patch. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-03-16	hsr: set .netnsok flag	Taehee Yoo	1	-0/+1
	The hsr module has been supporting the list and status command. (HSR_C_GET_NODE_LIST and HSR_C_GET_NODE_STATUS) These commands send node information to the user-space via generic netlink. But, in the non-init_net namespace, these commands are not allowed because .netnsok flag is false. So, there is no way to get node information in the non-init_net namespace. Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	hsr: add restart routine into hsr_get_node_list()	Taehee Yoo	1	-14/+24
	The hsr_get_node_list() is to send node addresses to the userspace. If there are too many nodes, it could fail because of buffer size. In order to avoid this failure, a restart routine is added. Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-16	hsr: use rcu_read_lock() in hsr_get_node_{list/status}()	Taehee Yoo	2	-25/+23
	hsr_get_node_{list/status}() are not under rtnl_lock() because they are callback functions of generic netlink. But they use __dev_get_by_index() without rtnl_lock(). So, it would use unsafe data. In order to fix it, rcu_read_lock() and dev_get_by_index_rcu() are used instead of __dev_get_by_index(). Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	Merge branch 'net-Use-scnprintf-for-avoiding-potential-buffer-overflow'	David S. Miller	6	-107/+111
	Takashi Iwai says: ==================== net: Use scnprintf() for avoiding potential buffer overflow here is a respin of trivial patch series just to replace suspicious snprintf() usages with the safer one, scnprintf(). v1->v2: Align the remaining lines to the open parenthesis Excluded i40e patch that was already queued ==================== Signed-off-by: David S. Miller <[email protected]>
2020-03-15	net: netdevsim: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai	1	-15/+15
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Cc: "David S . Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: [email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
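The difference in return values can be shown in user space. `scnprintf_sketch()` below is a hypothetical helper written for this illustration on top of the standard `vsnprintf()`; it imitates the behaviour described above and is not the kernel's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the number of characters actually stored, excluding the NUL. */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;
	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);
	if (n < 0)
		return 0;
	/* On truncation, report what fits rather than the would-be length. */
	return (size_t)n >= size ? (int)(size - 1) : n;
}

static void scnprintf_demo(void)
{
	char buf[8];
	int would_be = snprintf(buf, sizeof(buf), "%s", "0123456789");
	int written = scnprintf_sketch(buf, sizeof(buf), "%s", "0123456789");

	assert(would_be == 10); /* larger than the buffer */
	assert(written == 7);   /* what actually fits */
	assert(buf[written] == '\0');
}
```

Code that accumulates an offset, as in `len += snprintf(buf + len, size - len, ...)`, can push `len` past `size` after a truncation; accumulating the `scnprintf`-style value cannot.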
2020-03-15	net: sfc: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai	1	-14/+18
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Cc: "David S . Miller" <[email protected]> Cc: Edward Cree <[email protected]> Cc: Martin Habets <[email protected]> Cc: Solarflare linux maintainers <[email protected]> Cc: [email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	net: ionic: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai	1	-7/+7
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Reviewed-by: Simon Horman <[email protected]> Acked-by: Shannon Nelson <[email protected]> Cc: "David S . Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	net: nfp: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai	1	-4/+4
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Reviewed-by: Simon Horman <[email protected]> Cc: "David S . Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: [email protected] To: [email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	net: mlx4: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai	1	-31/+31
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Cc: "David S . Miller" <[email protected]> Cc: Tariq Toukan <[email protected]> To: [email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	net: caif: Use scnprintf() for avoiding potential buffer overflow	Takashi Iwai	1	-36/+36
	Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Cc: "David S . Miller" <[email protected]> Cc: [email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	mlxsw: reg: Increase register field length to 31 bits	Ido Schimmel	1	-1/+1
	The cited commit set a value of 2^31-1 in order to "disable" the shaper on a given a port. However, the length of the maximum shaper rate field was not updated from 28 bits to 31 bits, which means ports are still limited to ~268Gbps despite supporting speeds of 400Gbps. Fix this by increasing the field's length. Fixes: 92afbfedb77d ("mlxsw: reg: Increase MLXSW_REG_QEEC_MAS_DIS") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
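The arithmetic behind the ~268Gbps limit can be checked directly. This sketch assumes the rate field counts units of 1000 bits per second, which is consistent with the figure quoted above but is not stated in the message; `field_max()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value an unsigned register field of the given width can hold. */
static uint64_t field_max(unsigned int bits)
{
	assert(bits >= 1 && bits <= 63);
	return (UINT64_C(1) << bits) - 1;
}

static void field_width_demo(void)
{
	uint64_t disable_value = (UINT64_C(1) << 31) - 1; /* 2^31-1 */

	assert(field_max(28) == 268435455);     /* about 268 Gbps in kbit/s */
	assert(field_max(28) < disable_value);  /* 2^31-1 does not fit */
	assert(field_max(31) == disable_value); /* fits exactly in 31 bits */
}
```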
2020-03-15	geneve: move debug check after netdev unregister	Florian Westphal	1	-2/+6
	The debug check must be done after unregister_netdevice_many() call -- the list_del() for this is done inside .ndo_stop. Fixes: 2843a25348f8 ("geneve: speedup geneve tunnels dismantle") Reported-and-tested-by: <[email protected]> Cc: Haishuang Yan <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	net/packet: tpacket_rcv: avoid a producer race condition	Willem de Bruijn	2	-1/+25
	PACKET_RX_RING can cause multiple writers to access the same slot if a fast writer wraps the ring while a slow writer is still copying. This is particularly likely with few, large slots (e.g., GSO packets). Synchronize kernel thread ownership of rx ring slots with a bitmap. Writers acquire a slot race-free by testing tp_status TP_STATUS_KERNEL while holding the sk receive queue lock. They release this lock before copying and set tp_status to TP_STATUS_USER to release to userspace when done. During copying, another writer may take the lock, also see TP_STATUS_KERNEL, and start writing to the same slot. Introduce a new rx_owner_map bitmap with a bit per slot. To acquire a slot, test and set with the lock held. To release race-free, update tp_status and owner bit as a transaction, so take the lock again. This is one of a variety of discussed options (see Link below): * instead of a shadow ring, embed the data in the slot itself, such as in tp_padding. But any test for this field may match a value left by userspace, causing deadlock. * avoid the lock on release. This leaves a small race if releasing the shadow slot before setting TP_STATUS_USER. The below reproducer showed that this race is not academic. If releasing the slot after tp_status, the race is more subtle. See the first link for details. * add a new tp_status TP_KERNEL_OWNED to avoid the transactional store of two fields. But, legacy applications may interpret all non-zero tp_status as owned by the user, as libpcap does. So this is possible only as an opt-in by newer processes. It can be added as an optional mode. * embed the struct at the tail of pg_vec to avoid extra allocation. The implementation proved no less complex than a separate field. The additional locking cost on release adds contention, no different than scaling on multicore or multiqueue h/w. In practice, neither the below reproducer nor small packet tcpdump showed a noticeable change in perf report in cycles spent in spinlock. 
Where contention is problematic, packet sockets support mitigation through PACKET_FANOUT. And we can consider adding opt-in state TP_KERNEL_OWNED. Easy to reproduce by running multiple netperf or similar TCP_STREAM flows concurrently with `tcpdump -B 129 -n greater 60000`. Based on an earlier patchset by Jon Rosen. See links below. I believe this issue goes back to the introduction of tpacket_rcv, which predates git history. Link: https://www.mail-archive.com/[email protected]/msg237222.html Suggested-by: Jon Rosen <[email protected]> Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: Jon Rosen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
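The acquire and release rules can be sketched as a single-threaded model. All names here are hypothetical, the status constants are placeholders rather than the real TP_STATUS_* values, and the receive queue lock is represented only by comments, so this shows the state transitions and not the actual synchronization.

```c
#include <assert.h>
#include <stdbool.h>

#define SLOTS_SKETCH 8

enum { STATUS_KERNEL_SKETCH = 0, STATUS_USER_SKETCH = 1 };

struct ring_sketch {
	unsigned int status[SLOTS_SKETCH]; /* models tp_status per slot */
	unsigned long owner_map;           /* models rx_owner_map, one bit per slot */
};

/* Caller is assumed to hold the receive queue lock. */
static bool slot_acquire(struct ring_sketch *r, unsigned int slot)
{
	unsigned long bit = 1UL << slot;

	if (r->status[slot] != STATUS_KERNEL_SKETCH || (r->owner_map & bit))
		return false; /* owned by userspace or by another writer */
	r->owner_map |= bit;
	return true;
}

/* Caller is assumed to retake the lock so both updates form one transaction. */
static void slot_release(struct ring_sketch *r, unsigned int slot)
{
	r->status[slot] = STATUS_USER_SKETCH;
	r->owner_map &= ~(1UL << slot);
}

static void owner_map_demo(void)
{
	struct ring_sketch r = { { 0 }, 0 };

	assert(slot_acquire(&r, 3));  /* first writer takes the slot */
	assert(!slot_acquire(&r, 3)); /* second writer is refused while copying */
	slot_release(&r, 3);
	assert(!slot_acquire(&r, 3)); /* now owned by userspace */
	r.status[3] = STATUS_KERNEL_SKETCH; /* userspace hands the slot back */
	assert(slot_acquire(&r, 3));
}
```

The second `slot_acquire()` call is the case the commit fixes: tp_status still reads as kernel-owned during the copy, and only the owner bit tells the second writer to stay away.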
2020-03-15	net: ip_gre: Separate ERSPAN newlink / changelink callbacks	Petr Machata	1	-18/+85
	ERSPAN shares most of the code path with GRE and gretap code. While that helps keep the code compact, it is also error prone. Currently a broken userspace can turn a gretap tunnel into a de facto ERSPAN one by passing IFLA_GRE_ERSPAN_VER. There has been a similar issue in ip6gretap in the past. To prevent these problems in future, split the newlink and changelink code paths. Split the ERSPAN code out of ipgre_netlink_parms() into a new function erspan_netlink_parms(). Extract a piece of common logic from ipgre_newlink() and ipgre_changelink() into ipgre_newlink_encap_setup(). Add erspan_newlink() and erspan_changelink(). Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN") Signed-off-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-15	cxgb4: fix delete filter entry fail in unload path	Shahjada Abul Husain	1	-2/+2
	Currently, the hardware TID index is assumed to start from index 0. However, with the following changeset, commit c21939998802 ("cxgb4: add support for high priority filters") hardware TID index can start after the high priority region, which has introduced a regression resulting in a failure to remove filter entries in the cxgb4 unload path. This patch fixes that. Fixes: c21939998802 ("cxgb4: add support for high priority filters") Signed-off-by: Shahjada Abul Husain <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-14	net: stmmac: platform: Fix misleading interrupt error msg	Markus Fuchs	1	-4/+10
	Not every stmmac based platform makes use of the eth_wake_irq or eth_lpi interrupts. Use the platform_get_irq_byname_optional variant for these interrupts, so no error message is displayed if they can't be found. Instead, print an informational message hinting that something might be wrong, to assist debugging on platforms which use these interrupts. Signed-off-by: Markus Fuchs <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-14	net/bpfilter: fix dprintf usage for /dev/kmsg	Bruno Meneguele	1	-6/+8
	The bpfilter UMH code was recently changed to log its informative messages to /dev/kmsg. However, this interface does not yet support SEEK_CUR, which dprintf() uses. As a result, dprintf() returns -EINVAL and doesn't log anything. Although supporting SEEK_CUR in the /dev/kmsg interface has been discussed in the past, those discussions were never concluded. Since the only user of that from a userspace perspective inside the kernel is the bpfilter UMH (userspace) module, it is better to correct it here instead of waiting for a conclusion on the interface. Fixes: 36c4357c63f3 ("net: bpfilter: print umh messages to /dev/kmsg") Signed-off-by: Bruno Meneguele <[email protected]> Signed-off-by: David S. Miller <[email protected]>
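	One way to sidestep the seek problem described above is to format the message in memory and hand it to the descriptor with a single write(), which never seeks. The following is a minimal user-space sketch in C, not necessarily what the patch itself does. The helper name log_line and the 256-byte buffer are assumptions made for illustration, and the descriptor is supplied by the caller (in bpfilter it would be the one opened on /dev/kmsg).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: format into a stack buffer, then emit the whole
 * record with one write(). No lseek() is involved at any point.
 * Returns the number of bytes written, or -1 on error.
 */
int log_line(int fd, const char *fmt, ...)
{
	char buf[256];	/* assumed upper bound on one log record */
	va_list ap;
	int len;

	assert(fmt != NULL);

	va_start(ap, fmt);
	len = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	if (len < 0)
		return -1;
	if ((size_t)len >= sizeof(buf))
		len = sizeof(buf) - 1;	/* message was truncated to fit */

	return (int)write(fd, buf, (size_t)len);
}
```

	A caller would use it like dprintf(), for example log_line(fd, "Started bpfilter\n"). Messages longer than the buffer are truncated rather than split across several writes, so each call still produces at most one record.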
2020-03-14	net_sched: keep alloc_hash updated after hash allocation	Cong Wang	1	-0/+1
	In commit 599be01ee567 ("net_sched: fix an OOB access in cls_tcindex") I moved cp->hash calculation before the first tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched. This difference could lead to another out-of-bounds access. cp->alloc_hash should always be the size allocated, so we should update it after this tcindex_alloc_perfect_hash(). Reported-and-tested-by: [email protected] Reported-and-tested-by: [email protected] Fixes: 599be01ee567 ("net_sched: fix an OOB access in cls_tcindex") Cc: Jamal Hadi Salim <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
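	The invariant stated above (the recorded size must always match what was actually allocated) can be illustrated outside the kernel. This is a minimal user-space sketch in C. The struct table and the functions table_alloc and table_get are hypothetical stand-ins, not the cls_tcindex code; only the field names hash and alloc_hash are borrowed from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the classifier state; not the kernel struct. */
struct table {
	int *slots;			/* the allocated array */
	unsigned int hash;		/* number of slots requested */
	unsigned int alloc_hash;	/* number of slots actually allocated */
};

/* Allocate t->hash slots and record the allocated size in the same step. */
int table_alloc(struct table *t)
{
	int *slots;

	if (t->hash == 0)
		return -1;
	slots = calloc(t->hash, sizeof(*slots));
	if (!slots)
		return -1;

	free(t->slots);
	t->slots = slots;
	t->alloc_hash = t->hash;	/* keep the bound in sync with the array */
	return 0;
}

/* Readers bound their index by alloc_hash, the size that really exists. */
int table_get(const struct table *t, unsigned int idx, int *out)
{
	assert(out != NULL);
	if (idx >= t->alloc_hash)
		return -1;
	*out = t->slots[idx];
	return 0;
}
```

	If table_alloc() skipped the alloc_hash assignment, table_get() would check indices against a stale bound, which is the kind of mismatch the commit message warns about.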
2020-03-14	net_sched: hold rtnl lock in tcindex_partial_destroy_work()	Cong Wang	1	-0/+2
	syzbot reported a use-after-free in tcindex_dump(). This is due to the lack of RTNL in the deferred rcu work. We queue this work with RTNL in tcindex_change(); later, tcindex_dump() is called:
		fh = tp->ops->get(tp, t->tcm_handle);
		...
		err = tp->ops->change(..., &fh, ...);
		tfilter_notify(..., fh, ...);
	but there is nothing to serialize the pending tcindex_partial_destroy_work() with tcindex_dump(). Fix this by simply holding RTNL in tcindex_partial_destroy_work(), so that it won't be called until RTNL is released after tc_new_tfilter() is completed. Reported-and-tested-by: [email protected] Fixes: 3d210534cc93 ("net_sched: fix a race condition in tcindex_destroy()") Cc: Jamal Hadi Salim <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-03-13	afs: Fix client call Rx-phase signal handling	David Howells	5	-67/+6
	Fix the handling of signals in client rxrpc calls made by the afs filesystem. Ignore signals completely, leaving call abandonment or connection loss to be detected by timeouts inside AF_RXRPC. Allowing a filesystem call to be interrupted after the entire request has been transmitted and an abort sent means that the server may or may not have done the action - and we don't know. It may even be worse than that for older servers. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <[email protected]>
2020-03-13	afs: Fix handling of an abort from a service handler	David Howells	3	-33/+26
	When an AFS service handler function aborts a call, AF_RXRPC marks the call as complete - which means that it's not going to get any more packets from the receiver. This is a problem because reception of the final ACK is what triggers afs_deliver_to_call() to drop the final ref on the afs_call object. Instead, aborted AFS service calls may then just sit around waiting for ever or until they're displaced by a new call on the same connection channel or a connection-level abort. Fix this by calling afs_set_call_complete() to finalise the afs_call struct representing the call. However, we then need to drop the ref that stops the call from being deallocated. We can do this in afs_set_call_complete(), as the work queue is holding a separate ref of its own, but then we shouldn't do it in afs_process_async_call() and afs_delete_async_call(). call->drop_ref is set to indicate that a ref needs dropping for a call and this is dealt with when we transition a call to AFS_CALL_COMPLETE. But then we also need to get rid of the ref that pins an asynchronous client call. We can do this by the same mechanism, setting call->drop_ref for an async client call too. We can also get rid of call->incoming since nothing ever sets it and only one thing ever checks it (futilely). 
A trace of the rxrpc_call and afs_call struct ref counting looks like:
	<idle>-0 [001] ..s5 164.764892: rxrpc_call: c=00000002 SEE u=3 sp=rxrpc_new_incoming_call+0x473/0xb34 a=00000000442095b5
	<idle>-0 [001] .Ns5 164.766001: rxrpc_call: c=00000002 QUE u=4 sp=rxrpc_propose_ACK+0xbe/0x551 a=00000000442095b5
	<idle>-0 [001] .Ns4 164.766005: rxrpc_call: c=00000002 PUT u=3 sp=rxrpc_new_incoming_call+0xa3f/0xb34 a=00000000442095b5
	<idle>-0 [001] .Ns7 164.766433: afs_call: c=00000002 WAKE u=2 o=11 sp=rxrpc_notify_socket+0x196/0x33c
	kworker/1:2-1810 [001] ...1 164.768409: rxrpc_call: c=00000002 SEE u=3 sp=rxrpc_process_call+0x25/0x7ae a=00000000442095b5
	kworker/1:2-1810 [001] ...1 164.769439: rxrpc_tx_packet: c=00000002 e9f1a7a8:95786a88:00000008:09c5 00000001 00000000 02 22 ACK CallAck
	kworker/1:2-1810 [001] ...1 164.769459: rxrpc_call: c=00000002 PUT u=2 sp=rxrpc_process_call+0x74f/0x7ae a=00000000442095b5
	kworker/1:2-1810 [001] ...1 164.770794: afs_call: c=00000002 QUEUE u=3 o=12 sp=afs_deliver_to_call+0x449/0x72c
	kworker/1:2-1810 [001] ...1 164.770829: afs_call: c=00000002 PUT u=2 o=12 sp=afs_process_async_call+0xdb/0x11e
	kworker/1:2-1810 [001] ...2 164.771084: rxrpc_abort: c=00000002 95786a88:00000008 s=0 a=1 e=1 K-1
	kworker/1:2-1810 [001] ...1 164.771461: rxrpc_tx_packet: c=00000002 e9f1a7a8:95786a88:00000008:09c5 00000002 00000000 04 00 ABORT CallAbort
	kworker/1:2-1810 [001] ...1 164.771466: afs_call: c=00000002 PUT u=1 o=12 sp=SRXAFSCB_ProbeUuid+0xc1/0x106
The abort generated in SRXAFSCB_ProbeUuid(), labelled "K-1", indicates that the local filesystem/cache manager didn't recognise the UUID as its own. Fixes: 2067b2b3f484 ("afs: Fix the CB.ProbeUuid service handler to reply correctly") Signed-off-by: David Howells <[email protected]>
2020-03-13	afs: Fix some tracing details	David Howells	2	-3/+3
	Fix a couple of tracelines to indicate the usage count after the atomic op, not the usage count before it, to be consistent with other afs and rxrpc trace lines. Change the wording of the afs_call_trace_work trace ID label from "WORK" to "QUEUE" to reflect the fact that it's queueing work, not doing work. Fixes: 341f741f04be ("afs: Refcount the afs_call struct") Signed-off-by: David Howells <[email protected]>
2020-03-13	rxrpc: Fix sendmsg(MSG_WAITALL) handling	David Howells	1	-2/+2
	Fix the handling of sendmsg() with MSG_WAITALL for userspace: round the timeout used when a signal occurs up to at least two jiffies, as a 1-jiffy timeout may end up being effectively 0 if jiffies wraps at the wrong time. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <[email protected]>
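	The reasoning behind the two-jiffy minimum can be shown with a simplified model: a timeout counted in ticks ends when the tick counter has advanced that many times, so a one-tick timeout armed just before the counter advances lasts almost no real time. The sketch below is user-space C under stated assumptions. TICK_US (a 1 ms tick), real_wait_us and clamp_timeout_ticks are hypothetical names for illustration, not rxrpc code, and the model ignores timer-wheel granularity.

```c
#include <assert.h>

#define TICK_US 1000UL	/* assumed tick length (HZ=1000); illustration only */

/*
 * Real time, in microseconds, until a timeout of n_ticks armed at now_us
 * expires, in a model where expiry happens on the n-th counter advance.
 */
unsigned long real_wait_us(unsigned long now_us, unsigned long n_ticks)
{
	unsigned long into_tick = now_us % TICK_US;

	assert(n_ticks > 0);
	/* The first advance is already partly "used up" by into_tick. */
	return n_ticks * TICK_US - into_tick;
}

/* Hypothetical helper mirroring the fix: never wait fewer than two ticks. */
unsigned long clamp_timeout_ticks(unsigned long n_ticks)
{
	return n_ticks < 2 ? 2 : n_ticks;
}
```

	In this model a one-tick timeout armed 999 µs into a tick lasts 1 µs, while the clamped two-tick timeout lasts 1001 µs, so at least one full tick of waiting is always guaranteed.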