Age  Commit message  Author  Files  Lines
2020-05-22selftests: netdevsim: Always initialize 'RET' variableIdo Schimmel1-0/+4
The variable is used by log_test() to check if the test case completed successfully or not. In case it is not initialized at the start of a test case, it is possible for the test case to fail despite not encountering any errors. Example: ``` ... TEST: Trap group statistics [ OK ] TEST: Trap policer [FAIL] Policer drop counter was not incremented TEST: Trap policer binding [FAIL] Policer drop counter was not incremented ``` Failure of trap_policer_test() caused trap_policer_bind_test() to fail as well. Fix by adding missing initialization of the variable. Fixes: 5fbff58e27a1 ("selftests: netdevsim: Add test cases for devlink-trap policers") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-22netdevsim: Ensure policer drop counter always increasesIdo Schimmel1-2/+1
In case the policer drop counter is retrieved when the jiffies value is a multiple of 64, the counter will not be incremented. This randomly breaks a selftest [1] that reads the counter twice and checks that it was incremented: ``` TEST: Trap policer [FAIL] Policer drop counter was not incremented ``` Fix by always incrementing the counter by 1. [1] tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh Fixes: ad188458d012 ("netdevsim: Add devlink-trap policer support") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
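The failure mode described above can be reproduced in a small userspace sketch. The function names and the `ticks_t` type are hypothetical stand-ins (the driver code itself is not shown in this log); the only assumption taken from the message is that the old increment was derived from the tick count modulo 64, and that the fix always advances the counter by 1.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's jiffies tick counter. */
typedef uint64_t ticks_t;

/*
 * Old behaviour as described in the message: the increment is
 * (ticks % 64), which is zero whenever the tick count is a multiple
 * of 64, so two consecutive reads can return the same value.
 */
uint64_t policer_drops_old(uint64_t *cnt, ticks_t now)
{
	*cnt += now % 64;
	return *cnt;
}

/*
 * Fixed behaviour as described: every read advances the counter by
 * exactly 1, so a second read is always larger than the first.
 */
uint64_t policer_drops_new(uint64_t *cnt)
{
	return (*cnt)++;
}
```

With the old variant, two reads taken at tick values 128 and 192 return the same number, which is exactly what the selftest flags as "not incremented". The new variant cannot produce equal consecutive reads.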
2020-05-22Merge tag 'rxrpc-fixes-20200520' of ↵David S. Miller17-159/+335
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fix retransmission timeout and ACK discard Here are a couple of fixes and an extra tracepoint for AF_RXRPC: (1) Calculate the RTO pretty much as TCP does, rather than making something up, including an initial 4s timeout (which causes return probes from the fileserver to fail if a packet goes missing), and add backoff. (2) Fix the discarding of out-of-order received ACKs. We mustn't let the hard-ACK point regress, nor do we want to do unnecessary retransmission because the soft-ACK list regresses. This is not trivial, however, due to some loose wording in various old protocol specs, the ACK field that should be used for this sometimes has the wrong information in it. (3) Add a tracepoint to log a discarded ACK. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-22net/ethernet/freescale: rework quiesce/activate for ucc_gethValentin Longchamp1-6/+7
ugeth_quiesce/activate are used to halt the controller when there is a link change that requires reconfiguring the mac. The previous implementation called netif_device_detach(). This however causes the initial activation of the netdevice to fail precisely because it's detached. For details, see [1]. A possible workaround was the revert of commit net: linkwatch: add check for netdevice being present to linkwatch_do_dev However, the check introduced in the above commit is correct and shall be kept. The netif_device_detach() is thus replaced with netif_tx_stop_all_queues() that prevents any transmission. This allows performing the mac config change required by the link change, without detaching the corresponding netdevice and thus not preventing its initial activation. [1] https://lists.openwall.net/netdev/2020/01/08/201 Signed-off-by: Valentin Longchamp <[email protected]> Acked-by: Matteo Ghidoni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-22sctp: Start shutdown on association restart if in SHUTDOWN-SENT state and ↵Jere Leppänen1-4/+5
socket is closed Commit bdf6fa52f01b ("sctp: handle association restarts when the socket is closed.") starts shutdown when an association is restarted, if in SHUTDOWN-PENDING state and the socket is closed. However, the rationale stated in that commit applies also when in SHUTDOWN-SENT state - we don't want to move an association to ESTABLISHED state when the socket has been closed, because that results in an association that is unreachable from user space. The problem scenario: 1. Client crashes and/or restarts. 2. Server (using one-to-one socket) calls close(). SHUTDOWN is lost. 3. Client reconnects using the same addresses and ports. 4. Server's association is restarted. The association and the socket move to ESTABLISHED state, even though the server process has closed its descriptor. Also, after step 4 when the server process exits, some resources are leaked in an attempt to release the underlying inet sock structure in ESTABLISHED state: IPv4: Attempt to release TCP socket in state 1 00000000377288c7 Fix by acting the same way as in SHUTDOWN-PENDING state. That is, if an association is restarted in SHUTDOWN-SENT state and the socket is closed, then start shutdown and don't move the association or the socket to ESTABLISHED state. Fixes: bdf6fa52f01b ("sctp: handle association restarts when the socket is closed.") Signed-off-by: Jere Leppänen <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-22tipc: block BH before using dst_cacheEric Dumazet1-1/+5
dst_cache_get() documents it must be used with BH disabled. syzbot reported : BUG: using smp_processor_id() in preemptible [00000000] code: /21697 caller is dst_cache_get+0x3a/0xb0 net/core/dst_cache.c:68 CPU: 0 PID: 21697 Comm: Not tainted 5.7.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 check_preemption_disabled lib/smp_processor_id.c:47 [inline] debug_smp_processor_id.cold+0x88/0x9b lib/smp_processor_id.c:57 dst_cache_get+0x3a/0xb0 net/core/dst_cache.c:68 tipc_udp_xmit.isra.0+0xb9/0xad0 net/tipc/udp_media.c:164 tipc_udp_send_msg+0x3e6/0x490 net/tipc/udp_media.c:244 tipc_bearer_xmit_skb+0x1de/0x3f0 net/tipc/bearer.c:526 tipc_enable_bearer+0xb2f/0xd60 net/tipc/bearer.c:331 __tipc_nl_bearer_enable+0x2bf/0x390 net/tipc/bearer.c:995 tipc_nl_bearer_enable+0x1e/0x30 net/tipc/bearer.c:1003 genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline] genl_family_rcv_msg net/netlink/genetlink.c:718 [inline] genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469 genl_rcv+0x24/0x40 net/netlink/genetlink.c:746 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362 ___sys_sendmsg+0x100/0x170 net/socket.c:2416 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x45ca29 Fixes: e9c1a793210f ("tipc: add dst_cache support for udp media") Cc: Xin Long <[email protected]> Cc: Jon Maloy <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-22net: mvpp2: fix RX hashing for non-10G portsRussell King1-1/+1
When rxhash is enabled on any ethernet port except the first in each CP block, traffic flow is prevented. The analysis is below: I've been investigating this afternoon, and what I've found, comparing a kernel without 895586d5dc32 and with 895586d5dc32 applied is: - The table programmed into the hardware via mvpp22_rss_fill_table() appears to be identical with or without the commit. - When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports that c2.attr[0] and c2.attr[2] are written back containing: - with 895586d5dc32, failing: 00200000 40000000 - without 895586d5dc32, working: 04000000 40000000 - When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as: 04000000 00000000 The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the first value is the queue number, which comprises two fields. The high 5 bits are 24:29 and the low three are 21:23 inclusive. This comes from: c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) | MVPP22_CLS_C2_ATTR0_QLOW(ql); So, the working case gives eth2 a queue id of 4.0, or 32 as per port->first_rxq, and the non-working case a queue id of 0.1, or 1. The allocation of queue IDs seems to be in mvpp2_port_probe(): if (priv->hw_version == MVPP21) port->first_rxq = port->id * port->nrxqs; else port->first_rxq = port->id * priv->max_port_rxqs; Where: if (priv->hw_version == MVPP21) priv->max_port_rxqs = 8; else priv->max_port_rxqs = 32; Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1 (eth2) be 32. It seems the idea is that the first 32 queues belong to port 0, the second 32 queues belong to port 1, etc. mvpp2_rss_port_c2_enable() gets the queue number from its parameter, 'ctx', which comes from mvpp22_rss_ctx(port, 0). This returns port->rss_ctx[0]. mvpp22_rss_context_create() is responsible for allocating that, which it does by looking for an unallocated priv->rss_tables[] pointer. This table is shared amongst all ports on the CP silicon.
When we write the tables in mvpp22_rss_fill_table(), the RSS table entry is defined by: u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) | MVPP22_RSS_INDEX_TABLE_ENTRY(i); where rss_ctx is the context ID (queue number) and i is the index in the table. If we look at what is written: - The first table to be written has "sel" values of 00000000..0000001f, containing values 0..3. This appears to be for eth1. This is table 0, RX queue number 0. - The second table has "sel" values of 00000100..0000011f, and appears to be for eth2. These contain values 0x20..0x23. This is table 1, RX queue number 0. - The third table has "sel" values of 00000200..0000021f, and appears to be for eth3. These contain values 0x40..0x43. This is table 2, RX queue number 0. How do queue numbers translate to the RSS table? There is another table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE field of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE register. Before 895586d5dc32, it was: mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(port->first_rxq)); mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(port->id)); and after: mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx)); mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx)); Before the commit, for eth2, that would've contained '32' for the index and '1' for the table pointer - mapping queue 32 to table 1. Remember that this is queue-high.queue-low of 4.0. After the commit, we appear to map queue 1 to table 1. That again looks fine on the face of it. Section 9.3.1 of the A8040 manual seems to indicate the reason that the queue number is separated. queue-low seems to always come from the classifier, whereas queue-high can be from the ingress physical port number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG. We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register...
and this seems to be where our bug comes from. mvpp2_cls_oversize_rxq_set() sets this up as: mvpp2_write(port->priv, MVPP2_CLS_SWFWD_P2HQ_REG(port->id), (port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS)); val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG); val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id); mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val); Setting the MVPP2_CLS_SWFWD_PCTRL_MASK bit means that the queue-high for eth2 is _always_ 4, so only queues 32 through 39 inclusive are available to eth2. Yet, we're trying to tell the classifier to set queue-high, which will be ignored, to zero. Hence, the queue-high field (MVPP22_CLS_C2_ATTR0_QHIGH()) from the classifier will be ignored. This means we end up directing traffic from eth2 not to queue 1, but to queue 33, and then we tell it to look up queue 33 in the RSS table. However, RSS table has not been programmed for queue 33, and so it ends up (presumably) dropping the packets. It seems that mvpp22_rss_context_create() doesn't take account of the fact that the upper 5 bits of the queue ID can't actually be changed due to the settings in mvpp2_cls_oversize_rxq_set(), _or_ it seems that mvpp2_cls_oversize_rxq_set() has been missed in this commit. Either way, these two functions mutually disagree with what queue number should be used. Looking deeper into what mvpp2_cls_oversize_rxq_set() and the MTU validation is doing, it seems that MVPP2_CLS_SWFWD_P2HQ_REG() is used for over-sized packets attempting to egress through this port. With the classifier having had RSS enabled and directing eth2 traffic to queue 1, we may still have packets appearing on queue 32 for this port. However, the only way we may end up with over-sized packets attempting to egress through eth2 - is if the A8040 forwards frames between its ports. From what I can see, we don't support that feature, and the kernel restricts the egress packet size to the MTU. 
In any case, if we were to attempt to transmit an oversized packet, we have no support in the kernel to deal with that appearing in the port's receive queue. So, this patch attempts to solve the issue by clearing the MVPP2_CLS_SWFWD_PCTRL_MASK() bit, allowing MVPP22_CLS_C2_ATTR0_QHIGH() from the classifier to define the queue-high field of the queue number. My testing seems to confirm my findings above - clearing this bit means that if I enable rxhash on eth2, the interface can then pass traffic, as we are now directing traffic to RX queue 1 rather than queue 33. Traffic still seems to work with rxhash off as well. Reported-by: Matteo Croce <[email protected]> Tested-by: Matteo Croce <[email protected]> Fixes: 895586d5dc32 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables") Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
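The queue arithmetic in the analysis above can be checked with a short standalone sketch. The macro and function names here are hypothetical; the field positions (queue-low in bits 21..23, queue-high starting at bit 24) and the 3-bit width of queue-low are taken from the message, while the 5-bit mask for queue-high is an assumption based on its "high 5 bits" wording.

```c
#include <stdint.h>

/* Field layout as described in the message; names are illustrative. */
#define QLOW_SHIFT   21
#define QLOW_MASK    0x7u
#define QHIGH_SHIFT  24
#define QHIGH_MASK   0x1fu   /* assumption: 5-bit queue-high field */
#define QLOW_BITS    3

/* Extract the queue-low field from a c2.attr[0] value. */
unsigned int attr0_qlow(uint32_t attr0)
{
	return (attr0 >> QLOW_SHIFT) & QLOW_MASK;
}

/* Extract the queue-high field from a c2.attr[0] value. */
unsigned int attr0_qhigh(uint32_t attr0)
{
	return (attr0 >> QHIGH_SHIFT) & QHIGH_MASK;
}

/*
 * Compute the RX queue the hardware ends up using. When the per-port
 * control bit is set (port_forces_qhigh != 0), queue-high is taken
 * from the port setting (first_rxq >> 3) and the classifier's
 * queue-high is ignored; queue-low always comes from the classifier.
 */
unsigned int effective_rxq(uint32_t attr0, unsigned int first_rxq,
			   int port_forces_qhigh)
{
	unsigned int qh = port_forces_qhigh ? (first_rxq >> QLOW_BITS)
					    : attr0_qhigh(attr0);

	return (qh << QLOW_BITS) | attr0_qlow(attr0);
}
```

Feeding in the values from the message: 0x04000000 decodes to 4.0, which is queue 32; 0x00200000 decodes to 0.1, which is queue 1 when the classifier controls queue-high, but queue 33 for eth2 (first_rxq = 32) while the port bit forces queue-high to 4. That is the mismatch the patch removes by clearing the bit.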
2020-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller5-11/+69
Daniel Borkmann says: ==================== pull-request: bpf 2020-05-22 The following pull-request contains BPF updates for your *net* tree. We've added 3 non-merge commits during the last 3 day(s) which contain a total of 5 files changed, 69 insertions(+), 11 deletions(-). The main changes are: 1) Fix to reject mmap()'ing read-only array maps as writable since BPF verifier relies on such map content to be frozen, from Andrii Nakryiko. 2) Fix breaking audit from secid_to_secctx() LSM hook by avoiding to use call_int_hook() since this hook is not stackable, from KP Singh. 3) Fix BPF flow dissector program ref leak on netns cleanup, from Jakub Sitnicki. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-22felix: Fix initialization of ioremap resourcesClaudiu Manoil3-27/+24
The caller of devm_ioremap_resource(), either accidentally or by wrong assumption, is writing back derived resource data to global static resource initialization tables that should have been constant. Meaning that after it computes the final physical start address it saves the address for no reason in the static tables. This doesn't affect the first driver probing after reboot, but it breaks consecutive driver reloads (i.e. driver unbind & bind) because the initialization tables no longer have the correct initial values. So the next probe() will map the device registers to wrong physical addresses, causing ARM SError async exceptions. This patch fixes all of the above. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Claudiu Manoil <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-22mptcp: use untruncated hash in ADD_ADDR HMACTodd Malsbary4-25/+24
There is some ambiguity in the RFC as to whether the ADD_ADDR HMAC is the rightmost 64 bits of the entire hash or of the leftmost 160 bits of the hash. The intention, as clarified with the author of the RFC, is the entire hash. This change returns the entire hash from mptcp_crypto_hmac_sha (instead of only the first 160 bits), and moves any truncation/selection operation on the hash to the caller. Fixes: 12555a2d97e5 ("mptcp: use rightmost 64 bits in ADD_ADDR HMAC") Reviewed-by: Christoph Paasch <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Todd Malsbary <[email protected]> Signed-off-by: David S. Miller <[email protected]>
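The two readings of the RFC differ only in which bytes of the digest are selected. The sketch below is a hypothetical illustration of that difference, assuming a 32-byte SHA-256 digest and a big-endian interpretation of the selected bytes; `rightmost64()` is not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define DIGEST_LEN     32  /* SHA-256 output size in bytes */
#define TRUNCATED_LEN  20  /* leftmost 160 bits */

/*
 * Return the last 8 bytes of buf[0..len) as a big-endian 64-bit value.
 * The caller must pass len >= 8.
 */
uint64_t rightmost64(const uint8_t *buf, size_t len)
{
	uint64_t v = 0;
	size_t i;

	for (i = len - 8; i < len; i++)
		v = (v << 8) | buf[i];
	return v;
}
```

Calling `rightmost64(digest, DIGEST_LEN)` selects bytes 24..31, which is the interpretation the message says was intended. Calling `rightmost64(digest, TRUNCATED_LEN)` selects bytes 12..19, the reading this change moves away from. Because the helper now receives the whole hash, the choice is made by the caller.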
2020-05-22Merge tag 'io_uring-5.7-2020-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds1-26/+34
Pull io_uring fixes from Jens Axboe: "A small collection of small fixes that should go into this release: - Two fixes for async request preparation (Pavel) - Busy clear fix for SQPOLL (Xiaoguang) - Don't use kiocb->private for O_DIRECT buf index, some file systems use it (Bijan) - Kill dead check in io_splice() - Ensure sqo_wait is initialized early - Cancel task_work if we fail adding to original process - Only add (IO)pollable requests to iopoll list, fixing a regression in this merge window" * tag 'io_uring-5.7-2020-05-22' of git://git.kernel.dk/linux-block: io_uring: reset -EBUSY error when io sq thread is waken up io_uring: don't add non-IO requests to iopoll pending list io_uring: don't use kiocb.private to store buf_index io_uring: cancel work if task_work_add() fails io_uring: remove dead check in io_splice() io_uring: fix FORCE_ASYNC req preparation io_uring: don't prepare DRAIN reqs twice io_uring: initialize ctx->sqo_wait earlier
2020-05-22Merge tag 'block-5.7-2020-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds2-0/+11
Pull block fixes from Jens Axboe: "Two fixes for null_blk zone mode" * tag 'block-5.7-2020-05-22' of git://git.kernel.dk/linux-block: null_blk: don't allow discard for zoned mode null_blk: return error for invalid zone size
2020-05-22Merge tag 'riscv-for-linus-5.7-rc7' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "Two fixes: - Another !MMU build fix that was a straggler from last week - A fix to use the "register" keyword for the GP global register variable" * tag 'riscv-for-linus-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: gp_in_global needs register keyword riscv: Fix print_vm_layout build error if NOMMU
2020-05-22Merge tag 'efi-fixes-for-v5.7-rc6' of ↵Borislav Petkov12-39/+124
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent Pull EFI fixes from Ard Biesheuvel: "- fix EFI framebuffer earlycon for wide fonts - avoid filling screen_info with garbage if the EFI framebuffer is not available - fix a potential host tool build error due to a symbol clash on x86 - work around an EFI firmware bug regarding the binary format of the TPM final events table - fix a missing memory free by reworking the E820 table sizing routine to not do the allocation in the first place - add CPER parsing for firmware errors"
2020-05-22x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasksJosh Poimboeuf1-0/+7
Normally, show_trace_log_lvl() scans the stack, looking for text addresses to print. In parallel, it unwinds the stack with unwind_next_frame(). If the stack address matches the pointer returned by unwind_get_return_address_ptr() for the current frame, the text address is printed normally without a question mark. Otherwise it's considered a breadcrumb (potentially from a previous call path) and it's printed with a question mark to indicate that the address is unreliable and typically can be ignored. Since the following commit: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") ... for inactive tasks, show_trace_log_lvl() prints *only* unreliable addresses (prepended with '?'). That happens because, for the first frame of an inactive task, unwind_get_return_address_ptr() returns the wrong return address pointer: one word *below* the task stack pointer. show_trace_log_lvl() starts scanning at the stack pointer itself, so it never finds the first 'reliable' address, causing only guesses to be printed. The first frame of an inactive task isn't a normal stack frame. It's actually just an instance of 'struct inactive_task_frame' which is left behind by __switch_to_asm(). Now that this inactive frame is actually exposed to callers, fix unwind_get_return_address_ptr() to interpret it properly. Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Reported-by: Tetsuo Handa <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/20200522135435.vbxs7umku5pyrdbk@treble
2020-05-22Merge tag 'arm64-fixes' of ↵Linus Torvalds2-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Bring the PTRACE_SYSEMU semantics in line with the man page. - Annotate variable assignment in get_user() with the type to avoid sparse warnings. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Add get_user() type annotation on the !access_ok() path arm64: Fix PTRACE_SYSEMU semantics
2020-05-22Merge tag 'sound-5.7-rc7' of ↵Linus Torvalds3-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Just a few small fixes: the only significant one is a slight improvement for PCM running position update with no-period-elapsed case while the rest are HD-audio fixups and ice1712 model quirk" * tag 'sound-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - Add more fixup entries for Clevo machines ALSA: iec1712: Initialize STDSP24 properly when using the model=staudio option ALSA: hda/realtek - Fix silent output on Gigabyte X570 Aorus Xtreme ALSA: pcm: fix incorrect hw_base increase
2020-05-22arm64: Add get_user() type annotation on the !access_ok() pathAl Viro1-1/+1
Sparse reports "Using plain integer as NULL pointer" when the arm64 __get_user_error() assigns 0 to a pointer type. Use proper type annotation. Signed-off-by: Al Viro <[email protected]> Reported-by: kbuild test robot <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-05-22Merge tag 'powerpc-5.7-5' of ↵Linus Torvalds4-14/+19
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - a revert of a recent change to the PTE bits for 32-bit BookS, which broke swap. - a "fix" to disable STRICT_KERNEL_RWX for 64-bit in Kconfig, as it's causing crashes for some people. Thanks to Christophe Leroy and Rui Salvaterra. * tag 'powerpc-5.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Disable STRICT_KERNEL_RWX Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits."
2020-05-22misc: rtsx: Add short delay after exit from ASPMKlaus Doth1-0/+3
DMA transfers to and from the SD card stall for 10 seconds and run into timeout on RTS5260 card readers after ASPM was enabled. Adding a short msleep after disabling ASPM fixes the issue on several Dell Precision 7530/7540 systems I tested. This function is only called when waking up after the chip went into power-save after not transferring data for a few seconds. The added msleep does therefore not change anything in data transfer speed or induce any excessive waiting while data transfers are running, or the chip is sleeping. Only the transition from sleep to active is affected. Signed-off-by: Klaus Doth <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-05-21flow_dissector: Drop BPF flow dissector prog ref on netns cleanupJakub Sitnicki1-5/+21
When attaching a flow dissector program to a network namespace with bpf(BPF_PROG_ATTACH, ...) we grab a reference to bpf_prog. If netns gets destroyed while a flow dissector is still attached, and there are no other references to the prog, we leak the reference and the program remains loaded. Leak can be reproduced by running flow dissector tests from selftests/bpf: # bpftool prog list # ./test_flow_dissector.sh ... selftests: test_flow_dissector [PASS] # bpftool prog list 4: flow_dissector name _dissect tag e314084d332a5338 gpl loaded_at 2020-05-20T18:50:53+0200 uid 0 xlated 552B jited 355B memlock 4096B map_ids 3,4 btf_id 4 # Fix it by detaching the flow dissector program when netns is going away. Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") Signed-off-by: Jakub Sitnicki <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Reviewed-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-05-22Merge tag 'amd-drm-fixes-5.7-2020-05-21' of ↵Dave Airlie18-95/+126
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.7-2020-05-21: amdgpu: - DP fix - Floating point fix - Fix cursor stutter issue Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-05-21net: sgi: ioc3-eth: Fix return value check in ioc3eth_probe()Tang Bin1-4/+4
If devm_platform_ioremap_resource() fails to get the resource, it returns an ERR_PTR() value, not NULL. The NULL check must therefore be replaced by IS_ERR(), or else it may result in crashes if a critical error path is encountered. Fixes: 0ce5ebd24d25 ("mfd: ioc3: Add driver for SGI IOC3 chip") Signed-off-by: Zhang Shengju <[email protected]> Signed-off-by: Tang Bin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
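Why a NULL test misses this kind of failure is easier to see with a userspace imitation of the kernel's error-pointer convention. The helpers below are lowercase look-alikes written for illustration, not the kernel macros themselves, and `map_resource()` is a hypothetical function standing in for any call that reports failure through an encoded pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an encoded pointer. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr lies in the top MAX_ERRNO addresses, i.e. encodes an error. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping helper: returns an encoded error, never NULL. */
void *map_resource(int fail)
{
	static int fake_regs;

	return fail ? err_ptr(-ENOMEM) : (void *)&fake_regs;
}
```

A failed `map_resource(1)` returns a non-NULL pointer, so a check of the form `if (!ptr)` lets the error through and the caller goes on to dereference an invalid address. Only `is_err(ptr)` catches it, and `ptr_err(ptr)` then yields the error code to return.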
2020-05-21net: don't return invalid table id error when we fall back to PF_UNSPECSabrina Dubroca5-6/+4
In case we can't find a ->dumpit callback for the requested (family,type) pair, we fall back to (PF_UNSPEC,type). In effect, we're in the same situation as if userspace had requested a PF_UNSPEC dump. For RTM_GETROUTE, that handler is rtnl_dump_all, which calls all the registered RTM_GETROUTE handlers. The requested table id may or may not exist for all of those families. commit ae677bbb4441 ("net: Don't return invalid table id error when dumping all families") fixed the problem when userspace explicitly requests a PF_UNSPEC dump, but missed the fallback case. For example, when we pass ipv6.disable=1 to a kernel with CONFIG_IP_MROUTE=y and CONFIG_IP_MROUTE_MULTIPLE_TABLES=y, the (PF_INET6, RTM_GETROUTE) handler isn't registered, so we end up in rtnl_dump_all, and listing IPv6 routes will unexpectedly print: # ip -6 r Error: ipv4: MR table does not exist. Dump terminated commit ae677bbb4441 introduced the dump_all_families variable, which gets set when userspace requests a PF_UNSPEC dump. However, we can't simply set the family to PF_UNSPEC in rtnetlink_rcv_msg in the fallback case to get dump_all_families == true, because some message types (for example RTM_GETRULE and RTM_GETNEIGH) only register the PF_UNSPEC handler and use the family to filter in the kernel what is dumped to userspace. We would then export more entries, that userspace would have to filter. iproute does that, but other programs may not. Instead, this patch removes dump_all_families and updates the RTM_GETROUTE handlers to check if the family that is being dumped is their own. When it's not, which covers both the intentional PF_UNSPEC dumps (as dump_all_families did) and the fallback case, ignore the missing table id error. Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps") Signed-off-by: Sabrina Dubroca <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-22Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux ↵Dave Airlie2-2/+4
into drm-fixes two fixes: - memory leak fix when userspace passes an invalid softpin address - off-by-one crashing the kernel in the perfmon domain iteration when the GPU core has both 2D and 3D capabilities Signed-off-by: Dave Airlie <[email protected]> From: Lucas Stach <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-05-21net: ipip: fix wrong address family in init error pathVadim Fedorenko1-1/+1
In case of error with MPLS support the code is misusing AF_INET instead of AF_MPLS. Fixes: 1b69e7e6c4da ("ipip: support MPLS over IPv4") Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21Merge branch 'net-tls-fix-encryption-error-path'David S. Miller1-7/+10
Vadim Fedorenko says: ==================== net/tls: fix encryption error path The problem with data stream corruption was found in KTLS transmit path with small socket send buffers and large amount of data. bpf_exec_tx_verdict() frees open record on any type of error including EAGAIN, ENOMEM and ENOSPC while callers are able to recover from these transient errors. Also, the wrong error code was returned to user space in that case. This patchset fixes the problems. ==================== Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net/tls: free record only on encryption errorVadim Fedorenko1-2/+4
We cannot free the record on a transient error because that leads to losing previous data. Check the socket error to know whether the record must be freed or not. Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error") Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net/tls: fix encryption error checkingVadim Fedorenko1-5/+6
bpf_exec_tx_verdict() can return negative value for copied variable. In that case this value will be pushed back to caller and the real error code will be lost. Fix it using signed type and checking for positive value. Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error") Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
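The effect of the type change can be shown with a small sketch. Both functions are hypothetical reductions of a send path's final return statement, assuming the pattern "return the byte count if anything was copied, otherwise the error code"; they are not the kernel code.

```c
#include <stddef.h>

/*
 * Buggy variant: the count is unsigned, so a value that went negative
 * has wrapped to a huge non-zero number. The "anything copied?" test
 * succeeds and the real error code in ret is never returned.
 */
long send_result_unsigned(size_t copied, int ret)
{
	return copied ? (long)copied : ret;
}

/*
 * Fixed variant: the count is signed and only a strictly positive
 * value counts as progress, so a negative count falls through to ret.
 */
long send_result_signed(long copied, int ret)
{
	return copied > 0 ? copied : ret;
}
```

With a count that has dropped to -100 and an error code of -EAGAIN, the unsigned variant hands back the wrapped count instead of -EAGAIN, while the signed variant returns -EAGAIN as the caller expects.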
2020-05-21Merge branch 'net-ethernet-ti-fix-some-return-value-check'David S. Miller4-6/+7
Wei Yongjun says: ==================== net: ethernet: ti: fix some return value check This patchset converts cpsw_ale_create() to return PTR_ERR() only, and changes all the callers to check IS_ERR() instead of NULL. Since v2: 1) rebased on net.git, as Jakub suggested 2) split am65-cpsw-nuss.c changes, as Grygorii suggested ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-21net: ethernet: ti: am65-cpsw-nuss: fix error handling of am65_cpsw_nuss_probeWei Yongjun1-1/+2
Convert to using IS_ERR() instead of NULL test for cpsw_ale_create() error handling. Also return a negative error code from this error handling case instead of 0. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net: ethernet: ti: fix some return value check of cpsw_ale_create()Wei Yongjun3-5/+5
cpsw_ale_create() can return both NULL and PTR_ERR(), but all of the callers only check for NULL in their error handling. This patch converts it to return only PTR_ERR() in all error cases, and the callers to use IS_ERR() instead of a NULL test. Fixes: 4b41d3436796 ("net: ethernet: ti: cpsw: allow untagged traffic on host port") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net: qrtr: Fix passing invalid reference to qrtr_local_enqueue()Manivannan Sadhasivam1-1/+1
Once the traversal of the list is completed with list_for_each_entry(), the iterator (node) will point to an invalid object. So passing this to qrtr_local_enqueue(), which is outside of the iterator block, is erroneous even though the object is not used. Fix this by passing NULL to qrtr_local_enqueue(). Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Reported-by: kbuild test robot <[email protected]> Reported-by: Julia Lawall <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
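The point about the iterator being invalid after a completed walk can be shown with a small userspace sketch of an intrusive circular list. The `struct list_head` and `container_of()` definitions here are minimal re-implementations for the example, and `struct node` and `find_node()` are hypothetical names.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct node {
    int id;
    struct list_head item;
};

/* Return the node with a matching id, or NULL when the walk ends
 * without a match. */
static struct node *find_node(struct list_head *head, int id)
{
    struct list_head *pos;

    for (pos = head->next; pos != head; pos = pos->next) {
        struct node *n = container_of(pos, struct node, item);

        if (n->id == id)
            return n;
    }

    /* Here pos == head. container_of(pos, struct node, item) would be
     * computed from the list head, which is not embedded in any node,
     * so the result must not be used as a node. */
    return NULL;
}
```

Returning or passing NULL after the loop makes the "no entry" case explicit instead of leaking a pointer derived from the list head.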
2020-05-21ethtool: count header size in reply size estimateMichal Kubecek2-3/+2
As ethnl_request_ops::reply_size handlers do not include common header size into calculated/estimated reply size, it needs to be added in ethnl_default_doit() and ethnl_default_notify() before allocating the message. On the other hand, strset_reply_size() should not add common header size. Fixes: 728480f12442 ("ethtool: default handlers for GET requests") Reported-by: Oleksij Rempel <[email protected]> Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21RDMA/mlx5: Fix NULL pointer dereference in destroy_prefetch_workMaor Gottlieb1-0/+1
q_deferred_work isn't initialized when creating an explicit ODP memory region. This can lead to a NULL pointer dereference when user performs asynchronous prefetch MR. Fix it by initializing q_deferred_work for explicit ODP. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 4 PID: 6074 Comm: kworker/u16:6 Not tainted 5.7.0-rc1-for-upstream-perf-2020-04-17_07-03-39-64 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound mlx5_ib_prefetch_mr_work [mlx5_ib] RIP: 0010:__wake_up_common+0x49/0x120 Code: 04 89 54 24 0c 89 4c 24 08 74 0a 41 f6 01 04 0f 85 8e 00 00 00 48 8b 47 08 48 83 e8 18 4c 8d 67 08 48 8d 50 18 49 39 d4 74 66 <48> 8b 70 18 31 db 4c 8d 7e e8 eb 17 49 8b 47 18 48 8d 50 e8 49 8d RSP: 0000:ffffc9000097bd88 EFLAGS: 00010082 RAX: ffffffffffffffe8 RBX: ffff888454cd9f90 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff888454cd9f90 RBP: ffffc9000097bdd0 R08: 0000000000000000 R09: ffffc9000097bdd0 R10: 0000000000000000 R11: 0000000000000001 R12: ffff888454cd9f98 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff88846fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000044c19e002 CR4: 0000000000760ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: __wake_up_common_lock+0x7a/0xc0 destroy_prefetch_work+0x5a/0x60 [mlx5_ib] mlx5_ib_prefetch_mr_work+0x64/0x80 [mlx5_ib] process_one_work+0x15b/0x360 worker_thread+0x49/0x3d0 kthread+0xf5/0x130 ? rescuer_thread+0x310/0x310 ? 
kthread_bind+0x10/0x10 ret_from_fork+0x1f/0x30 Fixes: de5ed007a03d ("IB/mlx5: Fix implicit ODP race") Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-05-21Merge tag 'apparmor-pr-2020-05-21' of ↵Linus Torvalds3-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor bug fixes from John Johansen: - Fix use-after-free in aa_audit_rule_init - Fix refcnt leak in policy_update - Fix potential label refcnt leak in aa_change_profile * tag 'apparmor-pr-2020-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: Fix use-after-free in aa_audit_rule_init apparmor: Fix aa_label refcnt leak in policy_update apparmor: fix potential label refcnt leak in aa_change_profile
2020-05-21exfat: add the dummy mount options to be backward compatible with staging/exfatNamjae Jeon1-0/+19
As Ubuntu and Fedora released new versions that use a kernel version equal to or higher than v5.4, they started to support the kernel exfat filesystem. Linus reported a mount error with the new version of exfat on Fedora: exfat: Unknown parameter 'namecase' This is because there is a difference in mount options between the old staging/exfat and the new exfat. The utf8, debug, and codepage options, as well as namecase, have been removed from the new exfat. This patch adds the dummy mount options as deprecated options to be backward compatible with the old one. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Namjae Jeon <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Al Viro <[email protected]> Cc: Eric Sandeen <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-05-21apparmor: Fix use-after-free in aa_audit_rule_initNavid Emamdoost1-1/+2
In the implementation of aa_audit_rule_init(), when aa_label_parse() fails the allocated memory for rule is released using aa_audit_rule_free(). But after this release, the return statement tries to access the label field of the rule which results in use-after-free. Before releasing the rule, copy errNo and return it after release. Fixes: 52e8c38001d8 ("apparmor: Fix memory leak of rule on error exit path") Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: John Johansen <[email protected]>
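A minimal userspace sketch of the "copy the error, then free" pattern follows. `struct rule` and `rule_init()` are hypothetical and only model the shape of the problem; storing the error code in a field is an assumption made for the example.

```c
#include <errno.h>
#include <stdlib.h>

struct rule {
    long label_err; /* stands in for an error stored inside the rule */
};

/* On parse failure, read the error out of the rule before freeing it. */
static int rule_init(int parse_fails, struct rule **out)
{
    struct rule *rule = calloc(1, sizeof(*rule));

    if (!rule)
        return -ENOMEM;

    if (parse_fails) {
        int err;

        rule->label_err = -EINVAL;
        err = (int)rule->label_err; /* copy first */
        free(rule);                 /* then release */
        return err;                 /* no access to freed memory */
    }

    *out = rule;
    return 0;
}
```

Writing `free(rule); return rule->label_err;` instead would read from memory that has already been released.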
2020-05-21apparmor: Fix aa_label refcnt leak in policy_updateXiyu Yang1-1/+2
policy_update() invokes begin_current_label_crit_section(), which returns a reference of the updated aa_label object to "label" with increased refcount. When policy_update() returns, "label" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of policy_update(). When aa_may_manage_policy() returns not NULL, the refcnt increased by begin_current_label_crit_section() is not decreased, causing a refcnt leak. Fix this issue by jumping to "end_section" label when aa_may_manage_policy() returns not NULL. Fixes: 5ac8c355ae00 ("apparmor: allow introspecting the loaded policy pre internal transform") Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Tan <[email protected]> Signed-off-by: John Johansen <[email protected]>
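The single-exit pattern described above can be sketched in userspace C. The names `struct label`, `label_get()`, `label_put()` and `policy_update_sketch()` are hypothetical, and the plain integer counter is a simplification of real reference counting.

```c
#include <errno.h>

struct label { int refcount; };

static struct label *label_get(struct label *l)
{
    l->refcount++;
    return l;
}

static void label_put(struct label *l)
{
    l->refcount--;
}

/* Every path after label_get() leaves through end_section, so the
 * reference taken at the top is always dropped. */
static int policy_update_sketch(struct label *current_label, int denied)
{
    struct label *label = label_get(current_label);
    int error = 0;

    if (denied) {
        error = -EACCES;
        goto end_section; /* an early "return error" here would leak */
    }

    /* ... the actual update would happen here ... */

end_section:
    label_put(label);
    return error;
}
```

The count is the same before and after the call on both the success and the failure path.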
2020-05-21apparmor: fix potential label refcnt leak in aa_change_profileXiyu Yang1-2/+1
aa_change_profile() invokes aa_get_current_label(), which returns a reference of the current task's label. According to the comment of aa_get_current_label(), the returned reference must be put with aa_put_label(). However, when the original object pointed by "label" becomes unreachable because aa_change_profile() returns or a new object is assigned to "label", reference count increased by aa_get_current_label() is not decreased, causing a refcnt leak. Fix this by calling aa_put_label() before aa_change_profile() return and dropping unnecessary aa_get_current_label(). Fixes: 9fcf78cca198 ("apparmor: update domain transitions that are subsets of confinement at nnp") Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Tan <[email protected]> Signed-off-by: John Johansen <[email protected]>
2020-05-21RISC-V: gp_in_global needs register keywordPalmer Dabbelt1-1/+1
The Intel kernel build robot recently pointed out that I missed the register keyword on this one when I refactored the code to remove local register variables (which aren't supported by LLVM). GCC's manual indicates that global register variables must have the register keyword. As far as I can tell, lacking the register keyword causes GCC to ignore the __asm__ and treat this as a regular variable, but I'm not sure how that didn't show up as some sort of failure. Fixes: 52e7c52d2ded ("RISC-V: Stop relying on GCC's register allocator's hueristics") Signed-off-by: Palmer Dabbelt <[email protected]>
2020-05-21Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-10/+9
Pull virtio fixes from Michael Tsirkin: "Fix a couple of build warnings" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: missing __user tags vdpasim: remove unused variable 'ret'
2020-05-21Merge tag 'dmaengine-fix-5.7-rc7' of ↵Linus Torvalds7-21/+40
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "Some driver fixes: - dmatest restoration of defaults - tegra210-adma probe handling fix - k3-udma flags fixed for slave_sg and memcpy - list fix for zynqmp_dma - idxd interrupt completion fix - lock fix for owl" * tag 'dmaengine-fix-5.7-rc7' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: tegra210-adma: Fix an error handling path in 'tegra_adma_probe()' dmaengine: ti: k3-udma: Fix TR mode flags for slave_sg and memcpy dmaengine: zynqmp_dma: Move list_del inside zynqmp_dma_free_descriptor. dmaengine: dmatest: Restore default for channel dmaengine: idxd: fix interrupt completion after unmasking dmaengine: owl: Use correct lock in owl_dma_get_pchan()
2020-05-21Merge tag 'fiemap-regression-fix' of ↵Linus Torvalds3-32/+34
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix regression in ext4's FIEMAP handling introduced in v5.7-rc1" * tag 'fiemap-regression-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix fiemap size checks for bitmap files ext4: fix EXT4_MAX_LOGICAL_BLOCK macro
2020-05-21null_blk: don't allow discard for zoned modeChaitanya Kulkarni1-0/+7
Zoned block device specifications do not define the behavior of the discard/trim command, as this command is generally replaced by the reset write pointer (zone reset) command. Emulate this in null_blk by making the zoned and discard options mutually exclusive. Suggested-by: Damien Le Moal <[email protected]> Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-05-21null_blk: return error for invalid zone sizeChaitanya Kulkarni1-0/+4
In null_init_zoned_dev(), check whether the zone size is larger than the device capacity and return an error if it is. This also fixes the following oops: null_blk: changed the number of conventional zones to 4294967295 BUG: kernel NULL pointer dereference, address: 0000000000000010 PGD 7d76c5067 P4D 7d76c5067 PUD 7d240c067 PMD 0 Oops: 0002 [#1] SMP NOPTI CPU: 4 PID: 5508 Comm: nullbtests.sh Tainted: G OE 5.7.0-rc4lblk-fnext0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e4 RIP: 0010:null_init_zoned_dev+0x17a/0x27f [null_blk] RSP: 0018:ffffc90007007e00 EFLAGS: 00010246 RAX: 0000000000000020 RBX: ffff8887fb3f3c00 RCX: 0000000000000007 RDX: 0000000000000000 RSI: ffff8887ca09d688 RDI: ffff888810fea510 RBP: 0000000000000010 R08: ffff8887ca09d688 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887c26e8000 R13: ffffffffa05e9390 R14: 0000000000000000 R15: 0000000000000001 FS: 00007fcb5256f740(0000) GS:ffff888810e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000081e8fe000 CR4: 00000000003406e0 Call Trace: null_add_dev+0x534/0x71b [null_blk] nullb_device_power_store.cold.41+0x8/0x2e [null_blk] configfs_write_file+0xe6/0x150 vfs_write+0xba/0x1e0 ksys_write+0x5f/0xe0 do_syscall_64+0x60/0x250 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x7fcb51c71840 Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
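As a rough illustration of the check, the sketch below validates the sizes before deriving a zone count. `zone_count()` is a hypothetical helper, not the null_blk code, and the megabyte units are an assumption made for the example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: reject a zone larger than the device before
 * computing how many zones fit. */
static int zone_count(uint64_t dev_size_mb, uint64_t zone_size_mb,
                      unsigned int *nr_zones)
{
    if (zone_size_mb == 0 || zone_size_mb > dev_size_mb)
        return -EINVAL;

    *nr_zones = (unsigned int)(dev_size_mb / zone_size_mb);
    return 0;
}
```

Without such a check, the division yields zero zones. The value 4294967295 in the log above is what a 32-bit unsigned counter holds after computing 0 - 1, which may be how the conventional zone count ended up at that number, although the commit message does not spell this out.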
2020-05-22powerpc/64s: Disable STRICT_KERNEL_RWXMichael Ellerman1-1/+1
Several strange crashes have been eventually traced back to STRICT_KERNEL_RWX and its interaction with code patching. Various paths in our ftrace, kprobes and other patching code need to be hardened against patching failures, otherwise we can end up running with partially/incorrectly patched ftrace paths, kprobes or jump labels, which can then cause strange crashes. Although fixes for those are in development, they're not -rc material. There also seem to be problems with the underlying strict RWX logic, which needs further debugging. So for now disable STRICT_KERNEL_RWX on 64-bit to prevent people from enabling the option and tripping over the bugs. Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs") Cc: [email protected] # v4.13+ Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-21kobject: Make sure the parent does not get released before its childrenHeikki Krogerus1-10/+20
In the function kobject_cleanup(), kobject_del(kobj) is called before the kobj->release(). That makes it possible to release the parent of the kobject before the kobject itself. To fix that, add the function __kobject_del() that does everything that kobject_del() does except release the parent reference. kobject_cleanup() then calls __kobject_del() instead of kobject_del(), and separately decrements the reference count of the parent kobject after kobj->release() has been called. Reported-by: Naresh Kamboju <[email protected]> Reported-by: kernel test robot <[email protected]> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"") Suggested-by: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Heikki Krogerus <[email protected]> Reviewed-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Tested-by: Brendan Higgins <[email protected]> Acked-by: Randy Dunlap <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
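The ordering described above (release the child first, drop the parent reference afterwards) can be modelled with a tiny userspace sketch. `struct obj`, `obj_put()` and `obj_cleanup()` are hypothetical and greatly simplified compared to kobjects; the `released` flag stands in for the release callback.

```c
#include <stddef.h>

struct obj {
    struct obj *parent;
    int refcount;
    int released;                /* set when the release step has run */
    int parent_alive_at_release; /* recorded for inspection only */
};

static void obj_put(struct obj *o);

/* Cleanup order: unlink without touching the parent reference,
 * release the object itself, and only then drop the parent. */
static void obj_cleanup(struct obj *o)
{
    struct obj *parent = o->parent;

    /* (an unlink step that keeps the parent reference would go here) */

    o->parent_alive_at_release = parent ? !parent->released : 1;
    o->released = 1; /* stands in for the release callback */

    obj_put(parent); /* the parent may be released from here on */
}

static void obj_put(struct obj *o)
{
    if (o && --o->refcount == 0)
        obj_cleanup(o);
}
```

If the parent reference were dropped before the child's release step, the parent could reach a zero count and be released while the child still exists.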
2020-05-21driver core: Fix handling of SYNC_STATE_ONLY + STATELESS device linksSaravana Kannan1-3/+5
Commit 21c27f06587d ("driver core: Fix SYNC_STATE_ONLY device link implementation") didn't completely fix STATELESS + SYNC_STATE_ONLY handling. What looks like an optimization in that commit is actually a bug that causes an if condition to always take the else path. This prevents reordering of devices in the dpm_list when a DL_FLAG_STATELESS device link is created on top of an existing DL_FLAG_SYNC_STATE_ONLY device link. Fixes: 21c27f06587d ("driver core: Fix SYNC_STATE_ONLY device link implementation") Signed-off-by: Saravana Kannan <[email protected]> Cc: stable <[email protected]> Reviewed-by: Rafael J. Wysocki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-05-20net: nlmsg_cancel() if put fails for nhmsgStephen Worley1-0/+1
Fixes data remnant seen when we fail to reserve space for a nexthop group during a larger dump. If we fail the reservation, we goto nla_put_failure and cancel the message. Reproduce with the following iproute2 commands: ===================== ip link add dummy1 type dummy ip link add dummy2 type dummy ip link add dummy3 type dummy ip link add dummy4 type dummy ip link add dummy5 type dummy ip link add dummy6 type dummy ip link add dummy7 type dummy ip link add dummy8 type dummy ip link add dummy9 type dummy ip link add dummy10 type dummy ip link add dummy11 type dummy ip link add dummy12 type dummy ip link add dummy13 type dummy ip link add dummy14 type dummy ip link add dummy15 type dummy ip link add dummy16 type dummy ip link add dummy17 type dummy ip link add dummy18 type dummy ip link add dummy19 type dummy ip link add dummy20 type dummy ip link add dummy21 type dummy ip link add dummy22 type dummy ip link add dummy23 type dummy ip link add dummy24 type dummy ip link add dummy25 type dummy ip link add dummy26 type dummy ip link add dummy27 type dummy ip link add dummy28 type dummy ip link add dummy29 type dummy ip link add dummy30 type dummy ip link add dummy31 type dummy ip link add dummy32 type dummy ip link set dummy1 up ip link set dummy2 up ip link set dummy3 up ip link set dummy4 up ip link set dummy5 up ip link set dummy6 up ip link set dummy7 up ip link set dummy8 up ip link set dummy9 up ip link set dummy10 up ip link set dummy11 up ip link set dummy12 up ip link set dummy13 up ip link set dummy14 up ip link set dummy15 up ip link set dummy16 up ip link set dummy17 up ip link set dummy18 up ip link set dummy19 up ip link set dummy20 up ip link set dummy21 up ip link set dummy22 up ip link set dummy23 up ip link set dummy24 up ip link set dummy25 up ip link set dummy26 up ip link set dummy27 up ip link set dummy28 up ip link set dummy29 up ip link set dummy30 up ip link set dummy31 up ip link set dummy32 up ip link set dummy33 up ip link set dummy34 up 
ip link set vrf-red up ip link set vrf-blue up ip link set dummyVRFred up ip link set dummyVRFblue up ip ro add 1.1.1.1/32 dev dummy1 ip ro add 1.1.1.2/32 dev dummy2 ip ro add 1.1.1.3/32 dev dummy3 ip ro add 1.1.1.4/32 dev dummy4 ip ro add 1.1.1.5/32 dev dummy5 ip ro add 1.1.1.6/32 dev dummy6 ip ro add 1.1.1.7/32 dev dummy7 ip ro add 1.1.1.8/32 dev dummy8 ip ro add 1.1.1.9/32 dev dummy9 ip ro add 1.1.1.10/32 dev dummy10 ip ro add 1.1.1.11/32 dev dummy11 ip ro add 1.1.1.12/32 dev dummy12 ip ro add 1.1.1.13/32 dev dummy13 ip ro add 1.1.1.14/32 dev dummy14 ip ro add 1.1.1.15/32 dev dummy15 ip ro add 1.1.1.16/32 dev dummy16 ip ro add 1.1.1.17/32 dev dummy17 ip ro add 1.1.1.18/32 dev dummy18 ip ro add 1.1.1.19/32 dev dummy19 ip ro add 1.1.1.20/32 dev dummy20 ip ro add 1.1.1.21/32 dev dummy21 ip ro add 1.1.1.22/32 dev dummy22 ip ro add 1.1.1.23/32 dev dummy23 ip ro add 1.1.1.24/32 dev dummy24 ip ro add 1.1.1.25/32 dev dummy25 ip ro add 1.1.1.26/32 dev dummy26 ip ro add 1.1.1.27/32 dev dummy27 ip ro add 1.1.1.28/32 dev dummy28 ip ro add 1.1.1.29/32 dev dummy29 ip ro add 1.1.1.30/32 dev dummy30 ip ro add 1.1.1.31/32 dev dummy31 ip ro add 1.1.1.32/32 dev dummy32 ip next add id 1 via 1.1.1.1 dev dummy1 ip next add id 2 via 1.1.1.2 dev dummy2 ip next add id 3 via 1.1.1.3 dev dummy3 ip next add id 4 via 1.1.1.4 dev dummy4 ip next add id 5 via 1.1.1.5 dev dummy5 ip next add id 6 via 1.1.1.6 dev dummy6 ip next add id 7 via 1.1.1.7 dev dummy7 ip next add id 8 via 1.1.1.8 dev dummy8 ip next add id 9 via 1.1.1.9 dev dummy9 ip next add id 10 via 1.1.1.10 dev dummy10 ip next add id 11 via 1.1.1.11 dev dummy11 ip next add id 12 via 1.1.1.12 dev dummy12 ip next add id 13 via 1.1.1.13 dev dummy13 ip next add id 14 via 1.1.1.14 dev dummy14 ip next add id 15 via 1.1.1.15 dev dummy15 ip next add id 16 via 1.1.1.16 dev dummy16 ip next add id 17 via 1.1.1.17 dev dummy17 ip next add id 18 via 1.1.1.18 dev dummy18 ip next add id 19 via 1.1.1.19 dev dummy19 ip next add id 20 via 1.1.1.20 dev 
dummy20 ip next add id 21 via 1.1.1.21 dev dummy21 ip next add id 22 via 1.1.1.22 dev dummy22 ip next add id 23 via 1.1.1.23 dev dummy23 ip next add id 24 via 1.1.1.24 dev dummy24 ip next add id 25 via 1.1.1.25 dev dummy25 ip next add id 26 via 1.1.1.26 dev dummy26 ip next add id 27 via 1.1.1.27 dev dummy27 ip next add id 28 via 1.1.1.28 dev dummy28 ip next add id 29 via 1.1.1.29 dev dummy29 ip next add id 30 via 1.1.1.30 dev dummy30 ip next add id 31 via 1.1.1.31 dev dummy31 ip next add id 32 via 1.1.1.32 dev dummy32 i=100 while [ $i -le 200 ] do ip next add id $i group 1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19 echo $i ((i++)) done ip next add id 999 group 1/2/3/4/5/6 ip next ls ======================== Fixes: ab84be7e54fc ("net: Initial nexthop code") Signed-off-by: Stephen Worley <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>