blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2021-02-17	sched/features: Fix hrtick reprogramming	Juri Lelli	2	-5/+4
	Hung tasks and RCU stall cases were reported on systems which were not 100% busy. Investigation of such unexpected cases (no sign of potential starvation caused by tasks hogging the system) pointed out that the periodic sched tick timer wasn't serviced anymore after a certain point and that caused all machinery that depends on it (timers, RCU, etc.) to stop working as well. This issue was however only reproducible if HRTICK was enabled. Looking at core dumps it was found that the rbtree of the hrtimer base used also for the hrtick was corrupted (i.e. next as seen from the base root and actual leftmost obtained by traversing the tree are different). Same base is also used for periodic tick hrtimer, which might get "lost" if the rbtree gets corrupted. Much like what is described in commit 1f71addd34f4c ("tick/sched: Do not mess with an enqueued hrtimer") there is a race window between hrtimer_set_expires() in hrtick_start and hrtimer_start_expires() in __hrtick_restart() in which the former might be operating on an already queued hrtick hrtimer, which might lead to corruption of the base. Use hrtimer_start() (which removes the timer before enqueuing it back) to ensure hrtick hrtimer reprogramming is entirely guarded by the base lock, so that no race conditions can occur. Signed-off-by: Juri Lelli <[email protected]> Signed-off-by: Luis Claudio R. Goncalves <[email protected]> Signed-off-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()	Dietmar Eggemann	1	-4/+7
	dl_add_task_root_domain() is called during sched domain rebuild: rebuild_sched_domains_locked() partition_and_rebuild_sched_domains() rebuild_root_domains() for all top_cpuset descendants: update_tasks_root_domain() for all tasks of cpuset: dl_add_task_root_domain() Change it so that only the task pi lock is taken to check if the task has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the rq lock as well to be able to safely de-reference root domain's DL bandwidth structure. Most of the tasks will have another policy (namely SCHED_NORMAL) and can now bail without taking the rq lock. One thing to note here: Even in case that there aren't any DL user tasks, a slow frequency switching system with cpufreq gov schedutil has a DL task (sugov) per frequency domain running which participates in DL bandwidth management. Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Quentin Perret <[email protected]> Reviewed-by: Valentin Schneider <[email protected]> Reviewed-by: Daniel Bristot de Oliveira <[email protected]> Acked-by: Juri Lelli <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	uprobes: (Re)add missing get_uprobe() in __find_uprobe()	Sven Schnelle	1	-1/+1
	commit c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers") accidentally removed the refcount increase. Add it again. Fixes: c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers") Signed-off-by: Sven Schnelle <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	smp: Process pending softirqs in flush_smp_call_function_from_idle()	Sebastian Andrzej Siewior	1	-0/+4
	send_call_function_single_ipi() may wake an idle CPU without sending an IPI. The woken up CPU will process the SMP-functions in flush_smp_call_function_from_idle(). Any raised softirq from within the SMP-function call will not be processed. Should the CPU have no tasks assigned, then it will go back to idle with pending softirqs and the NOHZ will rightfully complain. Process pending softirqs on return from flush_smp_call_function_queue(). Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()") Reported-by: Jens Axboe <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	sched: Harden PREEMPT_DYNAMIC	Peter Zijlstra	4	-8/+8
	Use the new EXPORT_STATIC_CALL_TRAMP() / static_call_mod() to unexport the static_call_key for the PREEMPT_DYNAMIC calls such that modules can no longer update these calls. Having modules change/hi-jack the preemption calls would be horrible. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2021-02-17	static_call: Allow module use without exposing static_call_key	Josh Poimboeuf	7	-11/+149
	When exporting static_call_key with EXPORT_STATIC_CALL*(), the module can use static_call_update() to change the function called. This is not desirable in general. Not exporting static_call_key however also disallows usage of static_call(), since objtool needs the key to construct the static_call_site. Solve this by allowing objtool to create the static_call_site using the trampoline address when it builds a module and cannot find the static_call_key symbol. The module loader will then try and map the trampoline back to a key before it constructs the normal sites list. Doing this requires a trampoline -> key association, so add another magic section that keeps those. Originally-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
2021-02-17	sched: Add /debug/sched_preempt	Peter Zijlstra	1	-9/+126
	Add a debugfs file to muck about with the preempt mode at runtime. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	preempt/dynamic: Support dynamic preempt with preempt= boot option	Peter Zijlstra (Intel)	1	-1/+67
	Support the preempt= boot option and patch the static call sites accordingly. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	preempt/dynamic: Provide irqentry_exit_cond_resched() static call	Peter Zijlstra (Intel)	2	-1/+13
	Provide static call to control IRQ preemption (called in CONFIG_PREEMPT) so that we can override its behaviour when preempt= is overridden. Since the default behaviour is full preemption, its call is initialized to provide IRQ preemption when preempt= isn't passed. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	preempt/dynamic: Provide preempt_schedule[_notrace]() static calls	Peter Zijlstra (Intel)	2	-8/+38
	Provide static calls to control preempt_schedule[_notrace]() (called in CONFIG_PREEMPT) so that we can override their behaviour when preempt= is overridden. Since the default behaviour is full preemption, both their calls are initialized to the arch provided wrapper, if any. [fweisbec: only define static calls when PREEMPT_DYNAMIC, make it less dependent on x86 with __preempt_schedule_func] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	preempt/dynamic: Provide cond_resched() and might_resched() static calls	Peter Zijlstra (Intel)	3	-10/+56
	Provide static calls to control cond_resched() (called in !CONFIG_PREEMPT) and might_resched() (called in CONFIG_PREEMPT_VOLUNTARY) so that we can override their behaviour when preempt= is overridden. Since the default behaviour is full preemption, both their calls are ignored when preempt= isn't passed. [fweisbec: branch might_resched() directly to __cond_resched(), only define static calls when PREEMPT_DYNAMIC] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	preempt: Introduce CONFIG_PREEMPT_DYNAMIC	Michal Hocko	4	-0/+36
	Preemption mode selection is currently hardcoded on Kconfig choices. Introduce a dedicated option to tune preemption flavour at boot time. This will only be available on architectures efficiently supporting static calls, so that the feature does not come with additional overhead that might be prohibitive or undesirable. CONFIG_PREEMPT_DYNAMIC is automatically selected by CONFIG_PREEMPT if the architecture provides the necessary support (CONFIG_STATIC_CALL_INLINE, CONFIG_GENERIC_ENTRY, and provides __preempt_schedule_function() / __preempt_schedule_notrace_function()). Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Michal Hocko <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> [peterz: relax requirement to HAVE_STATIC_CALL] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	static_call: Provide DEFINE_STATIC_CALL_RET0()	Frederic Weisbecker	1	-8/+14
	DECLARE_STATIC_CALL() must pass the original function targeted for a given static call. But DEFINE_STATIC_CALL() may want to initialize it as off. In this case we can't pass NULL (for functions without return value) or __static_call_return0 (for functions returning a value) directly to DEFINE_STATIC_CALL() as that may trigger a static call redeclaration with a different function prototype. Type casts cannot work around that either, as they don't get along with typeof(). The proper way to do that for functions that don't return a value is to use DEFINE_STATIC_CALL_NULL(). But functions returning an actual value don't have an equivalent yet. Provide DEFINE_STATIC_CALL_RET0() to solve this situation. Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	static_call/x86: Add __static_call_return0()	Peter Zijlstra	3	-2/+32
	Provide a stub function that returns 0 and wire up the static call site patching to replace the CALL with a single 5 byte instruction that clears %RAX, the return value register. The function can be cast to any function pointer type that has a single %RAX return (including pointers). Also provide a version that returns an int for convenience. We are clearing the entire %RAX register in any case, whether the return value is 32 or 64 bits, since %RAX is always a scratch register anyway. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	static_call: Pull some static_call declarations to the type headers	Peter Zijlstra	3	-21/+54
	Some static call declarations are going to be needed on low level header files. Move the necessary material to the dedicated static call types header to avoid inclusion dependency hell. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	sched/core: Update task_prio() function header	Dietmar Eggemann	1	-2/+6
	The description of the RT offset and the values for 'normal' tasks needs update. Moreover there are DL tasks now. task_prio() has to stay like it is to guarantee compatibility with the /proc/<pid>/stat priority field: # cat /proc/<pid>/stat \| awk '{ print $18; }' Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	sched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO	Dietmar Eggemann	3	-11/+2
	The only remaining use of MAX_USER_PRIO (and USER_PRIO) is the SCALE_PRIO() definition in the PowerPC Cell architecture's Synergistic Processor Unit (SPU) scheduler. TASK_USER_PRIO isn't used anymore. Commit fe443ef2ac42 ("[POWERPC] spusched: Dynamic timeslicing for SCHED_OTHER") copied SCALE_PRIO() from the task scheduler in v2.6.23. Commit a4ec24b48dde ("sched: tidy up SCHED_RR") removed it from the task scheduler in v2.6.24. Commit 3ee237dddcd8 ("sched/prio: Add 3 macros of MAX_NICE, MIN_NICE and NICE_WIDTH in prio.h") introduced NICE_WIDTH much later. With: MAX_USER_PRIO = USER_PRIO(MAX_PRIO) = MAX_PRIO - MAX_RT_PRIO MAX_PRIO = MAX_RT_PRIO + NICE_WIDTH MAX_USER_PRIO = MAX_RT_PRIO + NICE_WIDTH - MAX_RT_PRIO MAX_USER_PRIO = NICE_WIDTH MAX_USER_PRIO can be replaced by NICE_WIDTH to be able to remove all the {*_}USER_PRIO defines. Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
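The substitution above can be checked at compile time. The following is a minimal sketch, assuming the priority constants have the values they are commonly given in include/linux/sched/prio.h (MAX_RT_PRIO of 100, nice range -20..19); those numeric values are assumptions recalled from the kernel headers, not taken from this log.

```c
#include <assert.h>

/* Assumed values, as commonly found in include/linux/sched/prio.h. */
#define MAX_NICE	19
#define MIN_NICE	-20
#define NICE_WIDTH	(MAX_NICE - MIN_NICE + 1)
#define MAX_RT_PRIO	100
#define MAX_PRIO	(MAX_RT_PRIO + NICE_WIDTH)

/* The macros the commit removes, reproduced only to verify the identity. */
#define USER_PRIO(p)	((p) - MAX_RT_PRIO)
#define MAX_USER_PRIO	(USER_PRIO(MAX_PRIO))

/* Fails the build if the two expressions ever diverge. */
static_assert(MAX_USER_PRIO == NICE_WIDTH,
	      "MAX_USER_PRIO must reduce to NICE_WIDTH");
```

Because MAX_RT_PRIO cancels out, the identity holds for any value of MAX_RT_PRIO, which is why NICE_WIDTH is a safe drop-in replacement.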
2021-02-17	sched: Remove MAX_USER_RT_PRIO	Dietmar Eggemann	2	-12/+4
	Commit d46523ea32a7 ("[PATCH] fix MAX_USER_RT_PRIO and MAX_RT_PRIO") was introduced due to a small time period in which the realtime patch set was using different values for MAX_USER_RT_PRIO and MAX_RT_PRIO. This is no longer true, i.e. now MAX_RT_PRIO == MAX_USER_RT_PRIO. Get rid of MAX_USER_RT_PRIO and make everything use MAX_RT_PRIO instead. Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	sched/topology: Fix sched_domain_topology_level alloc in sched_init_numa()	Dietmar Eggemann	1	-1/+1
	Commit "sched/topology: Make sched_init_numa() use a set for the deduplicating sort" allocates 'i + nr_levels (level)' instead of 'i + nr_levels + 1' sched_domain_topology_level. This led to an Oops (on Arm64 juno with CONFIG_SCHED_DEBUG): sched_init_domains build_sched_domains() __free_domain_allocs() __sdt_free() { ... for_each_sd_topology(tl) ... sd = *per_cpu_ptr(sdd->sd, j); <-- ... } Signed-off-by: Dietmar Eggemann <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Tested-by: Vincent Guittot <[email protected]> Tested-by: Barry Song <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
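The off-by-one matters because the topology table is walked until a terminating entry is found, so the allocation needs one extra zeroed slot. A minimal userspace sketch of that pattern follows; struct topo_level, alloc_levels() and count_levels() are hypothetical stand-ins, and the assumption that for_each_sd_topology() stops on a NULL field of the last entry is mine, not stated in the log.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_domain_topology_level. */
struct topo_level {
	const char *name;	/* NULL marks the end of the table */
};

/* Walk until the terminator, in the style of for_each_sd_topology(). */
static int count_levels(const struct topo_level *tl)
{
	int n = 0;

	for (; tl->name; tl++)
		n++;
	return n;
}

/* Allocate nr usable entries plus one zeroed terminator entry. */
static struct topo_level *alloc_levels(int nr)
{
	struct topo_level *tl = calloc((size_t)nr + 1, sizeof(*tl));
	int i;

	if (!tl)
		return NULL;
	for (i = 0; i < nr; i++)
		tl[i].name = "level";
	/* tl[nr] stays zeroed, so the walk above stops there. */
	return tl;
}
```

Allocating only nr entries would leave the walk reading past the end of the buffer, which is the kind of access the Oops above points at.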
2021-02-17	rbtree, timerqueue: Use rb_add_cached()	Peter Zijlstra	1	-19/+9
	Reduce rbtree boiler plate by using the new helpers. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
2021-02-17	rbtree, rtmutex: Use rb_add_cached()	Peter Zijlstra	1	-36/+18
	Reduce rbtree boiler plate by using the new helpers. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
2021-02-17	rbtree, uprobes: Use rbtree helpers	Peter Zijlstra	1	-41/+39
	Reduce rbtree boilerplate by using the new helpers. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
2021-02-17	rbtree, perf: Use new rbtree helpers	Peter Zijlstra	2	-107/+92
	Reduce rbtree boiler plate by using the new helpers. One noteworthy change is unification of the various (partial) compare functions. We construct a subtree match by forcing the sub-order to always match, see __group_cmp(). Due to 'const' we had to touch cgroup_id(). Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
2021-02-17	rbtree, sched/deadline: Use rb_add_cached()	Peter Zijlstra	2	-53/+42
	Reduce rbtree boiler plate by using the new helpers. Make rb_add_cached() / rb_erase_cached() return a pointer to the leftmost node to aid in updating additional state. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
2021-02-17	rbtree, sched/fair: Use rb_add_cached()	Peter Zijlstra	1	-32/+14
	Reduce rbtree boiler plate by using the new helper function. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
2021-02-17	rbtree: Add generic add and find helpers	Peter Zijlstra	3	-63/+392
	I've always been bothered by the endless (fragile) boilerplate for rbtree, and I recently wrote some rbtree helpers for objtool and figured I should lift them into the kernel and use them more widely. Provide: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two Inlining and constant propagation should see the compiler inline the whole thing, including the various compare functions. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Michel Lespinasse <[email protected]> Acked-by: Davidlohr Bueso <[email protected]>
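To illustrate the less()-based insertion style, here is a userspace sketch of a cached-leftmost insert. It uses a plain unbalanced binary search tree rather than a red-black tree, and struct node, struct root_cached, key_less() and tree_add_cached() are hypothetical names, not the kernel API. It returns the node when it became the new leftmost entry, mirroring the return convention that the sched/deadline conversion in this log describes for rb_add_cached().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type; the kernel embeds struct rb_node instead. */
struct node {
	struct node *left, *right;
	int key;
};

/* Root plus a cached pointer to the leftmost (smallest) node. */
struct root_cached {
	struct node *root;
	struct node *leftmost;
};

static bool key_less(const struct node *a, const struct node *b)
{
	return a->key < b->key;
}

/*
 * Insert n using only a less() callback. No rebalancing is done here;
 * a real red-black tree would recolor and rotate after linking.
 * Returns n if it became the leftmost node, NULL otherwise.
 */
static struct node *tree_add_cached(struct node *n, struct root_cached *tree,
				    bool (*less)(const struct node *,
						 const struct node *))
{
	struct node **link = &tree->root;
	bool leftmost = true;

	n->left = n->right = NULL;
	while (*link) {
		if (less(n, *link)) {
			link = &(*link)->left;
		} else {
			link = &(*link)->right;
			leftmost = false;	/* went right at least once */
		}
	}
	*link = n;
	if (leftmost)
		tree->leftmost = n;
	return leftmost ? n : NULL;
}
```

The caller supplies only the ordering predicate, so the descent loop that each subsystem used to open-code lives in one place.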
2021-02-17	sched/fair: Merge select_idle_core/cpu()	Mel Gorman	1	-40/+59
	Both select_idle_core() and select_idle_cpu() do a loop over the same cpumask. Observe that by clearing the already visited CPUs, we can fold the iteration and iterate a core at a time. All we need to do is remember any non-idle CPU we encountered while scanning for an idle core. This way we'll only iterate every CPU once. Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
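The folded iteration can be sketched in userspace with bitmasks. This is an illustration under stated assumptions, not the scheduler's code: the topology (CPUs 2k and 2k+1 share a core), NR_CPUS, core_mask() and select_idle() are all hypothetical, and the fallback is read here as an idle CPU found on a core that is not fully idle.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical topology: CPUs 2k and 2k+1 are SMT siblings. */
static uint32_t core_mask(int cpu)
{
	return 3u << (cpu & ~1);
}

/*
 * Single pass over the candidate mask, one core at a time.
 * Returns a CPU on a fully idle core if one exists, otherwise the
 * first idle candidate CPU seen along the way, otherwise -1.
 */
static int select_idle(uint32_t candidates, uint32_t idle)
{
	int fallback = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		uint32_t core = core_mask(cpu);

		if (!(candidates & (1u << cpu)))
			continue;	/* not allowed, or already visited */
		if ((idle & core) == core)
			return cpu;	/* every sibling is idle */

		/* Remember an idle sibling in case no idle core turns up. */
		for (int s = cpu & ~1; s <= (cpu | 1); s++) {
			uint32_t sbit = 1u << s;

			if (fallback < 0 && (idle & sbit) && (candidates & sbit))
				fallback = s;
		}
		candidates &= ~core;	/* never revisit these siblings */
	}
	return fallback;
}
```

Clearing each visited core from the mask is what guarantees that every CPU is examined at most once.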
2021-02-17	sched/fair: Remove select_idle_smt()	Mel Gorman	1	-30/+0
	In order to make the next patch more readable, and to quantify the actual effectiveness of this pass, start by removing it. Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-17	Merge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch	Ingo Molnar	1642	-9628/+19353
	Signed-off-by: Ingo Molnar <[email protected]>
2021-02-17	Merge branch 'perf/kprobes' into perf/core, to pick up finished branch	Ingo Molnar	2	-98/+81
	Signed-off-by: Ingo Molnar <[email protected]>
2021-02-16	net: re-solve some conflicts after net -> net-next merge	Jakub Kicinski	5	-37/+11
	Signed-off-by: Jakub Kicinski <[email protected]>
2021-02-16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	David S. Miller	56	-336/+724

2021-02-16	net: dsa: tag_rtl4_a: Support also egress tags	Linus Walleij	1	-14/+29
	Support also transmitting frames using the custom "8899 A" 4 byte tag. Qingfang came up with the solution: we need to pad the ethernet frame to 60 bytes using eth_skb_pad(), then the switch will happily accept frames with custom tags. Cc: Mauri Sandberg <[email protected]> Reported-by: DENG Qingfang <[email protected]> Fixes: efd7fe68f0c6 ("net: dsa: tag_rtl4_a: Implement Realtek 4 byte A tag") Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	Merge branch 'broadcom-next'	David S. Miller	4	-21/+104
	Robert Hancock says: ==================== Broadcom PHY driver updates Updates to the Broadcom PHY driver related to use with copper SFP modules. Changed since v3: -fixed kerneldoc error Changed since v2: -Create flag for PHY on SFP module and use that rather than accessing attached_dev directly in PHY driver Changed since v1: -Reversed conditional to reduce indentation -Added missing setting of MII_BCM54XX_AUXCTL_MISC_WREN in MII_BCM54XX_AUXCTL_SHDWSEL_MISC register ==================== Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: phy: broadcom: Do not modify LED configuration for SFP module PHYs	Robert Hancock	1	-9/+15
	bcm54xx_config_init was modifying the PHY LED configuration to enable link and activity indications. However, some SFP modules (such as Bel-Fuse SFP-1GBT-06) have no LEDs but use the LED outputs to control the SFP LOS signal, and modifying the LED settings will cause the LOS output to malfunction. Skip this configuration for PHYs which are bound to an SFP bus. Signed-off-by: Robert Hancock <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: phy: Add is_on_sfp_module flag and phy_on_sfp helper	Robert Hancock	2	-0/+13
	Add a flag and helper function to indicate that a PHY device is part of an SFP module, which is set on attach. This can be used by PHY drivers to handle SFP-specific quirks or behavior. Signed-off-by: Robert Hancock <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616S	Robert Hancock	2	-12/+76
	The default configuration for the BCM54616S PHY may not match the desired mode when using 1000BaseX or SGMII interface modes, such as when it is on an SFP module. Add code to explicitly set the correct mode using programming sequences provided by Bel-Fuse: https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-05-series.pdf https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-06-series.pdf Signed-off-by: Robert Hancock <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	lan743x: sync only the received area of an rx ring buffer	Sven Van Asbroeck	1	-9/+26
	On cpu architectures w/o dma cache snooping, dma_unmap() is a very expensive operation, because its resulting sync needs to invalidate cpu caches. Increase efficiency/performance by syncing only those sections of the lan743x's rx ring buffers that are actually in use. Signed-off-by: Sven Van Asbroeck <[email protected]> Reviewed-by: Bryan Whitehead <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	lan743x: boost performance on cpu archs w/o dma cache snooping	Sven Van Asbroeck	2	-181/+148
	The buffers in the lan743x driver's receive ring are always 9K, even when the largest packet that can be received (the mtu) is much smaller. This performs particularly badly on cpu archs without dma cache snooping (such as ARM): each received packet results in a 9K dma_{map\|unmap} operation, which is very expensive because cpu caches need to be invalidated. Careful measurement of the driver rx path on armv7 reveals that the cpu spends the majority of its time waiting for cache invalidation. Optimize by keeping the rx ring buffer size as close as possible to the mtu. This limits the amount of cache that requires invalidation. This optimization would normally force us to re-allocate all ring buffers when the mtu is changed - a disruptive event, because it can only happen when the network interface is down. Remove the need to re-allocate all ring buffers by adding support for multi-buffer frames. Now any combination of mtu and ring buffer size will work. When the mtu changes from mtu1 to mtu2, consumed buffers of size mtu1 are lazily replaced by newly allocated buffers of size mtu2. These optimizations double the rx performance on armv7. Third parties report 3x rx speedup on armv8. Tested with iperf3 on a freescale imx6qp + lan7430, both sides set to mtu 1500 bytes, measure rx performance: Before: [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-20.00 sec 550 MBytes 231 Mbits/sec 0 After: [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-20.00 sec 1.33 GBytes 570 Mbits/sec 0 Signed-off-by: Sven Van Asbroeck <[email protected]> Reviewed-by: Bryan Whitehead <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: enetc: fix destroyed phylink dereference during unbind	Vladimir Oltean	1	-2/+3
	The following call path suggests that calling unregister_netdev on an interface that is up will first bring it down. enetc_pf_remove -> unregister_netdev -> unregister_netdevice_queue -> unregister_netdevice_many -> dev_close_many -> __dev_close_many -> enetc_close -> enetc_stop -> phylink_stop However, enetc first destroys the phylink instance, then calls unregister_netdev. This is already dissimilar to the setup (and error path teardown path) from enetc_pf_probe, but more than that, it is buggy because it is invalid to call phylink_stop after phylink_destroy. So let's first unregister the netdev (and let the .ndo_stop events consume themselves), then destroy the phylink instance, then free the netdev. Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	Merge branch 'net-mvneta-implement-basic-MQPrio-support'	David S. Miller	1	-1/+69
	Maxime Chevallier says: ==================== net: mvneta: implement basic MQPrio support This is V2 for the MQPrio support in mvneta. This small series adds basic support for mqprio offloading, by having the rx queueing mirroring the TCs based on VLAN prio fields. This was tested on Armada 3700, and proves useful to make sure high-priority traffic has a better chance not getting dropped when there's lots of packets incoming. The first patch of the series deals with the per-cpu interrupts on the armada 3700. Since they don't work, there were already some patches applied to keep all queue mappings to CPU0, but there still were some remaining mappings left to be dealt with. The second patch implements the MQPrio offloading for the receive path. Changes in V2 : - Add a Fixes tag for the first patch - Fix some warnings and the xmas tree in the second patch ==================== Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: mvneta: Implement mqprio support	Maxime Chevallier	1	-0/+61
	Implement a basic MQPrio support, inserting rules in RX that translate the TC to prio mapping into vlan prio to queues. The TX logic stays the same as when we don't offload the qdisc. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: mvneta: Remove per-cpu queue mapping for Armada 3700	Maxime Chevallier	1	-1/+8
	According to Errata #23 "The per-CPU GbE interrupt is limited to Core 0", we can't use the per-cpu interrupt mechanism on the Armada 3700 family. This is correctly checked for RSS configuration, but the initial queue mapping is still done by having the queues spread across all the CPUs in the system, both in the init path and in the cpu_hotplug path. Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC") Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: sched: fix police ext initialization	Vlad Buslov	3	-1/+3
	When a police action is created by the first conditional in the cls API's tcf_exts_validate(), which calls tcf_action_init_1() directly, the action idr is not updated according to latest changes in action API that require caller to commit newly created action to idr with tcf_idr_insert_many(). This results in such an action not being accessible through act API and causes crash reported by syzbot: ================================================================== BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline] BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline] BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204 CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 ================================================================== Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G B 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 panic+0x306/0x73d kernel/panic.c:231 end_report+0x58/0x5e mm/kasan/report.c:100 __kasan_report mm/kasan/report.c:403 [inline] kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Kernel Offset: disabled Fix the issue by calling tcf_idr_insert_many() after successful action initialization. Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together") Reported-by: [email protected] Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	Merge branch 'mlx5-next' of ↵	David S. Miller	11	-129/+501
	git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== pull-request: mlx5-next 2021-02-16 The patches in this pr are already submitted and reviewed through the netdev and rdma mailing lists. The series includes mlx5 HW bits and definitions for mlx5 real time clock translation and handling in the mlx5 driver clock module to enable and support such mode [1] [1] https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ ==================== Signed-off-by: David S. Miller <[email protected]>
2021-02-16	drivers: net: xilinx_emaclite: remove arch limitation	Gary Guo	2	-3/+2
	The changes made in eccd540 are enough for xilinx_emaclite to run without problems on 64-bit systems. I have tested it on a Xilinx FPGA with an RV64 softcore. The architecture limitation in Kconfig no longer seems necessary. A small change is included to print the address with %lx instead of casting it to int and printing it with %x. Signed-off-by: Gary Guo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
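The effect of the format change described above can be illustrated with a small userspace sketch. This is not the driver's code: the helper names `format_addr_truncated` and `format_addr_full` are hypothetical, and the sketch uses `snprintf` rather than the kernel's logging helpers. It only shows why narrowing an address to a 32-bit integer before printing loses information on a 64-bit system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: mimics the old style, where the address is
 * narrowed to a 32-bit integer and printed with %x. On a system with
 * 64-bit unsigned long the upper 32 bits are silently dropped.
 */
int format_addr_truncated(char *buf, size_t len, unsigned long addr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%08x", (unsigned int)addr);
}

/*
 * Hypothetical helper: mimics the new style, where the address is
 * printed as unsigned long with %lx, so no bits are lost.
 */
int format_addr_full(char *buf, size_t len, unsigned long addr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%08lx", addr);
}
```

For an address that fits in 32 bits both helpers produce the same string. The outputs only differ once the address has bits set above bit 31, which is the case the 64-bit port has to handle.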
2021-02-16	Merge branch 'bridge-mrp-Extend-br_mrp_switchdev_'	David S. Miller	17	-104/+715
	Horatiu Vultur says:
	====================
	bridge: mrp: Extend br_mrp_switchdev_*
	This patch series extends MRP switchdev to give the SW a better understanding of whether the HW can implement the MRP functionality or whether the SW needs to help the HW run it. There are 3 cases:
	- when the HW can't implement the functionality at all.
	- when the HW can implement part of the functionality but needs the SW to implement the rest. For example, it can't detect when it stops receiving MRP Test frames, but it can copy the MRP frames to the CPU to allow the SW to determine this. Another example is generating the MRP Test frames: if the HW can't do that, the SW is used as backup.
	- when the HW can implement the functionality completely.
	So, initially the SW tries to offload the entire functionality to the HW. If that fails, it tries to offload parts of the functionality to the HW and uses the SW as a helper. If this also fails, MRP can't run on this HW.
	Based on these new calls, implement the switchdev for the Ocelot driver. This is an example where the HW can't run the functionality completely but can help the SW run it, by trapping all MRP frames to the CPU.
	This patch series also adds MRP support to DSA and implements it for the Felix driver, which just reuses the Ocelot functions. This part was only compile-tested because I don't have any HW on which to do the actual tests.
	v4:
	- remove ifdef MRP from include/net/switchdev.h
	- move MRP implementation for Ocelot to a different file such that the Felix driver can use it.
	- extend DSA with MRP support
	- implement MRP support for Felix.
	v3:
	- implement the switchdev calls needed by the Ocelot driver.
	v2:
	- fix typos in comments and in commit messages
	- remove some of the comments
	- move repeated code into a helper function
	- fix issue when deleting a node when sw_backup was true
	====================
	Signed-off-by: David S. Miller <[email protected]>
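The three-step fallback described in the cover letter (full HW offload, then HW with SW help, then no MRP) can be sketched as follows. This is a minimal illustration and not the bridge code from the series: the enum, the callback type, `mrp_pick_mode()` and the `fake_hw` test double are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of the offload negotiation (illustrative names). */
enum mrp_offload_mode {
	MRP_MODE_NONE,    /* HW cannot help at all: MRP cannot run here */
	MRP_MODE_SW_HELP, /* HW does part of the work, SW covers the rest */
	MRP_MODE_HW       /* HW implements the functionality completely */
};

/*
 * Hypothetical driver callback. sw_backup == false asks for a full
 * offload; sw_backup == true asks for a partial offload with the SW
 * as helper. Returns 0 on success, a negative value on failure.
 */
typedef int (*mrp_offload_fn)(void *ctx, bool sw_backup);

/* Try the full offload first, then the partial one, then give up. */
enum mrp_offload_mode mrp_pick_mode(mrp_offload_fn offload, void *ctx)
{
	assert(offload != NULL);

	if (offload(ctx, false) == 0)
		return MRP_MODE_HW;
	if (offload(ctx, true) == 0)
		return MRP_MODE_SW_HELP;
	return MRP_MODE_NONE;
}

/* Hypothetical stand-in for a switch, used only to exercise the sketch. */
struct fake_hw {
	bool can_do_full;
	bool can_do_partial;
};

int fake_offload(void *ctx, bool sw_backup)
{
	const struct fake_hw *hw = ctx;

	if (sw_backup)
		return hw->can_do_partial ? 0 : -1;
	return hw->can_do_full ? 0 : -1;
}
```

A device like the one described for Ocelot, which can trap MRP frames to the CPU but cannot run the protocol on its own, would correspond to `can_do_full = false` and `can_do_partial = true`, so `mrp_pick_mode()` would return `MRP_MODE_SW_HELP`.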
2021-02-16	net: dsa: felix: Add support for MRP	Horatiu Vultur	2	-0/+46
	Implement the functions 'port_mrp_add', 'port_mrp_del', 'port_mrp_add_ring_role' and 'port_mrp_del_ring_role' to call the MRP functions from Ocelot. Also, all MRP frames that arrive at the CPU on queue number OCELOT_MRP_CPUQ will be forwarded by the SW. Signed-off-by: Horatiu Vultur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: dsa: add MRP support	Horatiu Vultur	5	-0/+213
	Add support for offloading MRP in HW. Currently this implements the switchdev calls 'SWITCHDEV_OBJ_ID_MRP' and 'SWITCHDEV_OBJ_ID_RING_ROLE_MRP', which allow creating MRP instances and setting the role of these instances. Add DSA_NOTIFIER_MRP_ADD/DEL and DSA_NOTIFIER_MRP_ADD/DEL_RING_ROLE, which call .port_mrp_add/del and .port_mrp_add/del_ring_role in the DSA driver for the switch. Signed-off-by: Horatiu Vultur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16	net: mscc: ocelot: Add support for MRP	Horatiu Vultur	6	-1/+295
	Add basic support for MRP. The HW will just trap all MRP frames on the ring ports to the CPU and allow the SW to process them. In this way it is possible for this node to behave as both MRM and MRC.
	Current limitations are:
	- it doesn't support Interconnect roles.
	- it supports only a single ring.
	- the HW should be able to forward MRP Test frames so that the SW does not need to do this. It would then be able to have the MRC role without SW support.
	Signed-off-by: Horatiu Vultur <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>