blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-10-24	Merge tag 'net-6.12-rc5' of ↵	Linus Torvalds	1	-3/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
	Pull networking fixes from Paolo Abeni:
	"Including fixes from netfilter, xfrm and bluetooth.
	Oddly this includes a fix for a posix clock regression; in our previous PR we included a change there as a pre-requisite for networking one. That fix proved to be buggy and requires the follow-up included here. Thomas suggested we should send it, given we sent the buggy patch.
	Current release - regressions:
	- posix-clock: Fix unbalanced locking in pc_clock_settime()
	- netfilter: fix typo causing some targets not to load on IPv6
	Current release - new code bugs:
	- xfrm: policy: remove last remnants of pernet inexact list
	Previous releases - regressions:
	- core: fix races in netdev_tx_sent_queue()/dev_watchdog()
	- bluetooth: fix UAF on sco_sock_timeout
	- eth: hv_netvsc: fix VF namespace also in synthetic NIC NETDEV_REGISTER event
	- eth: usbnet: fix name regression
	- eth: be2net: fix potential memory leak in be_xmit()
	- eth: plip: fix transmit path breakage
	Previous releases - always broken:
	- sched: deny mismatched skip_sw/skip_hw flags for actions created by classifiers
	- netfilter: bpf: must hold reference on net namespace
	- eth: virtio_net: fix integer overflow in stats
	- eth: bnxt_en: replace ptp_lock with irqsave variant
	- eth: octeon_ep: add SKB allocation failures handling in __octep_oq_process_rx()
	Misc:
	- MAINTAINERS: add Simon as an official reviewer"
	* tag 'net-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
	  net: dsa: mv88e6xxx: support 4000ps cycle counter period
	  net: dsa: mv88e6xxx: read cycle counter period from hardware
	  net: dsa: mv88e6xxx: group cycle counter coefficients
	  net: usb: qmi_wwan: add Fibocom FG132 0x0112 composition
	  hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event
	  net: dsa: microchip: disable EEE for KSZ879x/KSZ877x/KSZ876x
	  Bluetooth: ISO: Fix UAF on iso_sock_timeout
	  Bluetooth: SCO: Fix UAF on sco_sock_timeout
	  Bluetooth: hci_core: Disable works on hci_unregister_dev
	  posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime()
	  r8169: avoid unsolicited interrupts
	  net: sched: use RCU read-side critical section in taprio_dump()
	  net: sched: fix use-after-free in taprio_change()
	  net/sched: act_api: deny mismatched skip_sw/skip_hw flags for actions created by classifiers
	  net: usb: usbnet: fix name regression
	  mlxsw: spectrum_router: fix xa_store() error checking
	  virtio_net: fix integer overflow in stats
	  net: fix races in netdev_tx_sent_queue()/dev_watchdog()
	  net: wwan: fix global oob in wwan_rtnl_policy
	  netfilter: xtables: fix typo causing some targets not to load on IPv6
	  ...
2024-10-23	posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime()	Jinjie Ruan	1	-3/+3
	If get_clock_desc() succeeds, it calls fget() for the clockid's fd and takes the clk->rwsem read lock, so the error path must release the lock to keep the locking balanced and fput() the clockid's fd to keep the refcount balanced and release the fd-related resources. However, the commit below left the lock held on the error path, resulting in unbalanced locking. Check timespec64_valid_strict() before get_clock_desc() to fix it, because "ts" is not changed after that. Fixes: d8794ac20a29 ("posix-clock: Fix missing timespec64 check in pc_clock_settime()") Acked-by: Richard Cochran <[email protected]> Signed-off-by: Jinjie Ruan <[email protected]> Acked-by: Anna-Maria Behnsen <[email protected]> [[email protected]: fixed commit message typo] Signed-off-by: Paolo Abeni <[email protected]>
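The ordering problem described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct demo_clock`, `demo_fields_ok()` and both `demo_settime_*()` functions are hypothetical stand-ins, and a simple counter replaces the real rwsem and fd reference.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a clock descriptor guarded by a lock. */
struct demo_clock {
	int lock_depth; /* > 0 while the simulated read lock is held */
};

/* Simplified validity check on the requested time (assumption, not the kernel helper). */
static bool demo_fields_ok(long long sec, long nsec)
{
	return sec >= 0 && nsec >= 0 && nsec < 1000000000L;
}

/* Buggy ordering: the early return skips the release. */
static int demo_settime_buggy(struct demo_clock *clk, long long sec, long nsec)
{
	clk->lock_depth++;              /* acquire */
	if (!demo_fields_ok(sec, nsec))
		return -EINVAL;         /* the lock is still held here */
	clk->lock_depth--;              /* release */
	return 0;
}

/* Fixed ordering: validate first, so nothing is held on the error path. */
static int demo_settime_fixed(struct demo_clock *clk, long long sec, long nsec)
{
	if (!demo_fields_ok(sec, nsec))
		return -EINVAL;
	clk->lock_depth++;              /* acquire */
	/* ... a real implementation would call the driver hook here ... */
	clk->lock_depth--;              /* release */
	return 0;
}
```

Calling `demo_settime_buggy()` with an invalid time leaves `lock_depth` at 1, while `demo_settime_fixed()` leaves it at 0.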
2024-10-20	Merge tag 'sched_urgent_for_v6.12_rc4' of ↵	Linus Torvalds	1	-0/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
	Pull scheduling fixes from Borislav Petkov:
	- Add PREEMPT_RT maintainers
	- Fix another aspect of delayed dequeued tasks wrt determining their state, i.e., whether they're runnable or blocked
	- Handle delayed dequeued tasks and their migration wrt PSI properly
	- Fix the situation where a delayed dequeue task gets enqueued into a new class, which should not happen
	- Fix a case where memory allocation would happen while the runqueue lock is held, which is a no-no
	- Do not over-schedule when tasks with shorter slices preempt the currently running task
	- Make sure delayed dequeue entities are properly handled before unthrottling
	- Other smaller cleanups and improvements
	* tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
	  MAINTAINERS: Add an entry for PREEMPT_RT.
	  sched/fair: Fix external p->on_rq users
	  sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug
	  sched/core: Dequeue PSI signals for blocked tasks that are delayed
	  sched: Fix delayed_dequeue vs switched_from_fair()
	  sched/core: Disable page allocation in task_tick_mm_cid()
	  sched/deadline: Use hrtick_enabled_dl() before start_hrtick_dl()
	  sched/eevdf: Fix wakeup-preempt by checking cfs_rq->nr_running
	  sched: Fix sched_delayed vs cfs_bandwidth
2024-10-14	posix-clock: Fix missing timespec64 check in pc_clock_settime()	Jinjie Ruan	1	-0/+3
	As Andrew pointed out, it makes sense for the PTP core to check the range of the timespec64 struct's tv_sec and tv_nsec before calling ptp->info->settime64(). As the clock_settime() man page says, if tp.tv_sec is negative or tp.tv_nsec is outside the range [0..999,999,999], it should return EINVAL; this includes the dynamic clocks that handle PTP clocks, and the condition is consistent with timespec64_valid(). As Thomas suggested, timespec64_valid() only checks that the timespec is valid, but does not ensure that the time is in a valid range, so check it up front using timespec64_valid_strict() in pc_clock_settime() and return -EINVAL if it is not valid. Some drivers use tp->tv_sec and tp->tv_nsec directly to write registers without validity checks and assume that the higher layer has checked them, which is dangerous; they will benefit from this change, for example hclge_ptp_settime(), igb_ptp_settime_i210() and _rcar_gen4_ptp_settime(), and some drivers can remove their own checks. Cc: [email protected] Fixes: 0606f422b453 ("posix clocks: Introduce dynamic clocks") Acked-by: Richard Cochran <[email protected]> Suggested-by: Andrew Lunn <[email protected]> Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Jinjie Ruan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
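The difference between a plain validity check and a strict range check might look roughly like the userspace sketch below. All `demo_*` names are hypothetical, and the upper bound `DEMO_SEC_MAX` is an assumption chosen so that the time still fits in a signed 64-bit nanosecond count; the kernel's own helpers may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000LL
/* Assumed bound: largest second count whose nanosecond value fits in int64_t. */
#define DEMO_SEC_MAX (INT64_MAX / DEMO_NSEC_PER_SEC)

struct demo_timespec64 {
	int64_t tv_sec;
	long tv_nsec;
};

/* Field check only: non-negative seconds, nanoseconds in [0, 999999999]. */
static bool demo_ts_valid(const struct demo_timespec64 *ts)
{
	if (ts->tv_sec < 0)
		return false;
	if (ts->tv_nsec < 0 || ts->tv_nsec >= DEMO_NSEC_PER_SEC)
		return false;
	return true;
}

/* Strict check: the fields are valid and the time lies in a representable range. */
static bool demo_ts_valid_strict(const struct demo_timespec64 *ts)
{
	if (!demo_ts_valid(ts))
		return false;
	if (ts->tv_sec >= DEMO_SEC_MAX)
		return false;
	return true;
}
```

A timespec with `tv_sec == INT64_MAX` passes the field check but fails the strict one, which is the kind of input a driver writing registers directly would otherwise receive.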
2024-10-14	sched/fair: Fix external p->on_rq users	Peter Zijlstra	1	-0/+6
	Sean noted that ever since commit 152e11f6df29 ("sched/fair: Implement delayed dequeue") KVM's preemption notifiers have started mis-classifying preemption vs blocking. Notably p->on_rq is no longer sufficient to determine if a task is runnable or blocked -- the aforementioned commit introduces tasks that remain on the runqueue even though they will not run again, and should be considered blocked for many cases. Add the task_is_runnable() helper to classify things and audit all external users of the p->on_rq state. Also add a few comments. Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue") Reported-by: Sean Christopherson <[email protected]> Tested-by: Sean Christopherson <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
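One plausible shape for such a classification helper is sketched below. `struct demo_task` is a heavily simplified, hypothetical model with just two fields; the real task structure and the real helper live in the scheduler and are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified view of the two pieces of state involved. */
struct demo_task {
	int on_rq;          /* non-zero while the task is queued on a runqueue */
	bool sched_delayed; /* still queued, but its dequeue was delayed: it will not run again */
};

/* A task counts as runnable only if it is queued and not a delayed-dequeue leftover. */
static bool demo_task_is_runnable(const struct demo_task *p)
{
	return p->on_rq && !p->sched_delayed;
}
```

With this model, a caller that used to test `on_rq` alone would treat a delayed-dequeue task as preempted, whereas the helper reports it as blocked.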
2024-09-27	[tree-wide] finally take no_llseek out	Al Viro	1	-1/+0
	no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek")
	To quote that commit,
	  At -rc1 we'll need do a mechanical removal of no_llseek -
	  git grep -l -w no_llseek | grep -v porting.rst | while read i; do
	  	sed -i '/\<no_llseek\>/d' $i
	  done
	  would do it.
	Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe.
	Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2024-09-19	Merge tag 'sched-core-2024-09-19' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
	Pull scheduler updates from Ingo Molnar:
	- Implement the SCHED_DEADLINE server infrastructure - Daniel Bristot de Oliveira's last major contribution to the kernel:
	  "SCHED_DEADLINE servers can help fixing starvation issues of low priority tasks (e.g., SCHED_OTHER) when higher priority tasks monopolize CPU cycles. Today we have RT Throttling; DEADLINE servers should be able to replace and improve that."
	  (Daniel Bristot de Oliveira, Peter Zijlstra, Joel Fernandes, Youssef Esmat, Huang Shijie)
	- Preparatory changes for sched_ext integration:
	    - Use set_next_task(.first) where required
	    - Fix up set_next_task() implementations
	    - Clean up DL server vs. core sched
	    - Split up put_prev_task_balance()
	    - Rework pick_next_task()
	    - Combine the last put_prev_task() and the first set_next_task()
	    - Rework dl_server
	    - Add put_prev_task(.next)
	  (Peter Zijlstra, with a fix by Tejun Heo)
	- Complete the EEVDF transition and refine EEVDF scheduling:
	    - Implement delayed dequeue
	    - Allow shorter slices to wakeup-preempt
	    - Use sched_attr::sched_runtime to set request/slice suggestion
	    - Document the new feature flags
	    - Remove unused and duplicate-functionality fields
	    - Simplify & unify pick_next_task_fair()
	    - Misc debuggability enhancements
	  (Peter Zijlstra, with fixes/cleanups by Dietmar Eggemann, Valentin Schneider and Chuyi Zhou)
	- Initialize the vruntime of a new task when it is first enqueued, resulting in significant decrease in latency of newly woken tasks (Zhang Qiao)
	- Introduce SM_IDLE and an idle re-entry fast-path in __schedule() (K Prateek Nayak, Peter Zijlstra)
	- Clean up and clarify the usage of rt_task() (Qais Yousef)
	- Preempt SCHED_IDLE entities in strict cgroup hierarchies (Tianchen Ding)
	- Clarify the documentation of time units for deadline scheduler parameters (Christian Loehle)
	- Remove the HZ_BW chicken-bit feature flag introduced a year ago, the original change seems to be working fine (Phil Auld)
	- Misc fixes and cleanups (Chen Yu, Dan Carpenter, Huang Shijie, Peilin He, Qais Yousef and Vincent Guittot)
	* tag 'sched-core-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
	  sched/cpufreq: Use NSEC_PER_MSEC for deadline task
	  cpufreq/cppc: Use NSEC_PER_MSEC for deadline task
	  sched/deadline: Clarify nanoseconds in uapi
	  sched/deadline: Convert schedtool example to chrt
	  sched/debug: Fix the runnable tasks output
	  sched: Fix sched_delayed vs sched_core
	  kernel/sched: Fix util_est accounting for DELAY_DEQUEUE
	  kthread: Fix task state in kthread worker if being frozen
	  sched/pelt: Use rq_clock_task() for hw_pressure
	  sched/fair: Move effective_cpu_util() and effective_cpu_util() in fair.c
	  sched/core: Introduce SM_IDLE and an idle re-entry fast-path in __schedule()
	  sched: Add put_prev_task(.next)
	  sched: Rework dl_server
	  sched: Combine the last put_prev_task() and the first set_next_task()
	  sched: Rework pick_next_task()
	  sched: Split up put_prev_task_balance()
	  sched: Clean up DL server vs core sched
	  sched: Fixup set_next_task() implementations
	  sched: Use set_next_task(.first) where required
	  sched/fair: Properly deactivate sched_delayed task upon class change
	  ...
2024-09-17	Merge tag 'timers-core-2024-09-16' of ↵	Linus Torvalds	10	-194/+202
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
	Pull timer updates from Thomas Gleixner:
	"Core:
	- Overhaul of posix-timers in preparation of removing the workaround for periodic timers which have signal delivery ignored.
	- Remove the historical extra jiffy in msleep()
	  msleep() adds an extra jiffy to the timeout value to ensure minimal sleep time. The timer wheel ensures minimal sleep time since the large rewrite to a non-cascading wheel, but the extra jiffy in msleep() remained unnoticed. Remove it.
	- Make the timer slack handling correct for realtime tasks.
	  The procfs interface is inconsistent and neither reflects reality nor conforms to the man page. Show the correct 0 slack for real time tasks and enforce it at the core level instead of having inconsistent individual checks in various timer setup functions.
	- The usual set of updates and enhancements all over the place.
	Drivers:
	- Allow the ACPI PM timer to be turned off during suspend
	- No new drivers
	- The usual updates and enhancements in various drivers"
	* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
	  ntp: Make sure RTC is synchronized when time goes backwards
	  treewide: Fix wrong singular form of jiffies in comments
	  cpu: Use already existing usleep_range()
	  timers: Rename next_expiry_recalc() to be unique
	  platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
	  clocksource/drivers/jcore: Use request_percpu_irq()
	  clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
	  clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
	  clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
	  clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
	  platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
	  clocksource: acpi_pm: Add external callback for suspend/resume
	  clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
	  dt-bindings: timer: rockchip: Add rk3576 compatible
	  timers: Annotate possible non critical data race of next_expiry
	  timers: Remove historical extra jiffie for timeout in msleep()
	  hrtimer: Use and report correct timerslack values for realtime tasks
	  hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
	  timers: Add sparse annotation for timer_sync_wait_running().
	  signal: Replace BUG_ON()s
	  ...
2024-09-17	Merge tag 'irq-core-2024-09-16' of ↵	Linus Torvalds	2	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
	Pull irq updates from Thomas Gleixner:
	"Core:
	- Remove a global lock in the affinity setting code
	  The lock protects a cpumask for intermediate results and the lock causes a bottleneck on simultaneous start of multiple virtual machines. Replace the lock and the static cpumask with a per CPU cpumask which is nicely serialized by raw spinlock held when executing this code.
	- Provide support for giving a suffix to interrupt domain names.
	  That's required to support devices with subfunctions so that the domain names are distinct even if they originate from the same device node.
	- The usual set of cleanups and enhancements all over the place
	Drivers:
	- Support for LoongArch AVEC interrupt chip
	- Refurbishment of the Armada driver so it can be extended for new variants.
	- The usual set of cleanups and enhancements all over the place"
	* tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
	  genirq: Use cpumask_intersects()
	  genirq/cpuhotplug: Use cpumask_intersects()
	  irqchip/apple-aic: Only access system registers on SoCs which provide them
	  irqchip/apple-aic: Add a new "Global fast IPIs only" feature level
	  irqchip/apple-aic: Skip unnecessary enabling of use_fast_ipi
	  dt-bindings: apple,aic: Document A7-A11 compatibles
	  irqdomain: Use IS_ERR_OR_NULL() in irq_domain_trim_hierarchy()
	  genirq/msi: Use kmemdup_array() instead of kmemdup()
	  genirq/proc: Change the return value for set affinity permission error
	  genirq/proc: Use irq_move_pending() in show_irq_affinity()
	  genirq/proc: Correctly set file permissions for affinity control files
	  genirq: Get rid of global lock in irq_do_set_affinity()
	  genirq: Fix typo in struct comment
	  irqchip/loongarch-avec: Add AVEC irqchip support
	  irqchip/loongson-pch-msi: Prepare get_pch_msi_handle() for AVECINTC
	  irqchip/loongson-eiointc: Rename CPUHP_AP_IRQ_LOONGARCH_STARTING
	  LoongArch: Architectural preparation for AVEC irqchip
	  LoongArch: Move irqchip function prototypes to irq-loongson.h
	  irqchip/loongson-pch-msi: Switch to MSI parent domains
	  softirq: Remove unused 'action' parameter from action callback
	  ...
2024-09-17	Merge tag 'timers-clocksource-2024-09-16' of ↵	Linus Torvalds	1	-13/+32
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
	Pull clocksource watchdog updates from Thomas Gleixner:
	- Make the uncertainty margin handling more robust to prevent false positives
	- Clarify comments
	* tag 'timers-clocksource-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
	  clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin
	  clocksource: Fix comments on WATCHDOG_THRESHOLD & WATCHDOG_MAX_SKEW
	  clocksource: Improve comments for watchdog skew bounds
2024-09-10	ntp: Make sure RTC is synchronized when time goes backwards	Benjamin ROBIN	3	-4/+14
	sync_hw_clock() is normally called every 11 minutes when time is synchronized. The issue is that this periodic timer uses the REALTIME clock, so when time moves backwards (the NTP server jumps into the past), the timer expires late. If the timer expires late, which can be days later, the RTC will no longer be updated, which is an issue if the device is abruptly powered OFF during this period. When the device restarts (when powered ON), it will have the date prior to the ADJ_SETOFFSET call. A normal NTP server should not jump into the past like that, but it is possible... Another way of reproducing this issue is to use phc2sys to synchronize the REALTIME clock with, for example, an IRIG timecode with the source always starting at the same date (not synchronized). Also, if the time jumps into the future by less than 11 minutes, the RTC may not be updated immediately (minor issue).
	Consider the following scenario:
	- Time is synchronized, and sync_hw_clock() was just called (the timer expires in 11 minutes).
	- The time jumps into the future by a couple of minutes.
	- The time is synchronized again.
	- Users may expect the RTC to be updated as soon as possible, and not after 11 minutes (for the same reason, if a power loss occurs in this period).
	Cancel the periodic timer on any time jump (ADJ_SETOFFSET) greater than or equal to 1s. The timer will be relaunched at the end of do_adjtimex() if NTP is still considered synced. Otherwise the timer will be relaunched later when NTP is synced. This way, when the time is synchronized again, the RTC is updated after less than 2 seconds.
	Signed-off-by: Benjamin ROBIN <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-09-10	Merge branch 'linus' into timers/core	Thomas Gleixner	4	-8/+8
	To update with the latest fixes.
2024-09-08	treewide: Fix wrong singular form of jiffies in comments	Anna-Maria Behnsen	5	-11/+11
	Several comments all over the place use a wrong singular form of jiffies. Replace 'jiffie' by 'jiffy'. No functional change. Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> # m68k Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
2024-09-08	timers: Rename next_expiry_recalc() to be unique	Anna-Maria Behnsen	1	-3/+3
	next_expiry_recalc is the name of a function as well as the name of a struct member of struct timer_base. This might lead to confusion. Rename next_expiry_recalc() to timer_recalc_next_expiry(). No functional change. Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-1-e98760256370@linutronix.de
2024-09-04	timers: Annotate possible non critical data race of next_expiry	Anna-Maria Behnsen	1	-5/+37
	Global timers could be expired remotely when the target CPU is idle. After a remote timer expiry, the remote timer_base->next_expiry value is updated while holding the timer_base->lock. When the formerly idle CPU becomes active at the same time and checks whether timers need to expire, this check is done lockless as it is on the local CPU. This could lead to a data race, which was reported by syzbot: https://lore.kernel.org/r/[email protected]
	When the value is read lockless but changed by the remote CPU, only two non critical scenarios could happen:
	1) The already updated value is read -> everything is perfect
	2) The old value is read -> a superfluous timer soft interrupt is raised
	The same situation could happen when enqueueing a new first pinned timer by a remote CPU, also with non critical scenarios:
	1) The already updated value is read -> everything is perfect
	2) The old value is read -> when the CPU is idle, an IPI is executed nevertheless and when the CPU isn't idle, the updated value will be visible on the next tick and the timer might be late by one jiffy.
	As this is very unlikely to happen, the overhead of doing the check under the lock costs far more than a superfluous timer soft interrupt or a possible one-jiffy delay of the timer. Document and annotate this non critical behavior in the code by using a READ/WRITE_ONCE() pair when accessing timer_base->next_expiry.
	Reported-by: [email protected] Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/all/[email protected] Closes: https://lore.kernel.org/lkml/[email protected]
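The annotated access pattern can be approximated in userspace as follows. The two macros are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() built on a volatile cast (GNU C `__typeof__` is assumed), and `struct demo_timer_base` with its two functions is hypothetical; locking on the writer side is omitted.

```c
/* Userspace approximations of the kernel macros; GNU C __typeof__ is assumed. */
#define DEMO_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical, minimal timer base with only the racy field. */
struct demo_timer_base {
	unsigned long next_expiry; /* in jiffies */
};

/* Remote side: would run under the base lock in real code (lock omitted here). */
static void demo_update_expiry(struct demo_timer_base *base, unsigned long when)
{
	DEMO_WRITE_ONCE(base->next_expiry, when);
}

/*
 * Local side: lockless check. Reading a stale value is tolerated; per the
 * text above it only costs a superfluous softirq or a one-jiffy delay.
 */
static int demo_timer_pending(struct demo_timer_base *base, unsigned long now)
{
	/* The signed difference keeps the comparison correct across wrap-around. */
	return (long)(now - DEMO_READ_ONCE(base->next_expiry)) >= 0;
}
```

The macros do not add ordering or atomicity beyond a single access; they only document that the race is intentional and stop the compiler from tearing or re-reading the value.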
2024-08-29	timers: Remove historical extra jiffie for timeout in msleep()	Anna-Maria Behnsen	1	-2/+2
	msleep() and msleep_interruptible() add a jiffy to the requested timeout. This extra jiffy was introduced to ensure that the timeout will not happen earlier than specified. Since the rework of the timer wheel, the enqueue path already takes care of this. So the extra jiffy added by msleep*() is pointless now. Remove this extra jiffy in msleep() and msleep_interruptible(). Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Link: https://lore.kernel.org/all/[email protected]
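The effect on the computed timeout can be illustrated with a small sketch. The tick rate `DEMO_HZ` of 250 is an assumption, `demo_msecs_to_jiffies()` is a simplified round-up conversion rather than the kernel's implementation, and the two `demo_msleep_timeout_*()` helpers are hypothetical names for the old and new calculations.

```c
#define DEMO_HZ 250UL /* assumed tick rate: one jiffy is 4 ms */

/* Round milliseconds up to whole jiffies (simplified conversion). */
static unsigned long demo_msecs_to_jiffies(unsigned int ms)
{
	return ((unsigned long)ms * DEMO_HZ + 999UL) / 1000UL;
}

/* Old behaviour: one extra jiffy added on top of the converted timeout. */
static unsigned long demo_msleep_timeout_old(unsigned int ms)
{
	return demo_msecs_to_jiffies(ms) + 1;
}

/* New behaviour: the enqueue path is relied on to guarantee the minimum sleep. */
static unsigned long demo_msleep_timeout_new(unsigned int ms)
{
	return demo_msecs_to_jiffies(ms);
}
```

Under these assumptions a 10 ms request becomes 4 jiffies with the old calculation and 3 jiffies with the new one.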
2024-08-23	hrtimer: Use and report correct timerslack values for realtime tasks	Felix Moessbauer	1	-15/+3
	The timerslack_ns setting is used to specify how much the hardware timers should be delayed, to potentially dispatch multiple timers in a single interrupt. This is a performance optimization. Timers of realtime tasks (having a realtime scheduling policy) should not be delayed. This logic was inconsistently applied to the hrtimers, leading to delays of realtime tasks which used timed waits for events (e.g. condition variables). Due to the downstream override of the slack for rt tasks, the procfs reported incorrect (non-zero) timerslack_ns values. This is changed by setting the timer_slack_ns task attribute to 0 for all tasks with a rt policy. By that, downstream users do not need to specially handle rt tasks (w.r.t. the slack), and the procfs entry shows the correct value of "0". Setting non-zero slack values (either via procfs or PR_SET_TIMERSLACK) on tasks with a rt policy is ignored, as stated in "man 2 PR_SET_TIMERSLACK": Timer slack is not applied to threads that are scheduled under a real-time scheduling policy (see sched_setscheduler(2)). The special handling of timerslack on rt tasks in downstream users is removed as well. Signed-off-by: Felix Moessbauer <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-08-20	softirq: Remove unused 'action' parameter from action callback	Caleb Sander Mateos	2	-2/+2
	When soft interrupt actions are called, they are passed a pointer to the struct softirq_action which contains the action's function pointer. This pointer isn't useful, as the action callback already knows what function it is. And since each callback handles a specific soft interrupt, the callback also knows which soft interrupt number is running. No soft interrupt action callback actually uses this parameter, so remove it from the function pointer signature. This clarifies that soft interrupt actions are global routines and makes it slightly cheaper to call them. Signed-off-by: Caleb Sander Mateos <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-08-14	hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.	Sebastian Andrzej Siewior	1	-0/+2
	The two hrtimer_cpu_base_.*_expiry() functions are wrappers around the locking functions and sparse complains about the missing counterpart. Add sparse annotation to denote that this behaviour is expected. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-08-14	timers: Add sparse annotation for timer_sync_wait_running().	Sebastian Andrzej Siewior	1	-0/+2
	timer_sync_wait_running() first releases two locks and then acquires them again. This is unexpected and sparse complains about it. Add sparse annotation for timer_sync_wait_running() to note that the locking is expected. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-08-07	sched/rt: Rename realtime_{prio, task}() to rt_or_dl_{prio, task}()	Qais Yousef	1	-3/+3
	Some find the name realtime overloaded. Use rt_or_dl() as an alternative, hopefully better, name. Suggested-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Qais Yousef <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-08-07	sched/rt: Clean up usage of rt_task()	Qais Yousef	1	-3/+3
	rt_task() checks if a task has RT priority. But depending on your dictionary, this could mean it belongs to the RT class, or is a 'realtime' task, which includes the RT and DL classes. Since this has already caused some confusion in discussion [1], a cleanup seemed due. I define the usage of rt_task() to be tasks that belong to the RT class. Make sure that it returns true only for the RT class, audit the users, and replace the ones that required the old behavior with the new realtime_task(), which returns true for the RT and DL classes. Introduce a similar realtime_prio() to create the same distinction from rt_prio(), and update the users that required the old behavior to use the new function. Move MAX_DL_PRIO to prio.h so it can be used in the new definitions. Document the functions to make the difference between them more obvious. PI-boosted tasks are a factor that must be taken into account when choosing which function to use. Rename task_is_realtime() to realtime_task_policy() as the old name is confusing against the new realtime_task(). No functional changes were intended. [1] https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Qais Yousef <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Phil Auld <[email protected]> Reviewed-by: "Steven Rostedt (Google)" <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-08-05	timekeeping: Fix bogus clock_was_set() invocation in do_adjtimex()	Thomas Gleixner	1	-1/+1
	The addition of the bases argument to clock_was_set() fixed up all call sites correctly except for do_adjtimex(). This uses CLOCK_REALTIME instead of CLOCK_SET_WALL as argument. CLOCK_REALTIME is 0. As a result the effect of that clock_was_set() notification is incomplete and might result in timers expiring late because the hrtimer code does not re-evaluate the affected clock bases. Use CLOCK_SET_WALL instead of CLOCK_REALTIME to tell the hrtimers code which clock bases need to be re-evaluated. Fixes: 17a1b8826b45 ("hrtimer: Add bases argument to clock_was_set()") Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/877ccx7igo.ffs@tglx
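Why passing a clock id where a bitmask is expected goes wrong can be shown with a toy model. The base indices, `DEMO_CLOCK_SET_WALL` and `demo_bases_reevaluated()` are all hypothetical; only the fact that CLOCK_REALTIME is 0 comes from the text above.

```c
/* Hypothetical clock base indices for this sketch. */
enum demo_base {
	DEMO_BASE_MONOTONIC,
	DEMO_BASE_REALTIME,
	DEMO_BASE_BOOTTIME,
	DEMO_BASE_TAI,
	DEMO_BASE_COUNT
};

#define DEMO_BIT(n) (1U << (n))

/* A clock id, not a bitmask: its value is 0, as stated in the commit message. */
#define DEMO_CLOCK_REALTIME 0
/* Assumed mask of the wall-clock-related bases in this model. */
#define DEMO_CLOCK_SET_WALL (DEMO_BIT(DEMO_BASE_REALTIME) | DEMO_BIT(DEMO_BASE_TAI))

/* Count how many clock bases a notification with this mask would re-evaluate. */
static int demo_bases_reevaluated(unsigned int bases)
{
	int n = 0;

	for (int i = 0; i < DEMO_BASE_COUNT; i++) {
		if (bases & DEMO_BIT(i))
			n++;
	}
	return n;
}
```

In this model the clock id selects no base at all, while the mask selects the two wall-clock bases, which mirrors why timers could expire late after the mistaken call.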
2024-08-05	ntp: Safeguard against time_constant overflow	Justin Stitt	1	-3/+2
	Using syzkaller with the recently reintroduced signed integer overflow sanitizer produces this UBSAN report: UBSAN: signed-integer-overflow in ../kernel/time/ntp.c:738:18 9223372036854775806 + 4 cannot be represented in type 'long' Call Trace: handle_overflow+0x171/0x1b0 __do_adjtimex+0x1236/0x1440 do_adjtimex+0x2be/0x740 The user supplied time_constant value is incremented by four and then clamped to the operating range. Before commit eea83d896e31 ("ntp: NTP4 user space bits update") the user supplied value was sanity checked to be in the operating range. That change removed the sanity check and relied on clamping after incrementing which does not work correctly when the user supplied value is in the overflow zone of the '+ 4' operation. The operation requires CAP_SYS_TIME and the side effect of the overflow is NTP getting out of sync. Similar to the fixups for time_maxerror and time_esterror, clamp the user space supplied value to the operating range. [ tglx: Switch to clamping ] Fixes: eea83d896e31 ("ntp: NTP4 user space bits update") Signed-off-by: Justin Stitt <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/[email protected] Closes: https://github.com/KSPP/linux/issues/352
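The clamp-before-increment idea can be sketched as below. `DEMO_MAXTC` (taken here as 10) and both helper names are assumptions for illustration, and the increment is applied unconditionally to keep the sketch short.

```c
#include <limits.h>

#define DEMO_MAXTC 10L /* assumed upper bound of the operating range */

/* Restrict v to the closed interval [lo, hi]. */
static long demo_clamp(long v, long lo, long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp the user value first; only then is "+ 4" guaranteed not to overflow. */
static long demo_time_constant(long user_value)
{
	long tc = demo_clamp(user_value, 0, DEMO_MAXTC);

	tc += 4; /* safe: tc is at most DEMO_MAXTC here */
	return demo_clamp(tc, 0, DEMO_MAXTC);
}
```

With the value from the UBSAN report (LONG_MAX - 1 on a 64-bit system), the first clamp reduces the input to the upper bound before the addition, so no signed overflow can occur.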
2024-08-05	ntp: Clamp maxerror and esterror to operating range	Justin Stitt	1	-2/+2
	Using syzkaller alongside the newly reintroduced signed integer overflow sanitizer spits out this report: UBSAN: signed-integer-overflow in ../kernel/time/ntp.c:461:16 9223372036854775807 + 500 cannot be represented in type 'long' Call Trace: handle_overflow+0x171/0x1b0 second_overflow+0x2d6/0x500 accumulate_nsecs_to_secs+0x60/0x160 timekeeping_advance+0x1fe/0x890 update_wall_time+0x10/0x30 time_maxerror is unconditionally incremented and the result is checked against NTP_PHASE_LIMIT, but the increment itself can overflow, resulting in wrap-around to negative space. Before commit eea83d896e31 ("ntp: NTP4 user space bits update") the user supplied value was sanity checked to be in the operating range. That change removed the sanity check and relied on clamping in handle_overflow() which does not work correctly when the user supplied value is in the overflow zone of the '+ 500' operation. The operation requires CAP_SYS_TIME and the side effect of the overflow is NTP getting out of sync. Miroslav confirmed that the input value should be clamped to the operating range and the same applies to time_esterror. The latter is not used by the kernel, but the value still should be in the operating range as it was before the sanity check got removed. Clamp them to the operating range. [ tglx: Changed it to clamping and included time_esterror ] Fixes: eea83d896e31 ("ntp: NTP4 user space bits update") Signed-off-by: Justin Stitt <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Miroslav Lichvar <[email protected]> Link: https://lore.kernel.org/all/[email protected] Closes: https://github.com/KSPP/linux/issues/354
2024-08-02	clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin	Paul E. McKenney	1	-4/+5
	Right now, cs_watchdog_read() does clocksource sanity checks based on WATCHDOG_MAX_SKEW, which sets a floor on any clocksource's .uncertainty_margin. These sanity checks can therefore act inappropriately for clocksources with large uncertainty margins. One reason for a clocksource to have a large .uncertainty_margin is when that clocksource has long read-out latency, given that it does not make sense for the .uncertainty_margin to be smaller than the read-out latency. With the current checks, cs_watchdog_read() could reject all normal reads from a clocksource with long read-out latencies, such as those from legacy clocksources that are no longer implemented in hardware. Therefore, recast the cs_watchdog_read() checks in terms of the .uncertainty_margin values of the clocksources involved in the timespan in question. The first covers two watchdog reads and one cs read, so use twice the watchdog .uncertainty_margin plus that of the cs. The second covers only a pair of watchdog reads, so use twice the watchdog .uncertainty_margin. Reported-by: Borislav Petkov <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
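The two bounds described above reduce to simple arithmetic. The function names and the nanosecond units are assumptions made for this sketch; the real checks sit inside cs_watchdog_read().

```c
#include <stdint.h>

/* Bound for the span covering watchdog read, clocksource read, watchdog read. */
static int64_t demo_margin_wd_cs_wd(int64_t wd_margin_ns, int64_t cs_margin_ns)
{
	return 2 * wd_margin_ns + cs_margin_ns;
}

/* Bound for the span covering only a pair of watchdog reads. */
static int64_t demo_margin_wd_wd(int64_t wd_margin_ns)
{
	return 2 * wd_margin_ns;
}

/* A measured delay is acceptable when it does not exceed the computed bound. */
static int demo_delay_acceptable(int64_t delay_ns, int64_t margin_ns)
{
	return delay_ns <= margin_ns;
}
```

For example, with a watchdog margin of 100000 ns and a clocksource margin of 400000 ns, the first bound is 600000 ns and the second is 200000 ns, so a slow-to-read clocksource is no longer rejected by a fixed global threshold.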
2024-08-02	clocksource: Fix comments on WATCHDOG_THRESHOLD & WATCHDOG_MAX_SKEW	Paul E. McKenney	1	-1/+7
	The WATCHDOG_THRESHOLD macro is no longer used to supply a default value for ->uncertainty_margin, but WATCHDOG_MAX_SKEW now is. Therefore, update the comments to reflect this change. Reported-by: Borislav Petkov <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-08-02	clocksource: Improve comments for watchdog skew bounds	Borislav Petkov	1	-8/+20
	Add more detail on the rationale for bounding the clocksource ->uncertainty_margin below at about 500ppm. Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
2024-08-02	clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()	Paul E. McKenney	1	-1/+1
	The current "nretries > 1 || nretries >= max_retries" check in cs_watchdog_read() will always evaluate to true, and thus pr_warn(), if nretries is greater than 1. The intent is instead to never warn on the first try, but otherwise warn if the successful retry was the last retry. Therefore, change that "||" to "&&". Fixes: db3a34e17433 ("clocksource: Retry clock read if long delays detected") Reported-by: Borislav Petkov <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/[email protected]
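	The difference between the two predicates is easy to see side by side. The helper names below are hypothetical; the expressions are the ones quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Old check: true for every nretries > 1, whatever max_retries is. */
static bool demo_warn_old(int nretries, int max_retries)
{
	return nretries > 1 || nretries >= max_retries;
}

/* Fixed check: warn only when a retry happened and it was the last one. */
static bool demo_warn_new(int nretries, int max_retries)
{
	assert(max_retries >= 1);
	return nretries > 1 && nretries >= max_retries;
}
```

	For example, with `max_retries` of 3 a success on the second attempt warns under the old check but not under the fixed one, while a success on the third attempt warns under both.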
2024-07-31	tick/broadcast: Move per CPU pointer access into the atomic section	Thomas Gleixner	1	-1/+2
	The recent fix for making the take over of the broadcast timer more reliable retrieves a per CPU pointer in preemptible context. This went unnoticed as compilers hoist the access into the non-preemptible region where the pointer is actually used. But of course it's valid that the compiler keeps it at the place where the code puts it which rightfully triggers: BUG: using smp_processor_id() in preemptible [00000000] code: caller is hotplug_cpu__broadcast_tick_pull+0x1c/0xc0 Move it to the actual usage site which is in a non-preemptible region. Fixes: f7d43dd206e7 ("tick/broadcast: Make takeover of broadcast hrtimer reliable") Reported-by: David Wang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Yu Liao <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/87ttg56ers.ffs@tglx
2024-07-29	posix-timers: Consolidate signal queueing	Thomas Gleixner	4	-19/+15
	Rename posix_timer_event() to posix_timer_queue_signal() as this is what the function is about. Consolidate the requeue pending and deactivation updates into that function as there is no point in doing this in all incarnations of posix timers. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Make k_itimer::it_active consistent	Thomas Gleixner	1	-0/+4
	Posix CPU timers are not updating k_itimer::it_active which makes it impossible to base decisions in the common posix timer code on it. Update it when queueing or dequeueing posix CPU timers. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Anna-Maria Behnsen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-timers: Consolidate timer setup	Thomas Gleixner	3	-20/+21
	hrtimer based and CPU timers have their own way to install the new interval and to reset overrun and signal handling related data. Create a helper function and do the same operation for all variants. This also makes the handling of the interval consistent. It's only stored when the timer is actually armed, i.e. timer->it_value != 0. Before that it was stored unconditionally for posix CPU timers and conditionally for the other posix timers. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-timers: Convert timer list to hlist	Thomas Gleixner	1	-11/+8
	No requirement for a real list. Spare a few bytes. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
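	Where the saved bytes come from can be illustrated with user-space copies of the two head layouts. The `demo_` types below mirror the shape of the kernel's `list_head` and `hlist_head`/`hlist_node`, but they are stand-ins written for this sketch, and which structure embeds the head is not stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked circular list: the head carries two pointers. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* hlist: nodes still carry two pointers, but the head needs only one. */
struct demo_hlist_node {
	struct demo_hlist_node *next, **pprev;
};

struct demo_hlist_head {
	struct demo_hlist_node *first;
};

/* Bytes saved per embedded head when a list is converted to an hlist. */
static size_t demo_head_bytes_saved(void)
{
	assert(sizeof(struct demo_list_head) >= sizeof(struct demo_hlist_head));
	return sizeof(struct demo_list_head) - sizeof(struct demo_hlist_head);
}
```

	The trade-off is that an hlist has no direct pointer to its tail, which does not matter when entries are only added at the front and walked from the head.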
2024-07-29	posix-timers: Clear overrun in common_timer_set()	Thomas Gleixner	1	-0/+1
	Keeping the overrun count of the previous setup around is just wrong. The new setting has nothing to do with the previous one and has to start from a clean slate. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-timers: Retrieve interval in common timer_settime() code	Thomas Gleixner	2	-9/+6
	No point in doing this all over the place. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Anna-Maria Behnsen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Simplify posix_cpu_timer_set()	Thomas Gleixner	1	-27/+17
	Avoid the late sighand lock/unlock dance when a timer is not armed to enforce reevaluation of the timer base so that the process wide CPU timer sampling can be disabled. Do it right at the point where the arming decision is made which already has sighand locked. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Anna-Maria Behnsen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Remove incorrect comment in posix_cpu_timer_set()	Thomas Gleixner	1	-6/+1
	A leftover from historical code which describes fiction. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Use @now instead of @val for clarity	Thomas Gleixner	1	-13/+9
	posix_cpu_timer_set() uses @val as variable for the current time. That's confusing at best. Use @now as anywhere else and rewrite the confusing comment about clock sampling. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Do not arm SIGEV_NONE timers	Thomas Gleixner	1	-16/+13
	There is no point in arming SIGEV_NONE timers as they never deliver a signal. timer_gettime() is handling the expiry time correctly and that's all SIGEV_NONE timers care about. Prevent arming them and remove the expiry handler code which just disarms them. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Anna-Maria Behnsen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Replace old expiry retrieval in posix_cpu_timer_set()	Thomas Gleixner	1	-31/+6
	Reuse the split out __posix_cpu_timer_get() function which already does the right thing. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Anna-Maria Behnsen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Handle SIGEV_NONE timers correctly in timer_set()	Thomas Gleixner	1	-1/+11
	Expired SIGEV_NONE oneshot timers must return 0 nsec for the expiry time in timer_get(), but the posix CPU timer implementation returns 1 nsec. Add the missing conditional. This will be cleaned up in a follow up patch. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Handle SIGEV_NONE timers correctly in timer_get()	Thomas Gleixner	1	-5/+9
	Expired SIGEV_NONE oneshot timers must return 0 nsec for the expiry time in timer_get(), but the posix CPU timer implementation returns 1 nsec. Add the missing conditional. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Handle interval timers correctly in timer_get()	Thomas Gleixner	1	-1/+17
	timer_gettime() must return the remaining time to the next expiry of a timer or 0 if the timer is not armed and no signal pending, but posix CPU timers fail to forward a timer which is already expired. Add the required logic to address that. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
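	Forwarding an already expired interval timer can be sketched as plain arithmetic. This is an assumption-laden simplification, not the kernel implementation: `demo_remaining()` is a hypothetical helper, times are bare nanosecond counters, an expiry of 0 means "not armed", and an expired one-shot timer simply reports 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the time left until the next expiry.
 * An expired periodic timer is first moved forward by whole intervals
 * until its expiry lies in the future again.
 */
static uint64_t demo_remaining(uint64_t now, uint64_t expires, uint64_t interval)
{
	if (expires == 0)
		return 0;		/* not armed */
	if (expires > now)
		return expires - now;	/* not yet expired */
	if (interval == 0)
		return 0;		/* expired one-shot (simplified) */

	/* Intervals that fully elapsed since expiry, plus the one in progress. */
	uint64_t missed = (now - expires) / interval + 1;

	/* Guard the sketch against wrap-around in the multiplication. */
	assert(missed <= (UINT64_MAX - expires) / interval);

	expires += missed * interval;
	return expires - now;
}
```

	Without the forwarding step, a periodic timer whose expiry has passed would report a stale value rather than the distance to its next expiry.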
2024-07-29	posix-cpu-timers: Save interval only for armed timers	Thomas Gleixner	1	-8/+6
	There is no point to return the interval for timers which have been disarmed. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-29	posix-cpu-timers: Split up posix_cpu_timer_get()	Thomas Gleixner	1	-27/+24
	In preparation for addressing issues in the timer_get() and timer_set() functions of posix CPU timers. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Reviewed-by: Anna-Maria Behnsen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2024-07-27	Merge tag 'timers-urgent-2024-07-26' of ↵	Linus Torvalds	2	-205/+215
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer migration updates from Thomas Gleixner: "Fixes and minor updates for the timer migration code: - Stop testing the group->parent pointer as it is not guaranteed to be stable over a chain of operations by design. This includes a warning which would be nice to have but it produces false positives due to the racy nature of the check. - Plug a race between CPUs going in and out of idle and a CPU hotplug operation. The latter can create and connect a new hierarchy level which is missed in the concurrent updates of CPUs which go into idle. As a result the events of such a CPU might not be processed and timers go stale. Cure it by splitting the hotplug operation into a prepare and online callback. The prepare callback is guaranteed to run on an online and therefore active CPU. This CPU updates the hierarchy and being online ensures that there is always at least one migrator active which handles the modified hierarchy correctly when going idle. The online callback which runs on the incoming CPU then just marks the CPU active and brings it into operation. - Improve tracing and polish the code further so it is more obvious what's going on" * tag 'timers-urgent-2024-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/migration: Fix grammar in comment timers/migration: Spare write when nothing changed timers/migration: Rename childmask by groupmask to make naming more obvious timers/migration: Read childmask and parent pointer in a single place timers/migration: Use a single struct for hierarchy walk data timers/migration: Improve tracing timers/migration: Move hierarchy setup into cpuhotplug prepare callback timers/migration: Do not rely always on group->parent
2024-07-24	sysctl: treewide: constify the ctl_table argument of proc_handlers	Joel Granados	1	-1/+1
	const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. This patch has been generated by the following coccinelle script:

```
virtual patch

@r1@
identifier ctl, write, buffer, lenp, ppos;
identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
@@

int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
  ,int write, void *buffer, size_t *lenp, loff_t *ppos);

@r2@
identifier func, ctl, write, buffer, lenp, ppos;
@@

int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
  ,int write, void *buffer, size_t *lenp, loff_t *ppos)
{ ... }

@r3@
identifier func;
@@

int func(
- struct ctl_table *
+ const struct ctl_table *
  ,int , void *, size_t *, loff_t *);

@r4@
identifier func, ctl;
@@

int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
  ,int , void *, size_t *, loff_t *);

@r5@
identifier func, write, buffer, lenp, ppos;
@@

int func(
- struct ctl_table *
+ const struct ctl_table *
  ,int write, void *buffer, size_t *lenp, loff_t *ppos);
```

	* Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax were adjusted.
	* The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration.

	Co-developed-by: Thomas Weißschuh <[email protected]> Signed-off-by: Thomas Weißschuh <[email protected]> Co-developed-by: Joel Granados <[email protected]> Signed-off-by: Joel Granados <[email protected]>
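	What the const qualification buys can be shown with a user-space analogue. Everything below is hypothetical (`demo_ctl_table`, `demo_handler`, `demo_table`) and far simpler than the kernel's sysctl machinery; it only illustrates that a handler taking a pointer-to-const can operate on a table entry that itself is declared const.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced table entry: a name plus a pointer to the data. */
struct demo_ctl_table {
	const char *procname;
	void *data;
	size_t maxlen;
};

/*
 * The handler takes the entry as pointer-to-const: it may read or write
 * the data the entry points to, but it cannot modify the entry itself.
 */
static int demo_handler(const struct demo_ctl_table *ctl, int write,
			void *buffer, size_t *lenp)
{
	assert(ctl && buffer && lenp);
	if (*lenp < ctl->maxlen)
		return -1;
	if (write)
		memcpy(ctl->data, buffer, ctl->maxlen);
	else
		memcpy(buffer, ctl->data, ctl->maxlen);
	*lenp = ctl->maxlen;
	return 0;
}

static int demo_value = 42;

/* Because handlers accept const entries, the table can be const too. */
static const struct demo_ctl_table demo_table = {
	.procname = "demo_value",
	.data = &demo_value,
	.maxlen = sizeof(demo_value),
};
```

	Passing `&demo_table` to a handler declared with a non-const parameter would discard the qualifier, which is why the signatures have to change before the tables themselves can become const.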
2024-07-22	timers/migration: Fix grammar in comment	Anna-Maria Behnsen	1	-1/+1
	Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-07-22	timers/migration: Spare write when nothing changed	Anna-Maria Behnsen	1	-6/+5
	The wakeup value is written unconditionally in tmigr_cpu_new_timer(). When there was no new next timer expiry that needs to be propagated, then the value that was read before is written. This is not required. Move the write to the place where the wakeup value is changed. Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/r/[email protected]