path: root/kernel
Age | Commit message | Author | Files | Lines
2019-08-28 | posix-cpu-timers: Utilize timerqueue for storage | Thomas Gleixner | 1 | -93/+96
Using a linear O(N) search for timer insertion affects execution time and D-cache footprint badly with a larger number of timers. Switch the storage to a timerqueue which is already used for hrtimers and alarmtimers. It does not affect the size of struct k_itimer as it.alarm is still larger. The extra list head for the expiry list will go away later once the expiry is moved into task work context. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
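As an illustration only (not the actual patch), a minimal kernel-style sketch of what the switch buys: the <linux/timerqueue.h> API keeps timers rbtree-sorted and caches the earliest one, so insertion is O(log N) and "what fires next" is O(1). The surrounding struct and helper names here are hypothetical.

    #include <linux/timerqueue.h>

    /* Hypothetical per-clock container; the real layout lives in
     * include/linux/posix-timers.h and kernel/time/posix-cpu-timers.c. */
    struct cpu_timer_base_sketch {
            struct timerqueue_head tqhead;
    };

    /* Insert a timer: the rbtree-backed timerqueue replaces the linear,
     * O(N) sorted-list walk. Returns true if this timer is now the
     * earliest one. */
    static bool cpu_timer_enqueue(struct cpu_timer_base_sketch *base,
                                  struct timerqueue_node *node, u64 expires)
    {
            node->expires = expires;
            return timerqueue_add(&base->tqhead, node);
    }

    /* The next expiring timer is available without walking anything. */
    static u64 cpu_timer_next_expiry(struct cpu_timer_base_sketch *base)
    {
            struct timerqueue_node *next = timerqueue_getnext(&base->tqhead);

            return next ? next->expires : U64_MAX;
    }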
2019-08-28 | posix-cpu-timers: Move state tracking to struct posix_cputimers | Thomas Gleixner | 2 | -39/+40
Put it where it belongs and clean up the ifdeffery in fork completely. Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Deduplicate rlimit handling | Thomas Gleixner | 1 | -42/+25
Both thread and process expiry functions have the same functionality for sending signals for soft and hard RLIMITs duplicated in 4 different ways. Split it out into a common function and cleanup the callsites. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Remove pointless comparisons | Thomas Gleixner | 1 | -9/+7
The soft RLIMIT expiry code checks whether the soft limit is greater than the hard limit. That's pointless because if the soft RLIMIT is greater than the hard RLIMIT, that code cannot be reached as the hard RLIMIT check comes first and has already killed the process. Remove it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Get rid of 64bit divisions | Thomas Gleixner | 1 | -10/+14
Instead of dividing A to match the units of B it's more efficient to multiply B to match the units of A. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
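A tiny standalone example of the transformation (variable and function names are made up for illustration): instead of dividing the nanosecond-based runtime down to seconds on every check, scale the second-based limit up once.

    #include <stdbool.h>
    #include <stdint.h>

    #define NSEC_PER_SEC 1000000000ULL

    /* Before: a 64-bit division on every comparison, which is costly and
     * needs a helper routine on 32-bit architectures. */
    static bool limit_exceeded_div(uint64_t runtime_ns, uint64_t limit_sec)
    {
            return runtime_ns / NSEC_PER_SEC >= limit_sec;
    }

    /* After: one multiplication converts the limit to nanoseconds.
     * (Assumes the limit is small enough that the product fits in u64.) */
    static bool limit_exceeded_mul(uint64_t runtime_ns, uint64_t limit_sec)
    {
            return runtime_ns >= limit_sec * NSEC_PER_SEC;
    }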
2019-08-28 | posix-cpu-timers: Consolidate timer expiry further | Thomas Gleixner | 1 | -33/+30
With the array based samples and expiry cache, the expiry function can use a loop to collect timers from the clock specific lists. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Get rid of zero checks | Thomas Gleixner | 2 | -32/+15
Deactivation of the expiry cache is done by setting all clock caches to 0. That requires a check for zero in every place which updates the expiry cache: if (cache == 0 || new < cache) cache = new; Use U64_MAX as the deactivated value instead, which allows removing the zero checks when updating the cache and reduces the update to the obvious check: if (new < cache) cache = new; This also removes the weird workaround in do_prlimit() which was required to convert a RLIMIT_CPU value of 0 (immediate expiry) to 1, because handing 0 to the posix CPU timer code would have effectively disarmed it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
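A minimal before/after sketch of the update logic (not the kernel code itself; names are illustrative): with U64_MAX as the 'disarmed' sentinel the special case disappears and a legitimate expiry value of 0 no longer means 'off'.

    #include <stdint.h>

    /* Old scheme: 0 means "disarmed", so every update needs the extra
     * check and an expiry of 0 cannot be expressed. */
    static void update_expiry_old(uint64_t *cache, uint64_t new_exp)
    {
            if (*cache == 0 || new_exp < *cache)
                    *cache = new_exp;
    }

    /* New scheme: UINT64_MAX (the kernel's U64_MAX) means "disarmed";
     * the update collapses to the obvious comparison and a value of 0
     * (immediate expiry) just works. */
    static void update_expiry_new(uint64_t *cache, uint64_t new_exp)
    {
            if (new_exp < *cache)
                    *cache = new_exp;
    }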
2019-08-28 | rlimit: Rewrite non-sensical RLIMIT_CPU comment | Thomas Gleixner | 1 | -4/+3
The comment above the function which arms RLIMIT_CPU in the posix CPU timer code makes no sense at all. It claims that the kernel does not return an error code when it rejected the attempt to set RLIMIT_CPU. That's clearly bogus as the code does an error check and the rlimit is only set and activated when the permission checks are ok. In case of a rejection an appropriate error code is returned. This is a historical and outdated comment which got dragged along even when the rlimit handling code was rewritten. Replace it with an explanation why the setup function is not called when the rlimit value is RLIM_INFINITY and how the 'disarming' is handled. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Respect INFINITY for hard RTTIME limit | Thomas Gleixner | 1 | -1/+1
The RTTIME limit expiry code does not check the hard RTTIME limit for INFINITY, i.e. being disabled. Add it. This could be considered an ABI breakage if something depended on the old behaviour, but it's highly unlikely to have an effect because RLIM_INFINITY is at minimum INT_MAX and the RTTIME limit is in seconds, so the timer would fire after ~68 years. Adding this obviously correct limit check also allows further consolidation of that code and is a prerequisite for cleaning up the 0 based checks and the rlimit setter code. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Switch thread group sampling to array | Thomas Gleixner | 2 | -67/+48
That allows more simplifications in various places. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Restructure expiry array | Thomas Gleixner | 1 | -49/+56
Now that the abused struct task_cputime is gone, it's more natural to bundle the expiry cache and the list head of each clock into a struct and have an array of those structs. Follow the hrtimer naming convention of 'bases', rename the expiry cache to 'nextevt' and adapt all usage sites. This also generates better code; .text size shrinks by 80 bytes. Suggested-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Remove cputime_expires | Thomas Gleixner | 1 | -10/+0
The last users of the magic struct cputime based expiry cache are gone. Remove the leftovers. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Make expiry checks array based | Thomas Gleixner | 1 | -49/+36
The expiry cache is an array indexed by clock ids. The new sample functions allow to retrieve a corresponding array of samples. Convert the fastpath expiry checks to make use of the new sample functions and do the comparisons on the sample and the expiry array. Make the check for the expiry array being zero array based as well. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
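Roughly, the fast-path check then becomes a comparison of two equally indexed arrays. A hedged sketch with simplified names (combined with the U64_MAX 'disarmed' value from the entry above, so an inactive slot can never compare as expired):

    #include <stdbool.h>
    #include <stdint.h>

    enum { CPUCLOCK_PROF, CPUCLOCK_VIRT, CPUCLOCK_SCHED, CPUCLOCK_MAX };

    /* A timer of clock i has expired once samples[i] reaches the cached
     * next-expiry value; a disarmed slot holds UINT64_MAX. */
    static bool cputimers_expired(const uint64_t *samples,
                                  const uint64_t *nextevt)
    {
            for (int i = 0; i < CPUCLOCK_MAX; i++) {
                    if (samples[i] >= nextevt[i])
                            return true;
            }
            return false;
    }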
2019-08-28 | posix-cpu-timers: Provide array based sample functions | Thomas Gleixner | 1 | -0/+26
Instead of using task_cputime and doing the addition of utime and stime at all call sites, it's way simpler to have a sample array which allows indexed based checks against the expiry cache array. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
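Conceptually, each sample array uses the same indexing as the expiry cache; an illustrative helper (name and parameters are hypothetical, the kernel fills the array from the task respectively the signal struct):

    #include <stdint.h>

    enum { CPUCLOCK_PROF, CPUCLOCK_VIRT, CPUCLOCK_SCHED, CPUCLOCK_MAX };

    /* One sample per clock id, so callers can compare samples[i] against
     * the expiry cache slot with the same index. */
    static void fill_cputime_samples(uint64_t utime, uint64_t stime,
                                     uint64_t sum_exec_runtime,
                                     uint64_t samples[CPUCLOCK_MAX])
    {
            samples[CPUCLOCK_PROF]  = stime + utime;
            samples[CPUCLOCK_VIRT]  = utime;
            samples[CPUCLOCK_SCHED] = sum_exec_runtime;
    }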
2019-08-28 | posix-cpu-timers: Switch check_*_timers() to array cache | Thomas Gleixner | 1 | -15/+11
Use the array based expiry cache in check_thread_timers() and convert the store in check_process_timers() for consistency. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Simplify set_process_cpu_timer() | Thomas Gleixner | 1 | -16/+8
The expiry cache can now be accessed as an array. Replace the per clock checks with a simple comparison of the clock indexed array member. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Simplify timer queueing | Thomas Gleixner | 1 | -34/+21
Now that the expiry cache can be accessed as an array, the per clock checking can be reduced to just comparing the corresponding array elements. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Provide array based access to expiry cache | Thomas Gleixner | 1 | -1/+11
Using struct task_cputime for the expiry cache is a pretty odd choice and comes with magic defines to rename the fields for usage in the expiry cache. struct task_cputime is basically a u64 array with 3 members, but with distinct member names. The expiry cache content differs from the content of task_cputime because:
  expiry[PROF]  = task_cputime.stime + task_cputime.utime
  expiry[VIRT]  = task_cputime.utime
  expiry[SCHED] = task_cputime.sum_exec_runtime
So there is no direct mapping between task_cputime and the expiry cache, and the #define based remapping is just a horrible hack. Making the expiry cache array based allows further simplification of the expiry code. To avoid an all-in-one cleanup which is hard to review, add a temporary anonymous union to struct task_cputime which allows array based access to it. That requires reordering the members. Add a build time sanity check to validate that the members are at the same place. The union and the build time checks will be removed after the conversion. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
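A standalone illustration of the transitional trick (simplified member names, not the exact kernel struct): an anonymous union overlays an array view on top of the named members, and compile-time assertions verify that the two views line up after the reorder.

    #include <stddef.h>
    #include <stdint.h>

    struct cputime_expiry_sketch {
            union {
                    struct {
                            uint64_t prof_exp;   /* stime + utime based */
                            uint64_t virt_exp;   /* utime based */
                            uint64_t sched_exp;  /* sum_exec_runtime based */
                    };
                    uint64_t expiries[3];        /* array view of the above */
            };
    };

    /* Build time sanity checks: the named members must alias the array
     * slots, otherwise array based code would read the wrong fields. */
    _Static_assert(offsetof(struct cputime_expiry_sketch, prof_exp) ==
                   offsetof(struct cputime_expiry_sketch, expiries[0]),
                   "prof slot mismatch");
    _Static_assert(offsetof(struct cputime_expiry_sketch, virt_exp) ==
                   offsetof(struct cputime_expiry_sketch, expiries[1]),
                   "virt slot mismatch");
    _Static_assert(offsetof(struct cputime_expiry_sketch, sched_exp) ==
                   offsetof(struct cputime_expiry_sketch, expiries[2]),
                   "sched slot mismatch");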
2019-08-28 | posix-cpu-timers: Move expiry cache into struct posix_cputimers | Thomas Gleixner | 3 | -42/+34
The expiry cache belongs into the posix_cputimers container where the other cpu timers information is. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Create a container struct | Thomas Gleixner | 2 | -17/+14
Per task/process data of posix CPU timers is all over the place which makes the code hard to follow and requires ifdeffery. Create a container to hold all this information in one place, so data is consolidated and the ifdeffery can be confined to the posix timer header file and removed from places like fork. As a first step, move the cpu_timers list head array into the new struct and clean up the initializers and simplify fork. The remaining #ifdef in fork will be removed later. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Move prof/virt_ticks into caller | Thomas Gleixner | 1 | -21/+9
The functions have only one caller left. No point in having them. Move the almost duplicated code into the caller and simplify it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Sample task times once in expiry check | Thomas Gleixner | 1 | -4/+6
Sampling the task times twice does not make sense. Do it once. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Get rid of pointer indirection | Thomas Gleixner | 1 | -28/+22
Now that the sample functions have no return value anymore, the result can simply be returned instead of using pointer indirection. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Simplify sample functions | Thomas Gleixner | 1 | -15/+13
All callers hand in a validated clock id. Remove the return value, which was unchecked in most places anyway. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Remove pointless return value check | Thomas Gleixner | 1 | -3/+2
set_process_cpu_timer() checks already whether the clock id is valid. No point in checking the return value of the sample function. That allows to simplify the sample function later. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Use clock ID in posix_cpu_timer_rearm() | Thomas Gleixner | 1 | -2/+3
Extract the clock ID (PROF/VIRT/SCHED) from the clock selector and use it as argument to the sample functions. That allows to simplify them once all callers are fixed. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Use clock ID in posix_cpu_timer_get() | Thomas Gleixner | 1 | -2/+3
Extract the clock ID (PROF/VIRT/SCHED) from the clock selector and use it as argument to the sample functions. That allows to simplify them once all callers are fixed. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Use clock ID in posix_cpu_timer_set() | Thomas Gleixner | 1 | -5/+6
Extract the clock ID (PROF/VIRT/SCHED) from the clock selector and use it as argument to the sample functions. That allows to simplify them once all callers are fixed. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Consolidate thread group sample code | Thomas Gleixner | 1 | -39/+20
cpu_clock_sample_group() and cpu_timer_sample_group() are almost the same. Before the rename one called thread_group_cputimer() and the other thread_group_cputime(). Really intuitive function names. Consolidate the functions and also avoid the thread traversal when the thread group's accounting is already active. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Rename thread_group_cputimer() and make it static | Thomas Gleixner | 1 | -2/+15
thread_group_cputimer() is a complete misnomer. The function does two things:
 - For arming process wide timers it makes sure that the atomic time storage is up to date. If no cpu timer is armed yet, then the atomic time storage is not updated by the scheduler for performance reasons. In that case a full summing up of all threads needs to be done and the update needs to be enabled.
 - It samples the current time into the caller supplied storage.
Rename it to thread_group_start_cputime(), make it static and fix up the callsite. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Sample directly in timer check | Thomas Gleixner | 1 | -3/+4
The thread group accounting is active, otherwise the expiry function would not be running. Sample the thread group time directly. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | itimers: Use quick sample function | Thomas Gleixner | 1 | -1/+1
get_itimer() locks sighand lock and checks whether the timer is already expired. If it is not expired then the thread group cputime accounting is already enabled. Use the sampling function not the one which is meant for starting a timer. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Provide quick sample function for itimer | Thomas Gleixner | 1 | -0/+21
get_itimer() needs a sample of the current thread group cputime. It invokes thread_group_cputimer() - which is a misnomer. That function also eventually starts the group cputime accounting, which is bogus because the accounting is already active when a timer is armed. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Use common permission check in posix_cpu_timer_create() | Thomas Gleixner | 1 | -32/+3
Yet another copy of the same thing gone... Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Use common permission check in posix_cpu_clock_get() | Thomas Gleixner | 1 | -43/+14
Replace the next slightly different copy of the permission checks. That also removes the need to check the return value of the sample functions because the clock id is already validated. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | posix-cpu-timers: Provide task validation functions | Thomas Gleixner | 1 | -21/+44
The code contains three slightly different copies of validating whether a given clock resolves to a valid task and whether the current caller has permissions to access it. Create central functions. Replace check_clock() as a first step and rename it to something sensible. Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | perf: Allow normal events to output AUX data | Alexander Shishkin | 1 | -0/+93
In some cases, ordinary (non-AUX) events can generate data for AUX events. For example, PEBS events can come out as records in the Intel PT stream instead of their usual DS records, if configured to do so. One requirement for such events is to consistently schedule together, to ensure that the data from the "AUX output" events isn't lost while their corresponding AUX event is not scheduled. We use grouping to provide this guarantee: an "AUX output" event can be added to a group where an AUX event is a group leader, and provided that the former supports writing to the latter. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-08-28 | sched/cpufreq: Align trace event behavior of fast switching | Douglas RAILLARD | 1 | -1/+6
Fast switching path only emits an event for the CPU of interest, whereas the regular path emits an event for all the CPUs that had their frequency changed, i.e. all the CPUs sharing the same policy. With the current behavior, looking at cpu_frequency event for a given CPU that is using the fast switching path will not give the correct frequency signal. Signed-off-by: Douglas RAILLARD <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-08-28 | bpf: introduce verifier internal test flag | Alexei Starovoitov | 2 | -1/+5
Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain and state pruning. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-08-27 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | David S. Miller | 12 | -40/+89
Minor conflict in r8169, bug fix had two versions in net and net-next, take the net-next hunks. Signed-off-by: David S. Miller <[email protected]>
2019-08-27 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Linus Torvalds | 2 | -16/+23
Pull networking fixes from David Miller:
 1) Use 32-bit index for tail calls in s390 bpf JIT, from Ilya Leoshkevich.
 2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for SMC from Jason Baron.
 3) ipv6_mc_may_pull() should return 0 for malformed packets, not -EINVAL. From Stefano Brivio.
 4) Don't forget to unpin umem xdp pages in error path of xdp_umem_reg(). From Ivan Khoronzhuk.
 5) Fix sta object leak in mac80211, from Johannes Berg.
 6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2 switches. From Florian Fainelli.
 7) Revert DMA sync removal from r8169 which was causing regressions on some MIPS Loongson platforms. From Heiner Kallweit.
 8) Use after free in flow dissector, from Jakub Sitnicki.
 9) Fix NULL derefs of net devices during ICMP processing across collect_md tunnels, from Hangbin Liu.
10) proto_register() memory leaks, from Zhang Lin.
11) Set NLM_F_MULTI flag in multipart netlink messages consistently, from John Fastabend.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
  r8152: Set memory to all 0xFFs on failed reg reads
  openvswitch: Fix conntrack cache with timeout
  ipv4: mpls: fix mpls_xmit for iptunnel
  nexthop: Fix nexthop_num_path for blackhole nexthops
  net: rds: add service level support in rds-info
  net: route dump netlink NLM_F_MULTI flag missing
  s390/qeth: reject oversized SNMP requests
  sock: fix potential memory leak in proto_register()
  MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
  xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
  ipv4/icmp: fix rt dst dev null pointer dereference
  openvswitch: Fix log message in ovs conntrack
  bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
  bpf: fix use after free in prog symbol exposure
  bpf: fix precision tracking in presence of bpf2bpf calls
  flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
  Revert "r8169: remove not needed call to dma_sync_single_for_device"
  ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
  net/ncsi: Fix the payload copying for the request coming from Netlink
  qed: Add cleanup in qed_slowpath_start()
  ...
2019-08-27 | kallsyms: Don't let kallsyms_lookup_size_offset() fail on retrieving the first symbol | Marc Zyngier | 1 | -2/+4
An arm64 kernel configured with
  CONFIG_KPROBES=y
  CONFIG_KALLSYMS=y
  # CONFIG_KALLSYMS_ALL is not set
  CONFIG_KALLSYMS_BASE_RELATIVE=y
reports the following kprobe failure:
  [ 0.032677] kprobes: failed to populate blacklist: -22
  [ 0.033376] Please take care of using kprobes.
It appears that kprobes fails to retrieve the symbol at address 0xffff000010081000, despite this symbol being in System.map:
  ffff000010081000 T __exception_text_start
This symbol is part of the first group of aliases in the kallsyms_offsets array (symbol names generated using ugly hacks in scripts/kallsyms.c):
  kallsyms_offsets:
    .long 0x1000 // do_undefinstr
    .long 0x1000 // efi_header_end
    .long 0x1000 // _stext
    .long 0x1000 // __exception_text_start
    .long 0x12b0 // do_cp15instr
Looking at the implementation of get_symbol_pos(), it returns the lowest index for aliasing symbols. In this case, it returns 0. But kallsyms_lookup_size_offset() considers 0 as a failure, which is obviously wrong (there is definitely a valid symbol living there). In turn, the kprobe blacklisting stops abruptly, hence the original error. A CONFIG_KALLSYMS_ALL kernel wouldn't fail as there are always some random symbols at the beginning of this array, which are never looked up via kallsyms_lookup_size_offset(). Fix it by considering that get_symbol_pos() is always successful (which is consistent with the other uses of this function). Fixes: ffc5089196446 ("[PATCH] Create kallsyms_lookup_size_offset()") Reviewed-by: Masami Hiramatsu <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-08-27 | genirq/affinity: Spread vectors on node according to nr_cpu ratio | Ming Lei | 1 | -39/+200
Now __irq_build_affinity_masks() spreads vectors evenly per node, but when the NUMA nodes have different numbers of CPUs not all vectors may get spread, which triggers the warning in the spreading code. Improve the spreading algorithm by
 - assigning vectors according to the ratio of the number of CPUs on a node to the number of remaining CPUs, and
 - running the assignment from smaller nodes to bigger nodes to guarantee that every active node gets allocated at least one vector.
This ensures that all vectors are spread out. Aside from that, the spread becomes more fair if the nodes have different numbers of CPUs. For example, on the following machine:
  CPU(s): 16
  On-line CPU(s) list: 0-15
  Thread(s) per core: 1
  Core(s) per socket: 8
  Socket(s): 2
  NUMA node(s): 2
  ...
  NUMA node0 CPU(s): 0,1,3,5-9,11,13-15
  NUMA node1 CPU(s): 2,4,10,12
When a driver requests to allocate 8 vectors, the following spread results:
  irq 31, cpu list 2,4
  irq 32, cpu list 10,12
  irq 33, cpu list 0-1
  irq 34, cpu list 3,5
  irq 35, cpu list 6-7
  irq 36, cpu list 8-9
  irq 37, cpu list 11,13
  irq 38, cpu list 14-15
So Node 0 now has 6 vectors assigned and Node 1 has 2. The original algorithm assigned 4 vectors to each node, which was unfair versus Node 0. [ tglx: Massaged changelog ] Reported-by: Jon Derrick <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Jon Derrick <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
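A rough standalone model of the per-node allotment (purely illustrative arithmetic, not the kernel function; the real code also tracks and redistributes rounding leftovers): each node, visited from smallest to largest, gets vectors in proportion to its share of the still-unassigned CPUs, and never fewer than one.

    #include <stdio.h>

    /* Hypothetical helper: vectors for one node, rounded up and clamped
     * to at least one. */
    static unsigned int vecs_for_node(unsigned int remaining_vecs,
                                      unsigned int node_cpus,
                                      unsigned int remaining_cpus)
    {
            unsigned int v = (remaining_vecs * node_cpus +
                              remaining_cpus - 1) / remaining_cpus;

            return v ? v : 1;
    }

    int main(void)
    {
            /* The changelog example: 8 vectors, node1 has 4 CPUs, node0
             * has 12; the smaller node is handled first. */
            unsigned int node1 = vecs_for_node(8, 4, 16);           /* 2 */
            unsigned int node0 = vecs_for_node(8 - node1, 12, 12);  /* 6 */

            printf("node0=%u node1=%u\n", node0, node1);
            return 0;
    }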
2019-08-27 | genirq/affinity: Improve __irq_build_affinity_masks() | Ming Lei | 1 | -8/+18
One invariant of __irq_build_affinity_masks() is that all CPUs in the specified masks (cpu_mask AND node_to_cpumask for each node) should be covered during the spread. Even when all requested vectors have been allocated, it is still required to spread vectors among the remaining CPUs. A similar policy is already applied in the 'numvecs <= nodes' case. So remove the following check inside the loop: if (done >= numvecs) break; Meanwhile, assign at least one vector to each remaining node if 'numvecs' vectors have been handled already. Also, if the specified cpumask for a NUMA node is empty, simply do not spread vectors on that node. Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-26 | bpf: handle 32-bit zext during constant blinding | Naveen N. Rao | 1 | -2/+6
Since BPF constant blinding is performed after the verifier pass, the ALU32 instructions inserted for doubleword immediate loads don't have a corresponding zext instruction. This is causing a kernel oops on powerpc and can be reproduced by running 'test_cgroup_storage' with bpf_jit_harden=2. Fix this by emitting BPF_ZEXT during constant blinding if prog->aux->verifier_zext is set. Fixes: a4b1d3c1ddf6cb ("bpf: verifier: insert zero extension according to analysis result") Reported-by: Michael Ellerman <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Reviewed-by: Jiong Wang <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-08-25 | Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 2 | -9/+18
Pull timekeeping fix from Thomas Gleixner: "A single fix for a regression caused by the generic VDSO implementation where a math overflow causes CLOCK_BOOTTIME to become a random number generator"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
2019-08-25 | Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -1/+4
Pull scheduler fix from Thomas Gleixner: "Handle the worker management in situations where a task is scheduled out on a PI lock contention correctly and schedule a new worker if possible"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Schedule new worker even if PI-blocked
2019-08-25 | Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -4/+4
Pull perf fixes from Thomas Gleixner: "Two small fixes for kprobes and perf:
  - Prevent a deadlock in kprobe_optimizer() caused by reverse lock ordering
  - Fix a comment typo"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Fix potential deadlock in kprobe_optimizer()
  perf/x86: Fix typo in comment
2019-08-25 | Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -1/+14
Pull irq fix from Thomas Gleixner: "A single fix for an imbalanced kobject operation in the irq descriptor code which was unearthed by the new warnings in the kobject code"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Properly pair kobject_del() with kobject_add()
2019-08-25 | Merge branch 'akpm' (patches from Andrew) | Linus Torvalds | 1 | -0/+8
Merge misc fixes from Andrew Morton: "11 fixes". Mostly VM fixes, one psi polling fix, and one parisc build fix.
* emailed patches from Andrew Morton <[email protected]>:
  mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y
  mm/zsmalloc.c: fix race condition in zs_destroy_pool
  mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
  mm, page_owner: handle THP splits correctly
  userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
  psi: get poll_work to run when calling poll syscall next time
  mm: memcontrol: flush percpu vmevents before releasing memcg
  mm: memcontrol: flush percpu vmstats before releasing memcg
  parisc: fix compilation errrors
  mm, page_alloc: move_freepages should not examine struct page of reserved memory
  mm/z3fold.c: fix race between migration and destruction