path: root/kernel/sched.c
Age | Commit message | Author | Files | Lines
2008-05-29 | revert ("sched: fair-group: SMP-nice for group scheduling") | Ingo Molnar | 1 | -399/+31
Yanmin Zhang reported: Comparing with 2.6.25, volanoMark has a big regression with kernel 2.6.26-rc1. It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito. With bisect, I located the following patch:

| 18d95a2832c1392a2d63227a7a6d433cb9f2037e is first bad commit
| commit 18d95a2832c1392a2d63227a7a6d433cb9f2037e
| Author: Peter Zijlstra <[email protected]>
| Date: Sat Apr 19 19:45:00 2008 +0200
|
| sched: fair-group: SMP-nice for group scheduling

Revert it so that we get v2.6.25 behavior. Bisected-by: Yanmin Zhang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-29 | sched: cleanup | Ingo Molnar | 1 | -2/+2
Signed-off-by: Ingo Molnar <[email protected]>
2008-05-29 | sched: unite unlikely pairs in rt_policy() and schedule_debug() | Roel Kluin | 1 | -2/+2
Removes obfuscation and may improve assembly. Signed-off-by: Roel Kluin <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
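To give a feel for the kind of change described here, a small stand-alone sketch of the "unite unlikely pairs" pattern (this is an illustrative reconstruction of the shape of the change, not the literal diff to rt_policy() and schedule_debug()):

    #include <sched.h>   /* SCHED_FIFO, SCHED_RR, SCHED_OTHER */
    #include <stdio.h>

    #define unlikely(x) __builtin_expect(!!(x), 0)

    /* before: two separate branch hints on the same condition */
    static int rt_policy_before(int policy)
    {
        if (unlikely(policy == SCHED_FIFO) || unlikely(policy == SCHED_RR))
            return 1;
        return 0;
    }

    /* after: one hint covering the united condition */
    static int rt_policy_after(int policy)
    {
        if (unlikely(policy == SCHED_FIFO || policy == SCHED_RR))
            return 1;
        return 0;
    }

    int main(void)
    {
        printf("%d %d\n", rt_policy_before(SCHED_FIFO), rt_policy_after(SCHED_OTHER));
        return 0;
    }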
2008-05-29 | revert ("sched: fair: weight calculations") | Ingo Molnar | 1 | -3/+6
Yanmin Zhang reported: Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has many regressions with 2.6.26-rc1: 1) 8-core stoakley: 28%; 2) 16-core tigerton: 20%; 3) Itanium Montvale: 50%. Bisect located this patch:

| 8f1bc385cfbab474db6c27b5af1e439614f3025c is first bad commit
| commit 8f1bc385cfbab474db6c27b5af1e439614f3025c
| Author: Peter Zijlstra <[email protected]>
| Date: Sat Apr 19 19:45:00 2008 +0200
|
| sched: fair: weight calculations

Revert it to the 2.6.25 state. Bisected-by: Yanmin Zhang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-14 | cgroups: fix compile warning | Mirco Tischler | 1 | -1/+1
Return type of cpu_rt_runtime_write() should be int instead of ssize_t. Signed-off-by: Mirco Tischler <[email protected]> Acked-by: Paul Menage <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-05-11 | Add new 'cond_resched_bkl()' helper function | Linus Torvalds | 1 | -2/+0
It acts exactly like a regular 'cond_resched()', but will not get optimized away when CONFIG_PREEMPT is set. Normal kernel code is already preemptible in the presence of CONFIG_PREEMPT, so cond_resched() is optimized away (see commit 02b67cc3ba36bdba351d6c3a00593f4ec550d9d3 "sched: do not do cond_resched() when CONFIG_PREEMPT"). But when wanting to conditionally reschedule while holding a lock, you need to use "cond_resched_lock(lock)", and the new function is the BKL equivalent of that. Also make fs/locks.c use it. Signed-off-by: Linus Torvalds <[email protected]>
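A stand-alone mock of the behaviour described above (user-space C purely for illustration; _cond_resched() is a stub here, and the real in-tree definitions may differ):

    #include <stdio.h>

    /* stub standing in for the kernel's real reschedule check */
    static int _cond_resched(void)
    {
        printf("explicit reschedule point\n");
        return 1;
    }

    /* With CONFIG_PREEMPT the plain cond_resched() can compile away, since
     * normal kernel code is already preemptible. */
    #ifdef CONFIG_PREEMPT
    #define cond_resched()      (0)
    #else
    #define cond_resched()      _cond_resched()
    #endif

    /* Code holding the BKL is not preempted implicitly, so the BKL variant
     * keeps the explicit check regardless of CONFIG_PREEMPT. */
    #define cond_resched_bkl()  _cond_resched()

    int main(void)
    {
        cond_resched();       /* a no-op when built with -DCONFIG_PREEMPT */
        cond_resched_bkl();   /* always performs the check */
        return 0;
    }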
2008-05-10 | BKL: revert back to the old spinlock implementation | Linus Torvalds | 1 | -23/+4
The generic semaphore rewrite had a huge performance regression on AIM7 (and potentially other BKL-heavy benchmarks) because the generic semaphores had been rewritten to be simple to understand and fair. The latter, in particular, turns a semaphore-based BKL implementation into a mess of scheduling. The attempt to fix the performance regression failed miserably (see the previous commit 00b41ec2611dc98f87f30753ee00a53db648d662 'Revert "semaphore: fix"'), and so for now the simple and sane approach is to instead just go back to the old spinlock-based BKL implementation that never had any issues like this. This patch also has the advantage of being reported to fix the regression completely according to Yanmin Zhang, unlike the semaphore hack which still left a couple percentage point regression. As a spinlock, the BKL obviously has the potential to be a latency issue, but it's not really any different from any other spinlock in that respect. We do want to get rid of the BKL asap, but that has been the plan for several years. These days, the biggest users are in the tty layer (open/release in particular) and Alan holds out some hope: "tty release is probably a few months away from getting cured - I'm afraid it will almost certainly be the very last user of the BKL in tty to get fixed as it depends on everything else being sanely locked." so while we're not there yet, we do have a plan of action. Tested-by: Yanmin Zhang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-05-05 | sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK | Peter Zijlstra | 1 | -152/+13
This replaces the rq->clock stuff (and possibly cpu_clock()).

- Architectures that have an 'imperfect' hardware clock can set CONFIG_HAVE_UNSTABLE_SCHED_CLOCK.
- The 'jiffie' window might be superfluous when we update tick_gtod before the __update_sched_clock() call in sched_clock_tick().
- cpu_clock() might be implemented as: sched_clock_cpu(smp_processor_id()), if the accuracy proves good enough.
- How far can the TSC drift in a single jiffie when considering the filtering and idle hooks?

[ [email protected]: various fixes and cleanups ] Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: fix cpu clock | Ingo Molnar | 1 | -9/+15
David Miller pointed out that nothing in cpu_clock() sets prev_cpu_time. This caused __sync_cpu_clock() to be called all the time - against the intention of this code. The result was that in practice we hit a global spinlock every time cpu_clock() is called - which, even though cpu_clock() is used for tracing and debugging, is suboptimal. While at it, also:

- move the irq disabling to the outermost layer; this should make cpu_clock() warp-free when called with irqs enabled.
- use long long instead of cycles_t - for platforms where cycles_t is 32-bit.

Reported-by: David Miller <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: fair-group: fix a Div0 error of the fair group scheduler | Miao Xie | 1 | -6/+11
When I echoed 0 into the "cpu.shares" file, a Div0 error occurred. We found it is caused by the following call chain:

  sched_group_set_shares(tg, shares)
    set_se_shares(tg->se[i], shares/nr_cpu_ids)
      __set_se_shares(se, shares)
        div64_64((1ULL<<32), shares)

When the echoed value was less than the number of processors, the result of "shares/nr_cpu_ids" was 0, and the system then called div64_64() with that zero value as the divisor, so the Div0 error occurred. It is unnecessary to divide the shares value by nr_cpu_ids, I think, because in __update_group_shares_cpu() and init_tg_cfs_entry() the shares value isn't divided by nr_cpu_ids when setting the shares of the sched entity. This patch fixes this bug. Echoing ULONG_MAX into cpu.shares also causes a Div0 error, so we add a MAX_SHARES macro to limit the maximum value of shares. Signed-off-by: Miao Xie <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
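A tiny user-space illustration of the truncation described above (nr_cpu_ids = 4 and the echoed share value are made-up example numbers):

    #include <stdio.h>

    int main(void)
    {
        unsigned long nr_cpu_ids = 4;    /* example CPU count */
        unsigned long shares = 3;        /* echoed value smaller than nr_cpu_ids */

        unsigned long per_cpu = shares / nr_cpu_ids;   /* integer division -> 0 */
        printf("per-cpu shares = %lu\n", per_cpu);

        /* The kernel then effectively computed (1ULL << 32) / per_cpu; with a
         * zero divisor that is the reported Div0 error. */
        if (per_cpu == 0) {
            printf("would divide (1ULL << 32) by zero\n");
            return 1;
        }
        printf("inv_weight = %llu\n", (1ULL << 32) / per_cpu);
        return 0;
    }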
2008-05-05 | sched: fix missing locking in sched_domains code | Heiko Carstens | 1 | -17/+12
Concurrent calls to detach_destroy_domains and arch_init_sched_domains were prevented by the old scheduler subsystem cpu hotplug mutex. When this got converted to get_online_cpus() the locking got broken. Unlike before now several processes can concurrently enter the critical sections that were protected by the old lock. So use the already present doms_cur_mutex to protect these sections again. Cc: Gautham R Shenoy <[email protected]> Cc: Paul Jackson <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: make clock sync tunable by architecture code | Ingo Molnar | 1 | -1/+1
Make time_sync_thresh tunable by architecture code. Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: fix sched_info_switch not being called according to documentation | David Simner | 1 | -2/+2
http://bugzilla.kernel.org/show_bug.cgi?id=10545 sched_stats.h says that __sched_info_switch is "called when prev != next" in the comment. sched.c should therefore do that. Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: fix hrtick_start_fair and CPU-Hotplug | Peter Zijlstra | 1 | -1/+65
Gautham R Shenoy reported:

> While running the usual CPU-Hotplug stress tests on linux-2.6.25,
> I noticed the following in the console logs.
>
> This is a wee bit difficult to reproduce. In the past 10 runs I hit this
> only once.
>
> ------------[ cut here ]------------
>
> WARNING: at kernel/sched.c:962 hrtick+0x2e/0x65()
>
> Just wondering if we are doing a good job at handling the cancellation
> of any per-cpu scheduler timers during CPU-Hotplug.

It looks like the timer is indeed not cancelled at all and is simply migrated to another cpu. Fix it via a proper hotplug notifier mechanism. Reported-by: Gautham R Shenoy <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: add statics, don't return void expressions | Harvey Harrison | 1 | -4/+6
Noticed by sparse:

  kernel/sched.c:760:20: warning: symbol 'sched_feat_names' was not declared. Should it be static?
  kernel/sched.c:767:5: warning: symbol 'sched_feat_open' was not declared. Should it be static?
  kernel/sched_fair.c:845:3: warning: returning void-valued expression
  kernel/sched.c:4386:3: warning: returning void-valued expression

Signed-off-by: Harvey Harrison <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: add debug checks to idle functions | Andrew Morton | 1 | -0/+2
Cc: Venkatesh Pallipadi <[email protected]> Cc: "Justin Mattock" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-05 | sched: optimize calc_delta_mine() | Peter Zijlstra | 1 | -4/+4
Joel noticed that the !lw->inv_weight condition isn't unlikely anymore, so remove the unlikely annotation. Also, remove the two div64_u64() inv_weight calculations, which makes them rely on the calc_delta_mine() path as well. Signed-off-by: Peter Zijlstra <[email protected]> CC: Joel Schopp <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-05-01 | rename div64_64 to div64_u64 | Roman Zippel | 1 | -3/+3
Rename div64_64 to div64_u64 to make it consistent with the other divide functions, so it clearly includes the type of the divide. Move its definition to math64.h as currently no architecture overrides the generic implementation. They can still override it of course, but the duplicated declarations are avoided. Signed-off-by: Roman Zippel <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Russell King <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: David Howells <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Patrick McHardy <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29 | CGroups _s64 files: use read_s64/write_s64 in CFS cgroup for rt_runtime file | Paul Menage | 1 | -40/+6
This removes some filesystem boilerplate from the CFS cgroup subsystem. Signed-off-by: Paul Menage <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29 | CGroup API files: rename read/write_uint methods to read_write_u64 | Paul Menage | 1 | -8/+8
Several people have justifiably complained that the "_uint" suffix is inappropriate for functions that handle u64 values, so this patch just renames all these functions and their users to have the suffix _u64. [[email protected]: build fix] Signed-off-by: Paul Menage <[email protected]> Cc: "Li Zefan" <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Paul Jackson <[email protected]> Cc: Pavel Emelyanov <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: "YAMAMOTO Takashi" <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-25 | Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 | Linus Torvalds | 1 | -5/+0
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [PATCH] Build fix for CONFIG_NUMA=y && CONFIG_SMP=n
  [IA64] fix bootmem regression on Altix
2008-04-25 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes | Linus Torvalds | 1 | -45/+2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes:
  sched: fix share (re)distribution
  softlockup: fix NOHZ wakeup
  seqlock: livelock fix
2008-04-25 | sched: use alloc_bootmem() instead of alloc_bootmem_low() | David Miller | 1 | -1/+1
There is no guarantee that there is physical ram below 4GB, and in fact many boxes don't have exactly that. Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-25 | sched: fix share (re)distribution | Peter Zijlstra | 1 | -45/+2
Fix the __aggregate_redistribute_shares() related lockup reported by David S. Miller. The problem this code tries to solve is 'accurately' calculating the 'fair' share of the group weight for each cpu. The current code falls back to a global group rebalance in case the sched_domain's span it looks at has no shares, but does have tasks. The reason it gets stuck here is that it is inherently racy - if someone steals the last task after we compute agg->rq_weight, but before we rebalance, we'll never get out of the loop. We could of course go fix that, but while looking at this issue I found that this 'fallback' wasn't nearly as rare as I'd hoped it to be. In fact it's quite common - and given that it walks the whole machine, that's very bad. The new approach is simple (why didn't I think of it before?): we set the aggregate shares to the full task group weight, and each larger sched domain that encounters an aggregate shares value larger than the weight clips it (it already re-distributes anyway). This nicely converges to the desired global picture where the sum of all shares equals the task group weight. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-24 | [PATCH] Build fix for CONFIG_NUMA=y && CONFIG_SMP=n | Mike Travis | 1 | -5/+0
Regression caused by 434d53b00d6bb7be0a1d3dcc0d0d5df6c042e164 Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Tony Luck <[email protected]>
2008-04-24 | [IA64] fix bootmem regression on Altix | Russ Anderson | 1 | -1/+1
A recent change prevents SGI Altix from booting. This patch fixes the problem. The regression was introduced in commit 434d53b00d6bb7be0a1d3dcc0d0d5df6c042e164 Signed-off-by: Russ Anderson <[email protected]> Signed-off-by: Tony Luck <[email protected]>
2008-04-22 | kernel-doc: fix sched.c missing parameter | Randy Dunlap | 1 | -0/+1
Add missing kernel-doc in kernel/sched.c: Warning(linux-2.6.25-git3//kernel/sched.c:7044): No description found for parameter 'span' Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-19 | sched: features fix | Ingo Molnar | 1 | -13/+2
Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: /debug/sched_features | Peter Zijlstra | 1 | -22/+140
Provide a text-based interface to the scheduler features; this saves the 'user' from having to set bits using decimal arithmetic. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
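To give a feel for such a text interface, here is a stand-alone sketch of name-based feature toggling (the feature names and parsing details are illustrative assumptions, not taken from the patch):

    #include <stdio.h>
    #include <string.h>

    /* illustrative feature names; not necessarily the scheduler's actual set */
    static const char *feat_names[] = { "NEW_FAIR_SLEEPERS", "START_DEBIT", "HRTICK" };
    static unsigned int feat_bits;

    /* "NAME" sets a feature bit, "NO_NAME" clears it -- no decimal arithmetic */
    static void sched_feat_set(const char *buf)
    {
        size_t i;
        int neg = 0;

        if (strncmp(buf, "NO_", 3) == 0) {
            neg = 1;
            buf += 3;
        }
        for (i = 0; i < sizeof(feat_names) / sizeof(feat_names[0]); i++) {
            if (strcmp(buf, feat_names[i]) == 0) {
                if (neg)
                    feat_bits &= ~(1u << i);
                else
                    feat_bits |= 1u << i;
                return;
            }
        }
    }

    int main(void)
    {
        sched_feat_set("HRTICK");           /* set a feature by name */
        sched_feat_set("NO_START_DEBIT");   /* clear one by name */
        printf("feat_bits = 0x%x\n", feat_bits);
        return 0;
    }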
2008-04-19 | sched: add SCHED_FEAT_DEADLINE | Ingo Molnar | 1 | -1/+3
unused at the moment. Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: fair: weight calculations | Peter Zijlstra | 1 | -6/+3
In order to level the hierarchy, we need to calculate load based on the root view; that is, each task's load is expressed in the same unit. Take the hierarchy:

        A
       / \
      B   1
     / \
    2   3

To compute 1's load we do:

      weight(1)
    --------------
     rq_weight(A)

To compute 2's load we do:

      weight(2)        weight(B)
    -------------- * --------------
     rq_weight(B)     rq_weight(A)

This yields load fractions in comparable units. The consequence is that it changes virtual time. We used to have:

                  time_{i}
    vtime_{i} = ------------
                 weight_{i}

    vtime = \Sum vtime_{i} = time / rq_weight

But with the new way of load calculation we get that vtime equals time. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
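A small stand-alone computation of the root-view load fractions for the example hierarchy above (the weight values of 1024 are made up for the example):

    #include <stdio.h>

    int main(void)
    {
        /* hierarchy from the commit message: A has children B and task 1,
         * B has tasks 2 and 3; all example weights are 1024 */
        double weight_1 = 1024, weight_B = 1024;
        double weight_2 = 1024, weight_3 = 1024;

        double rq_weight_A = weight_1 + weight_B;   /* 2048 */
        double rq_weight_B = weight_2 + weight_3;   /* 2048 */

        double load_1 = weight_1 / rq_weight_A;
        double load_2 = (weight_2 / rq_weight_B) * (weight_B / rq_weight_A);
        double load_3 = (weight_3 / rq_weight_B) * (weight_B / rq_weight_A);

        /* prints 0.50 0.25 0.25, summing to 1.00: the fractions are in the
         * same (root) unit across the whole hierarchy */
        printf("%.2f %.2f %.2f sum=%.2f\n",
               load_1, load_2, load_3, load_1 + load_2 + load_3);
        return 0;
    }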
2008-04-19 | sched: fair-group: de-couple load-balancing from the rb-trees | Peter Zijlstra | 1 | -2/+8
De-couple load-balancing from the rb-trees, so that I can change their organization. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: fair-group: SMP-nice for group scheduling | Peter Zijlstra | 1 | -34/+463
Implement SMP nice support for the full group hierarchy. On each load-balance action, compile a sched_domain-wide view of the full task_group tree. We compute the domain-wide view when walking down the hierarchy, and readjust the weights when walking back up. After collecting and readjusting the domain-wide view, we try to balance the tasks within the task_groups. The current approach is to naively balance each task group until we've moved the targeted amount of load. Inspired by Srivatsa Vaddagiri's previous code and Abhishek Chandra's H-SMP paper. XXX: there will be some numerical issues due to the limited nature of SCHED_LOAD_SCALE with respect to representing a task_group's influence on the total weight. When the tree is deep enough, or the task weight small enough, we'll run out of bits. Signed-off-by: Peter Zijlstra <[email protected]> CC: Abhishek Chandra <[email protected]> CC: Srivatsa Vaddagiri <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched, cpuset: customize sched domains, core | Hidetoshi Seto | 1 | -5/+73
[rebased for sched-devel/latest]

- Add a new cpuset file, sched_relax_domain_level, which takes a level value.
- Modify partition_sched_domains() and build_sched_domains() to take an attributes parameter passed from cpuset.
- Fill in newidle_idx for node domains; it is currently unused but might be required if sched_relax_domain_level becomes higher.
- The default level can be changed with the boot option 'relax_domain_level='.

Signed-off-by: Hidetoshi Seto <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: rt: multi level group constraints | Peter Zijlstra | 1 | -0/+33
multi level rt constraints Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: task_group hierarchy | Peter Zijlstra | 1 | -0/+20
Add the full parent<->child relation thing into task_groups as well. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: fix the task_group hierarchy for UID grouping | Peter Zijlstra | 1 | -2/+41
UID grouping doesn't actually have a task_group representing the root of the task_group tree. Add one. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: allow the group scheduler to have multiple levels | Dhaval Giani | 1 | -32/+53
This patch makes the group scheduler multi hierarchy aware. [[email protected]: rt-parts and assorted fixes] Signed-off-by: Dhaval Giani <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: mix tasks and groups | Dhaval Giani | 1 | -2/+49
This patch allows tasks and groups to exist in the same cfs_rq. With this change, CFS group scheduling moves from a 1/(1+N) fairness model to a 1/(M+N) model, where M tasks and N groups exist at the cfs_rq level. [[email protected]: rt bits and assorted fixes] Signed-off-by: Dhaval Giani <[email protected]> Signed-off-by: Srivatsa Vaddagiri <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: fix checks | Ingo Molnar | 1 | -4/+6
Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: old sleeper bonus | Peter Zijlstra | 1 | -1/+3
Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: add new set_cpus_allowed_ptr function | Mike Travis | 1 | -8/+8
Add a new function that accepts a pointer to the "newly allowed cpus" cpumask argument:

  int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)

The current set_cpus_allowed() function is modified to use the above, but this does not result in an ABI change. And with some compiler optimization help, it may not introduce any additional overhead. Additionally, to enforce the read-only nature of the new_mask arg, the "const" property is migrated to the sub-functions called by set_cpus_allowed(). This silences compiler warnings. Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | init: move setup of nr_cpu_ids to as early as possible | Mike Travis | 1 | -7/+0
Move the setting of nr_cpu_ids from sched_init() to start_kernel() so that it's available as early as possible. Note that an arch has the option of setting it even earlier if need be, but it should not result in a different value than the setup_nr_cpu_ids() function. Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | sched: remove another cpumask_t variable from stack | Mike Travis | 1 | -9/+6
* Remove another cpumask_t variable from stack that was missed in the last kernel_sched_c updates. Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | cpumask: reduce stack usage in SD_x_INIT initializers | Mike Travis | 1 | -120/+248
* Remove empty cpumask_t (and all non-zero/non-null) variables in SD_*_INIT macros. Use memset(0) to clear. Also, don't inline the initializer functions to save on stack space in build_sched_domains().

* Merge change to include/linux/topology.h that uses the new node_to_cpumask_ptr function in the nr_cpus_node macro into this patch.

Depends on:
  [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
  [sched-devel]: sched: add new set_cpus_allowed_ptr function

Cc: H. Peter Anvin <[email protected]> Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | nodemask: use new node_to_cpumask_ptr function | Mike Travis | 1 | -15/+14
* Use new node_to_cpumask_ptr. This creates a pointer to the cpumask for a given node. This definition is in mm patch: asm-generic-add-node_to_cpumask_ptr-macro.patch

* Use new set_cpus_allowed_ptr function.

Depends on:
  [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
  [sched-devel]: sched: add new set_cpus_allowed_ptr function
  [x86/latest]: x86: add cpus_scnprintf function

Cc: Greg Kroah-Hartman <[email protected]> Cc: Greg Banks <[email protected]> Cc: H. Peter Anvin <[email protected]> Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | generic: reduce stack pressure in sched_affinity | Mike Travis | 1 | -2/+3
* Modify sched_affinity functions to pass cpumask_t variables by reference instead of by value.

* Use new set_cpus_allowed_ptr function.

Depends on:
  [sched-devel]: sched: add new set_cpus_allowed_ptr function

Cc: Paul Jackson <[email protected]> Cc: Cliff Wickman <[email protected]> Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-04-19 | cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer | Mike Travis | 1 | -3/+5
* Modify cpuset_cpus_allowed to return the currently allowed cpuset via a pointer argument instead of as the function return value.

* Use new set_cpus_allowed_ptr function.

* Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses.

Depends on:
  [sched-devel]: sched: add new set_cpus_allowed_ptr function

Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
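The stack-pressure motivation behind these by-value to by-pointer conversions, shown with a stand-alone mock (NR_CPUS=4096 and the cpumask stand-in type are assumptions for the example, not kernel code):

    #include <stdio.h>
    #include <string.h>

    #define NR_CPUS 4096

    /* stand-in for cpumask_t: one bit per possible CPU (512 bytes here) */
    typedef struct {
        unsigned long bits[NR_CPUS / (8 * sizeof(unsigned long))];
    } cpumask_t;

    /* by-value style: the whole mask is copied for the return value */
    static cpumask_t cpus_allowed_by_value(void)
    {
        cpumask_t m;
        memset(&m, 0xff, sizeof(m));
        return m;
    }

    /* by-pointer style: only an 8-byte pointer crosses the call */
    static void cpus_allowed_by_ptr(cpumask_t *m)
    {
        memset(m, 0xff, sizeof(*m));
    }

    int main(void)
    {
        cpumask_t a = cpus_allowed_by_value();
        cpumask_t b;

        cpus_allowed_by_ptr(&b);
        printf("mask: %zu bytes, pointer: %zu bytes\n",
               sizeof(cpumask_t), sizeof(cpumask_t *));
        return memcmp(&a, &b, sizeof(a)) != 0;
    }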
2008-04-19 | sched: remove fixed NR_CPUS sized arrays in kernel_sched_c | Mike Travis | 1 | -28/+52
* Change fixed size arrays to per_cpu variables or dynamically allocated arrays in sched_init() and sched_init_smp():

  (1) static struct sched_entity *init_sched_entity_p[NR_CPUS];
  (1) static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
  (1) static struct sched_rt_entity *init_sched_rt_entity_p[NR_CPUS];
  (1) static struct rt_rq *init_rt_rq_p[NR_CPUS];
      static struct sched_group **sched_group_nodes_bycpu[NR_CPUS];

  (1) - these arrays are allocated via alloc_bootmem_low()

* Change sched_domain_debug_one() to use cpulist_scnprintf instead of cpumask_scnprintf. This reduces the output buffer required and improves readability when large NR_CPU count machines arrive.

* In sched_create_group() we allocate new arrays based on nr_cpu_ids.

Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
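A stand-alone illustration of the NR_CPUS-to-nr_cpu_ids sizing change (calloc stands in for the kernel's boot-time allocator; the array name is reused from the list above, and the numbers are examples):

    #include <stdio.h>
    #include <stdlib.h>

    #define NR_CPUS 4096                 /* compile-time maximum */

    struct cfs_rq { long load; };        /* stand-in */

    int main(void)
    {
        int nr_cpu_ids = 8;              /* CPUs actually possible this boot */

        /* before: sized for the compile-time maximum, even on small machines */
        static struct cfs_rq *init_cfs_rq_p[NR_CPUS];

        /* after: allocate exactly nr_cpu_ids entries at boot time */
        struct cfs_rq **cfs_rq_p = calloc(nr_cpu_ids, sizeof(*cfs_rq_p));
        if (!cfs_rq_p)
            return 1;

        printf("static: %zu bytes, dynamic: %zu bytes\n",
               sizeof(init_cfs_rq_p), nr_cpu_ids * sizeof(*cfs_rq_p));
        free(cfs_rq_p);
        return 0;
    }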
2008-04-19 | sched: allow cpuacct stats to be reset | Dhaval Giani | 1 | -0/+24
Currently the schedstats implementation does not allow the statistics to be reset. This patch aims to allow that. echo 0 > cpuacct.usage resets the usage. Any other value is not allowed and returns -EINVAL. Signed-off-by: Dhaval Giani <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
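A stand-alone mock of the write-side semantics described above (the per-cpu usage array and the handler name are illustrative; the in-kernel handler operates on cgroup state):

    #include <stdio.h>
    #include <errno.h>

    #define NR_CPUS 4
    static unsigned long long cpuusage[NR_CPUS] = { 100, 200, 300, 400 };

    /* only a write of 0 is accepted; anything else returns -EINVAL */
    static int cpuusage_write(unsigned long long val)
    {
        int i;

        if (val)
            return -EINVAL;
        for (i = 0; i < NR_CPUS; i++)
            cpuusage[i] = 0;
        return 0;
    }

    int main(void)
    {
        int ret;

        printf("write 42 -> %d\n", cpuusage_write(42));   /* -EINVAL */
        ret = cpuusage_write(0);
        printf("write 0  -> %d, usage[0] = %llu\n", ret, cpuusage[0]);
        return 0;
    }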