aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorBen Segall <[email protected]>2015-04-06 15:28:10 -0700
committerIngo Molnar <[email protected]>2015-06-07 15:57:44 +0200
commit54d27365cae88fbcc853b391dcd561e71acb81fa (patch)
tree741039e6e1d7eafc331d5ec61fb8177ba79e14a7
parent9a92e3dc6ad02208a014d0d8404ebbd697e3d5ef (diff)
sched/fair: Prevent throttling in early pick_next_task_fair()
The optimized task selection logic optimistically selects a new task to run without first doing a full put_prev_task(). This is so that we can avoid a put/set on the common ancestors of the old and new task. Similarly, we should only call check_cfs_rq_runtime() to throttle eligible groups if they're part of the common ancestry, otherwise it is possible to end up with no eligible task in the simple task selection. Imagine: /root /prev /next /A /B If our optimistic selection ends up throttling /next, we goto simple and our put_prev_task() ends up throttling /prev, after which we're going to bug out in set_next_entity() because there aren't any tasks left. Avoid this scenario by only throttling common ancestors. Reported-by: Mohammed Naser <[email protected]> Reported-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: Ben Segall <[email protected]> [ munged Changelog ] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Fixes: 678d5718d8d0 ("sched/fair: Optimize cgroup pick_next_task_fair()") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
-rw-r--r--kernel/sched/fair.c25
1 files changed, 14 insertions, 11 deletions
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d4632f7799b..84ada054c6a8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5322,18 +5322,21 @@ again:
* entity, update_curr() will update its vruntime, otherwise
* forget we've ever seen it.
*/
- if (curr && curr->on_rq)
- update_curr(cfs_rq);
- else
- curr = NULL;
+ if (curr) {
+ if (curr->on_rq)
+ update_curr(cfs_rq);
+ else
+ curr = NULL;
- /*
- * This call to check_cfs_rq_runtime() will do the throttle and
- * dequeue its entity in the parent(s). Therefore the 'simple'
- * nr_running test will indeed be correct.
- */
- if (unlikely(check_cfs_rq_runtime(cfs_rq)))
- goto simple;
+ /*
+ * This call to check_cfs_rq_runtime() will do the
+ * throttle and dequeue its entity in the parent(s).
+ * Therefore the 'simple' nr_running test will indeed
+ * be correct.
+ */
+ if (unlikely(check_cfs_rq_runtime(cfs_rq)))
+ goto simple;
+ }
se = pick_next_entity(cfs_rq, curr);
cfs_rq = group_cfs_rq(se);