diff options
| author | Andy Lutomirski <[email protected]> | 2016-09-15 22:45:49 -0700 | 
|---|---|---|
| committer | Ingo Molnar <[email protected]> | 2016-09-16 09:18:54 +0200 | 
| commit | ac496bf48d97f2503eaa353996a4dd5e4383eaf0 (patch) | |
| tree | d57eb8f649cc8268a3d3595f04bbe4bab72264ef /drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | |
| parent | 68f24b08ee892d47bdef925d676e1ae1ccc316f8 (diff) | |
fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y
vmalloc() is a bit slow, and pounding vmalloc()/vfree() will eventually
force a global TLB flush.
To reduce pressure on them, if CONFIG_VMAP_STACK=y, cache two thread
stacks per CPU.  This will let us quickly allocate a hopefully
cache-hot, TLB-hot stack under heavy forking workloads (shell script style).
On my silly pthread_create() benchmark, it saves about 2 µs per
pthread_create()+join() with CONFIG_VMAP_STACK=y.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/94811d8e3994b2e962f88866290017d498eb069c.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c')
0 files changed, 0 insertions, 0 deletions