| author | Prakash Surya <[email protected]> | 2016-09-18 16:37:30 -0400 |
|---|---|---|
| committer | Greg Kroah-Hartman <[email protected]> | 2016-09-19 09:40:35 +0200 |
| commit | 5d770fe899bdff805fe174a9a0c946c3072b315e (patch) | |
| tree | 3cd09aa58498ea94d17d2b208a9de4a29761967d /tools/perf/scripts/python | |
| parent | 5b8a39c53a16c992fe0d99cd21c141153ea3774b (diff) | |
staging: lustre: vvp: Use lockless __generic_file_aio_write
Testing multi-threaded single shared file write performance has shown
the inode mutex to be a limiting factor when using the
generic_file_write_iter function. To work around this bottleneck, this
change replaces the locked version of that call with the lockless
version, specifically, __generic_file_write_iter.
In order to maintain POSIX consistency, Lustre must now employ its
own locking mechanism in the higher layers. Currently, writes are
protected using the lli_write_mutex in the ll_inode_info structure.
Since we no longer take the inode mutex during writes, we must also
down the lli_trunc_sem semaphore to protect against simultaneous
write and truncate operations.
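For illustration, here is a minimal sketch of the locking pattern
described above. The field names lli_write_mutex and lli_trunc_sem come
from this message; the wrapper function itself, the ll_i2info() helper,
and the read-side use of the truncate semaphore are assumptions for the
sketch, not the actual Lustre write path.

```c
#include <linux/fs.h>
#include <linux/uio.h>

/*
 * Illustrative sketch only; the real Lustre code lives in the llite/vvp
 * layers. The read-side use of lli_trunc_sem here is an assumption.
 */
static ssize_t sketch_write_iter(struct kiocb *iocb, struct iov_iter *from)
{
	struct inode *inode = file_inode(iocb->ki_filp);
	struct ll_inode_info *lli = ll_i2info(inode);
	ssize_t result;

	/* Block concurrent truncates, since the inode mutex is no longer held. */
	down_read(&lli->lli_trunc_sem);
	/* Lustre-level serialization of writes replaces the inode mutex. */
	mutex_lock(&lli->lli_write_mutex);

	/* Lockless variant: the caller is responsible for inode-level exclusion. */
	result = __generic_file_write_iter(iocb, from);

	mutex_unlock(&lli->lli_write_mutex);
	up_read(&lli->lli_trunc_sem);

	return result;
}
```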
Unfortunately, this change by itself does not garner any performance
benefits. Using FIO on a single machine with 32 GB of RAM, write
performance tests were run with and without this change applied; the
results are below:
+---------+-----------+---------+--------+--------+
|     fio v2.0.13     |   Write Bandwidth (KB/s)  |
+---------+-----------+---------+--------+--------+
| # Tasks | GB / Task |  Test 1 | Test 2 | Test 3 |
+---------+-----------+---------+--------+--------+
|    1    |     64    |  452446 | 454623 | 457653 |
|    2    |     32    |  850318 | 565373 | 602498 |
|    4    |     16    | 1058900 | 463546 | 529107 |
|    8    |      8    | 1026300 | 468190 | 576451 |
|   16    |      4    | 1065500 | 503160 | 462902 |
|   32    |      2    | 1068600 | 462228 | 466963 |
|   64    |      1    |  991830 | 556618 | 557863 |
+---------+-----------+---------+--------+--------+
* Test 1: Lustre client running 04ec54f. File per process write
workload. This test was used as a baseline for what we
_could_ achieve in the single shared file tests if the
bottlenecks were removed.
* Test 2: Lustre client running 04ec54f. Single shared file
workload, each task writing to a unique region.
* Test 3: Lustre client running 04ec54f + this patch. Single shared
file workload, each task writing to a unique region.
In order to garner any real performance benefits out of a single
shared file workload, the lli_write_mutex needs to be broken up into a
range lock. That would allow write operations to unique regions of a
file to be executed concurrently. This work is left to be done in a
follow up patch.
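As a rough illustration of that follow-up idea, the sketch below takes a
lock over only the byte range being written instead of the single
lli_write_mutex. The range_lock type, the range_lock_init()/range_lock()/
range_unlock() helpers, and the lli_write_tree field are hypothetical
names used only for illustration.

```c
/*
 * Hypothetical range-lock usage; the helpers and the lli_write_tree
 * field are illustrative, not an existing kernel or Lustre API.
 */
static ssize_t sketch_write_iter_range(struct kiocb *iocb, struct iov_iter *from)
{
	struct ll_inode_info *lli = ll_i2info(file_inode(iocb->ki_filp));
	loff_t start = iocb->ki_pos;
	loff_t end = start + iov_iter_count(from) - 1;
	struct range_lock range;
	ssize_t result;

	down_read(&lli->lli_trunc_sem);

	/* Only writers whose [start, end] ranges overlap serialize here. */
	range_lock_init(&range, start, end);
	range_lock(&lli->lli_write_tree, &range);

	result = __generic_file_write_iter(iocb, from);

	range_unlock(&lli->lli_write_tree, &range);
	up_read(&lli->lli_trunc_sem);

	return result;
}
```

With this arrangement, tasks writing to unique regions of a single
shared file would no longer contend on one mutex, which is the
concurrency the single-shared-file tests above are missing.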
Signed-off-by: Prakash Surya <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1669
Reviewed-on: http://review.whamcloud.com/6672
Reviewed-by: Lai Siyao <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Jinshan Xiong <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>