Age | Commit message | Author | Files | Lines
2013-07-09 | libfcoe: Fix meaningless log statement | Robert Love | 1 file, -4/+2

ctlr_dev was initialized to NULL, and never re-assigned. This caused the
log statement to always report failure. This patch removes the unused
variable and fixes the log statement to always report 'success', as that
is what should be logged if the code reaches this point.

Signed-off-by: Robert Love <[email protected]>
Tested-by: Jack Morgan <[email protected]>
Acked-by: Neil Horman <[email protected]>

2013-07-09 | libfc: Differentiate exchange timer cancellation debug statements | Robert Love | 1 file, -1/+1

There are two debug statements with the same output string regarding
exchange timer cancellation. This patch simply changes the output of one
string so that they can be differentiated.

Signed-off-by: Robert Love <[email protected]>
Tested-by: Jack Morgan <[email protected]>
Acked-by: Neil Horman <[email protected]>

2013-07-09 | libfc: Remove extra space in fc_exch_timer_cancel definition | Robert Love | 1 file, -1/+1

Simply remove an extra space that violates coding style.

Signed-off-by: Robert Love <[email protected]>
Tested-by: Jack Morgan <[email protected]>
Acked-by: Neil Horman <[email protected]>

2013-07-09 | fcoe: fix the link error status block sparse warnings | Yi Zou | 1 file, -18/+4

Both fcoe_fc_els_lesb and fc_els_lesb are already in __be32, and both are
exactly the same size in bytes, with somewhat different member names
reflecting the fact that the former is for Ethernet media and the latter
for Fibre Channel. So, remove the conversion and use __be32 directly.
This fixes the warning from the sparse check.

Signed-off-by: Yi Zou <[email protected]>
Reported-by: Fengguang Wu <[email protected]>
Tested-by: Jack Morgan <[email protected]>
Signed-off-by: Robert Love <[email protected]>

2013-07-09 | lib/scatterlist: error handling in __sg_alloc_table() | Dan Carpenter | 1 file, -2/+4

I was reviewing code which I suspected might allocate a zero size SG
table. That will cause memory corruption. Also we can't return before
doing the memset or we could end up using uninitialized memory in the
cleanup path.

Signed-off-by: Dan Carpenter <[email protected]>
Cc: Akinobu Mita <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Maxim Levitsky <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

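A minimal sketch of the two points described: zero the table before any
early return, and reject a zero-entry allocation outright. The guard
placement follows the commit text; the abbreviated signature and body are
illustrative, not the verbatim kernel function.

    int __sg_alloc_table_sketch(struct sg_table *table, unsigned int nents,
                                gfp_t gfp_mask)
    {
            /* memset first, so every exit path sees an initialized table */
            memset(table, 0, sizeof(*table));

            /* a zero-size SG table would corrupt memory later; refuse it */
            if (WARN_ON_ONCE(nents == 0))
                    return -EINVAL;

            /* ... chunked allocation of nents entries, as before ... */
            return 0;
    }
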
2013-07-09 | scsi_debug: fix do_device_access() with wrap around range | Akinobu Mita | 1 file, -11/+37

do_device_access() is a function that abstracts copying an SG list
from/to the ramdisk storage (fake_storep). It must deal with ranges
exceeding the actual fake_storep size, because such ranges are valid if
virtual_gb is set greater than zero, and they should be treated as if
fake_storep were repeatedly mirrored up to the virtual size.

Unfortunately, it can't deal with a range which wraps around the end of
fake_storep. A wrap-around range is copied by two
sg_copy_{from,to}_buffer() calls, but sg_copy_{from,to}_buffer() can't
copy from/to the middle of an SG list, so the second call can't copy
correctly.

This fixes it by using sg_pcopy_{from,to}_buffer(), which can copy
from/to the middle of an SG list. It also simplifies the assignment of
sdb->resid in fill_from_dev_buffer(): because fill_from_dev_buffer() is
now called only once per command execution cycle, it is no longer
necessary to take care to decrease sdb->resid when it is called more
than once.

Signed-off-by: Akinobu Mita <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Douglas Gilbert <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Horia Geanta <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

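A hedged sketch of the wrap-around read path this enables: the second
chunk must land in the middle of the SG list, which the plain copy
helpers cannot do, so the skip argument of sg_pcopy_from_buffer() is
used. The function and variable names are illustrative, not the driver's.

    static void copy_wrapped_to_sg(struct scatterlist *sgl, unsigned int nents,
                                   u8 *store, size_t store_len,
                                   size_t off, size_t len)
    {
            /* first chunk: from 'off' up to the end of the backing store */
            size_t first = min_t(size_t, len, store_len - off);

            sg_pcopy_from_buffer(sgl, nents, store + off, first, 0);

            /* second chunk: wrap to the start of the store and resume in
             * the middle of the SG list by skipping what was copied */
            if (len > first)
                    sg_pcopy_from_buffer(sgl, nents, store, len - first, first);
    }
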
2013-07-09 | crypto: talitos: use sg_pcopy_to_buffer() | Akinobu Mita | 1 file, -59/+1

Use sg_pcopy_to_buffer(), which is better than the function previously
used because it doesn't do kmap/kunmap for skipped pages.

Signed-off-by: Akinobu Mita <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Douglas Gilbert <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Horia Geanta <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer() | Akinobu Mita | 2 files, -5/+88

The only difference between sg_pcopy_{from,to}_buffer() and
sg_copy_{from,to}_buffer() is an additional argument that specifies the
number of bytes to skip in the SG list before copying.

Signed-off-by: Akinobu Mita <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Douglas Gilbert <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Horia Geanta <[email protected]>
Cc: Imre Deak <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

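For reference, the new pair presumably looks like this, with skip being
the byte offset into the SG list and the return value the number of bytes
actually copied. Signatures are reconstructed from the description, so
treat them as indicative rather than verbatim.

    size_t sg_pcopy_from_buffer(struct scatterlist *sgl, unsigned int nents,
                                void *buf, size_t buflen, off_t skip);
    size_t sg_pcopy_to_buffer(struct scatterlist *sgl, unsigned int nents,
                              void *buf, size_t buflen, off_t skip);
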
2013-07-09 | lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next() | Akinobu Mita | 1 file, -15/+24

This patchset introduces sg_pcopy_from_buffer() and sg_pcopy_to_buffer(),
which copy data between a linear buffer and an SG list. The only
difference between sg_pcopy_{from,to}_buffer() and
sg_copy_{from,to}_buffer() is an additional argument that specifies the
number of bytes to skip in the SG list before copying.

The main reason for introducing these functions is to fix a problem in
the scsi_debug module. There is also a local function in the
crypto/talitos module which can be replaced by sg_pcopy_to_buffer().

This patch: sg_miter_get_next_page() advances the page iterator to the
next page if necessary, and will be used to implement the variants of
sg_copy_{from,to}_buffer() later.

Signed-off-by: Akinobu Mita <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Douglas Gilbert <[email protected]>
Cc: Horia Geanta <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

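For context, the mapping iterator this helper feeds is driven roughly as
follows. This is a generic usage sketch of the existing sg_miter API, not
code from the patch.

    static void sg_to_linear(struct scatterlist *sgl, unsigned int nents,
                             u8 *buf)
    {
            struct sg_mapping_iter miter;

            sg_miter_start(&miter, sgl, nents,
                           SG_MITER_ATOMIC | SG_MITER_FROM_SG);
            while (sg_miter_next(&miter)) {
                    /* miter.addr maps miter.length bytes of this page */
                    memcpy(buf, miter.addr, miter.length);
                    buf += miter.length;
            }
            sg_miter_stop(&miter);
    }
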
2013-07-09 | crypto: add lz4 Cryptographic API | Chanho Min | 4 files, -0/+230

Add support for the lz4 and lz4hc compression algorithms using the
lib/lz4/* codebase.

[[email protected]: fix warnings]
Signed-off-by: Chanho Min <[email protected]>
Cc: "Darrick J. Wong" <[email protected]>
Cc: Bob Pearson <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Yann Collet <[email protected]>
Cc: Kyungsik Lee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | lib: add lz4 compressor module | Chanho Min | 7 files, -2/+1092

This patchset is for supporting LZ4 compression and the crypto API using
it. As shown below, the size of the data is a little bit bigger but the
compression speed is faster when unaligned memory access is enabled. We
can use lz4 de/compression through the crypto API as well. It will also
be useful for other potential users of lz4 compression.

lz4 Compression Benchmark:
Compiler: ARM gcc 4.6.4
ARMv7, 1 GHz based board
Kernel: linux 3.4
Uncompressed data size: 101 MB

           Compressed size  Compression speed
    LZO    72.1MB           32.1MB/s, 33.0MB/s(UA)
    LZ4    75.1MB           30.4MB/s, 35.9MB/s(UA)
    LZ4HC  59.8MB            2.4MB/s,  2.5MB/s(UA)

- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch: Add support for LZ4 compression in the Linux Kernel. The LZ4
compression APIs for the kernel are based on the LZ4 implementation by
Yann Collet, changed for the kernel coding style.

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/
svn revision: r90

Two APIs are added: lz4_compress() supports basic lz4 compression,
whereas lz4hc_compress() supports high compression, where CPU performance
gets lower but the compression ratio gets higher. Both require
pre-allocated working memory of the defined size, and the destination
buffer must be allocated with the size given by lz4_compressbound().

[[email protected]: make lz4_compresshcctx() static]
Signed-off-by: Chanho Min <[email protected]>
Cc: "Darrick J. Wong" <[email protected]>
Cc: Bob Pearson <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Yann Collet <[email protected]>
Cc: Kyungsik Lee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

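A hedged usage sketch of the compressor API described above. The
prototype, the LZ4_MEM_COMPRESS working-memory size and
lz4_compressbound() follow this era's <linux/lz4.h> as best recalled; the
wrapper itself is illustrative.

    #include <linux/lz4.h>
    #include <linux/vmalloc.h>

    static int lz4_compress_example(const unsigned char *src, size_t src_len,
                                    unsigned char *dst, size_t *dst_len)
    {
            /* caller must size dst to at least lz4_compressbound(src_len) */
            void *wrkmem = vmalloc(LZ4_MEM_COMPRESS);
            int ret;

            if (!wrkmem)
                    return -ENOMEM;

            ret = lz4_compress(src, src_len, dst, dst_len, wrkmem);

            vfree(wrkmem);
            return ret;     /* 0 on success */
    }
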
2013-07-09 | arm: add support for LZ4-compressed kernel | Kyungsik Lee | 9 files, -5/+28

Integrate the LZ4 decompression code into the arm pre-boot code.

Signed-off-by: Kyungsik Lee <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Russell King <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: Yann Collet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | lib: add support for LZ4-compressed kernel | Kyungsik Lee | 10 files, -2/+243

Add support for extracting LZ4-compressed kernel images, as well as
LZ4-compressed ramdisk images, in the kernel boot process.

Signed-off-by: Kyungsik Lee <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Russell King <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: Yann Collet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | decompressor: add LZ4 decompressor module | Kyungsik Lee | 3 files, -0/+471

Add support for LZ4 decompression in the Linux Kernel. The LZ4
decompression APIs for the kernel are based on the LZ4 implementation by
Yann Collet.

Benchmark Results (PATCH v3)
Compiler: Linaro ARM gcc 4.6.2

1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB

         Compressed Size  Decompression Speed
    LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
    LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)

2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB

         Compressed Size  Decompression Speed
    LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
    LZ4  6.5MB            86.7MB/s

- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch set adds support for an LZ4-compressed kernel. LZ4 is a very
fast lossless compression algorithm and it also features an extremely
fast decoder [1].

But we already have five decompressors, and one question that does arise
is where we stop adding new ones. This issue was discussed and a
conclusion was reached [2]. Russell King said that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)
If we have a replacement for one of these, then it should do exactly
that: replace it.

The benchmark shows an 8% increase in image size versus a 66% increase in
decompression speed compared to LZO (which has been known as the fastest
decompressor in the kernel). Therefore the "fast but may not be small"
compression title has clearly been taken by LZ4 [3].

[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/

Signed-off-by: Kyungsik Lee <[email protected]>
Signed-off-by: Yann Collet <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Russell King <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Florian Fainelli <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

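For orientation, the decompressor entry points of this era's
<linux/lz4.h> presumably look like the following: one variant for when
the decompressed size is known in advance (the kernel-image case) and one
for when only an upper bound is known. Reconstructed from memory, so
treat as indicative.

    /* src_len is updated to the number of input bytes consumed */
    int lz4_decompress(const unsigned char *src, size_t *src_len,
                       unsigned char *dest, size_t actual_dest_len);

    /* dest_len is in/out: capacity on entry, bytes produced on return */
    int lz4_decompress_unknownoutputsize(const unsigned char *src,
                                         size_t src_len, unsigned char *dest,
                                         size_t *dest_len);
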
2013-07-09 | lib: add weak clz/ctz functions | Chanho Min | 2 files, -1/+59

Some architectures need __c[lt]z[sd]i2() for __builtin_c[lt]z[ll], and
that causes a build failure. These functions can be implemented using
fls()/__ffs() and overridden by linking arch-specific versions, which may
not be implemented yet.

This is required by "lib: add lz4 compressor module".

Reference: https://lkml.org/lkml/2013/4/18/603

Signed-off-by: Chanho Min <[email protected]>
Reported-by: Geert Uytterhoeven <[email protected]>
Cc: "Darrick J. Wong" <[email protected]>
Cc: Bob Pearson <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Yann Collet <[email protected]>
Cc: Kyungsik Lee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

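A hedged sketch of the 32-bit fallbacks built on fls()/__ffs() as
described; the 64-bit __c[lt]zdi2() variants would follow the same
pattern with fls64(). Close to what the patch plausibly adds, but not
quoted from it.

    #include <linux/bitops.h>

    /* count trailing zeros: index of the lowest set bit */
    int __weak __ctzsi2(int val)
    {
            return __ffs(val);
    }

    /* count leading zeros: 32 minus the position of the highest set bit */
    int __weak __clzsi2(int val)
    {
            return 32 - fls(val);
    }
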
2013-07-09 | reboot: move arch/x86 reboot= handling to generic kernel | Robin Holt | 8 files, -145/+107

Merge together the unicore32, arm, and x86 reboot= command line parameter
handling.

Signed-off-by: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Russell King <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Guan Xuetao <[email protected]>
Acked-by: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: arm: change reboot_mode to use enum reboot_mode | Robin Holt | 125 files, -166/+282

Preparing to move the parsing of reboot= to generic kernel code forces
the change in reboot_mode handling to use the enum.

[[email protected]: fix arch/arm/mach-socfpga/socfpga.c]
Signed-off-by: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Guan Xuetao <[email protected]>
Acked-by: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

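A hedged sketch of the generic form being introduced; the exact set of
enumerators is reconstructed from memory of include/linux/reboot.h from
this series, so treat it as indicative.

    enum reboot_mode {
            REBOOT_COLD = 0,
            REBOOT_WARM,
            REBOOT_HARD,
            REBOOT_SOFT,
            REBOOT_GFX,
    };

    extern enum reboot_mode reboot_mode;
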
2013-07-09 | reboot: arm: prepare reboot_mode for moving to generic kernel code | Robin Holt | 4 files, -9/+10

Prepare for moving the parsing of reboot= to the generic kernel code by
making reboot_mode into a more generic form.

Signed-off-by: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Guan Xuetao <[email protected]>
Acked-by: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: arm: remove unused restart_mode fields from some arm subarchs | Robin Holt | 4 files, -6/+0

These restart_mode fields are not used at all. Remove them to make moving
the reboot= cmdline options to the general kernel easier.

Signed-off-by: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Guan Xuetao <[email protected]>
Acked-by: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: unicore32: prepare reboot_mode for moving to generic kernel code | Robin Holt | 4 files, -7/+9

Prepare for moving the parsing of reboot= to the generic kernel code by
making reboot_mode into a more generic form.

Signed-off-by: Robin Holt <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Acked-by: Guan Xuetao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: x86: prepare reboot_mode for moving to generic kernel code | Robin Holt | 2 files, -5/+12

Prepare for moving the parsing of reboot= to the generic kernel code by
making reboot_mode into a more generic form.

Signed-off-by: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Miguel Boton <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Guan Xuetao <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: checkpatch.pl the new kernel/reboot.c file | Robin Holt | 2 files, -15/+14

Get the new file to pass scripts/checkpatch.pl.

Signed-off-by: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: move shutdown/reboot related functions to kernel/reboot.c | Robin Holt | 3 files, -332/+347

This patch is preparatory. It moves the reboot-related syscall and helper
functions from kernel/sys.c to kernel/reboot.c.

Signed-off-by: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | reboot: remove -stable friendly PF_THREAD_BOUND define | Robin Holt | 1 file, -5/+0

Remove the prior patch's #define, which was kept only to ease backporting
to the stable releases.

Signed-off-by: Robin Holt <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Russ Anderson <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | partitions/msdos: enumerate also AIX LVM partitions | Philippe De Muyter | 1 file, -0/+5

Graft AIX partition enumeration into partitions/msdos.c. There is already
AIX disk detection logic in msdos.c. When an AIX disk has been found, and
if so configured, call the AIX partition recognizer. This avoids removing
the AIX disk protection from msdos.c, avoids code duplication, and
ensures that AIX partition enumeration is called before plain msdos
partition enumeration.

Signed-off-by: Philippe De Muyter <[email protected]>
Cc: Karel Zak <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | partitions: add aix lvm partition support files | Philippe De Muyter | 4 files, -0/+306

Add partitions/aix.h and partitions/aix.c. AIX LVM makes it possible to
create "logical volumes" which are made of multiple slices of multiple
disks. The new code allows access only to the "logical volumes" which are
made of one slice on the probed disk, a slice being a contiguous disk
area. The code also detects "logical volumes" made of multiple slices on
the probed disk, but cannot describe them to the partition layer, because
the generic partition-layer code does not support that. When such
non-contiguous "logical volumes" are detected, a diagnostic message is
printed.

[[email protected]: coding-style fixes]
Signed-off-by: Philippe De Muyter <[email protected]>
Cc: Karel Zak <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | partitions/msdos.c: end-of-line whitespace and semicolon cleanup | Philippe De Muyter | 1 file, -6/+6

Signed-off-by: Philippe De Muyter <[email protected]>
Cc: Karel Zak <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | mwave: fix info leak in mwave_ioctl() | Dan Carpenter | 1 file, -0/+1

Smatch complains that on 64 bit systems, there is a hole in the
MW_ABILITIES struct between ->component_count and ->component_list[]. It
leaks stack information from the mwave_ioctl() function. I've added a
memset() to initialize the struct to zero.

Signed-off-by: Dan Carpenter <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Jiri Kosina <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

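A hedged fragment of the fix pattern: the MW_ABILITIES type is from the
commit, while the surrounding ioctl plumbing shown here is illustrative.

    MW_ABILITIES abilities;

    /* zero everything, including compiler-inserted padding holes that
     * would otherwise carry stale stack data to user space */
    memset(&abilities, 0, sizeof(abilities));

    /* ... fill in component_count, component_list[], etc ... */

    if (copy_to_user(uptr, &abilities, sizeof(abilities)))
            return -EFAULT;
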
2013-07-09 | ipc/sem.c: rename try_atomic_semop() to perform_atomic_semop(), documentation update | Manfred Spraul | 1 file, -11/+21

Cleanup of some minor points noticed while writing the previous patches:

1) The name try_atomic_semop() is misleading: the function performs the
   operation (if it is possible).
2) Some documentation updates.

No real code change: a rename and documentation changes.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc/sem.c: replace shared sem_otime with per-semaphore value | Manfred Spraul | 2 files, -7/+31

sem_otime contains the time of the last semaphore operation that
completed successfully. Every operation updates this value; thus access
from multiple cpus can cause thrashing. Therefore the patch replaces the
variable with a per-semaphore variable. The per-array sem_otime is only
calculated when required.

No performance improvement on a single-socket i3; only important for
larger systems.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

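A hedged sketch of the on-demand aggregation this implies: each struct
sem carries its own sem_otime, and the array-wide value is computed as
the newest of them only when a reader asks for it. The helper name and
field layout are reconstructed, not quoted from the patch.

    static time_t get_semotime(struct sem_array *sma)
    {
            time_t res = sma->sem_base[0].sem_otime;
            int i;

            for (i = 1; i < sma->sem_nsems; i++) {
                    time_t to = sma->sem_base[i].sem_otime;

                    /* report the most recent successful operation */
                    if (to > res)
                            res = to;
            }
            return res;
    }
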
2013-07-09 | ipc/sem.c: always use only one queue for alter operations | Manfred Spraul | 1 file, -40/+88

There are two places that can contain alter operations:
- the global queue: sma->pending_alter
- the per-semaphore queues: sma->sem_base[].pending_alter

Since one of the queues must be processed first, this causes an odd
prioritization of the wakeups: complex operations have priority over
simple ops.

The patch restores the behavior of linux <= 3.0.9: the longest-waiting
operation has the highest priority. This is done by using only one queue:
- if there are complex ops, then sma->pending_alter is used
- otherwise, the per-semaphore queues are used

As a side effect, do_smart_update_queue() becomes much simpler: no more
goto logic.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc/sem: separate wait-for-zero and alter tasks into separate queues | Manfred Spraul | 2 files, -61/+155

Introduce separate queues for operations that do not modify the semaphore
values. Advantages:
- Simpler logic in check_restart().
- Faster update_queue(): right now, all wait-for-zero operations are
  always tested, even if the semaphore value is not 0.
- wait-for-zero gets priority again, as in linux <= 3.0.9.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc/sem.c: cacheline align the semaphore structures | Manfred Spraul | 1 file, -1/+1

As now each semaphore has its own spinlock and parallel operations are
possible, give each semaphore its own cacheline. On an i3 laptop, this
gives up to 28% better performance:

    #semscale 10 | grep "interleave 2"
    - before:
      Cpus 1, interleave 2 delay 0: 36109234 in 10 secs
      Cpus 2, interleave 2 delay 0: 55276317 in 10 secs
      Cpus 3, interleave 2 delay 0: 62411025 in 10 secs
      Cpus 4, interleave 2 delay 0: 81963928 in 10 secs
    - after:
      Cpus 1, interleave 2 delay 0: 35527306 in 10 secs
      Cpus 2, interleave 2 delay 0: 70922909 in 10 secs <<< + 28%
      Cpus 3, interleave 2 delay 0: 80518538 in 10 secs
      Cpus 4, interleave 2 delay 0: 89115148 in 10 secs <<< + 8.7%

This is an i3 with 2 cores and hyperthreading enabled. "Interleave 2" is
used so that the full cores are used first. HT partially hides the delay
from cacheline thrashing, thus the improvement is "only" 8.7% if 4
threads are running.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

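A hedged sketch of the one-line change this describes: the per-semaphore
structure gains a cacheline alignment annotation so that each semaphore's
spinlock sits on its own cacheline and parallel operations do not
false-share. The member list is abbreviated and reconstructed from the
series' description.

    struct sem {
            int                     semval;         /* current value */
            int                     sempid;         /* pid of last op */
            spinlock_t              lock;           /* per-semaphore lock */
            struct list_head        pending_alter;  /* pending alter ops */
            struct list_head        pending_const;  /* pending wait-for-zero */
            time_t                  sem_otime;      /* last op time */
    } ____cacheline_aligned_in_smp;
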
2013-07-09 | ipc/util.c, ipc_rcu_alloc: cacheline align allocation | Manfred Spraul | 1 file, -6/+6

Enforce that ipc_rcu_alloc returns a cacheline aligned pointer on SMP.

Rationale: the SysV sem code tries to move the main spinlock into a
separate cacheline (____cacheline_aligned_in_smp). This works only if
ipc_rcu_alloc returns cacheline aligned pointers. vmalloc and kmalloc
return cacheline aligned pointers; the implementation of ipc_rcu_alloc
breaks that.

[[email protected]: coding-style fixes]
Signed-off-by: Manfred Spraul <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc: remove unused functions | Davidlohr Bueso | 2 files, -26/+0

We can now drop the msg_lock and msg_lock_check functions, along with a
bogus comment introduced previously in semctl_down.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc,msg: shorten critical region in msgrcv | Davidlohr Bueso | 1 file, -26/+32

do_msgrcv() is the last msg queue function that abuses the ipc lock. Take
it only when needed, when actually updating msq.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc,msg: shorten critical region in msgsnd | Davidlohr Bueso | 1 file, -13/+24

do_msgsnd() is another function that does too many things with the ipc
object lock acquired. Take it only when needed, when actually updating
msq.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc,msg: make msgctl_nolock lockless | Davidlohr Bueso | 1 file, -10/+17

While the INFO cmd doesn't take the ipc lock, the STAT commands do
acquire it unnecessarily. We can do the permission and security checks
while holding only the rcu lock. This function now mimics
semctl_nolock().

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc,msg: introduce lockless functions to obtain the ipc object | Davidlohr Bueso | 1 file, -0/+21

Add msq_obtain_object() and msq_obtain_object_check(), which will allow
us to get the ipc object without acquiring the lock. Just as with
semaphores, these functions are basically wrappers around
ipc_obtain_object*().

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

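A hedged sketch of one such wrapper, reconstructed from the semaphore
analogue the commit cites; the body is an assumption, not quoted from the
patch.

    static inline struct msg_queue *msq_obtain_object(struct ipc_namespace *ns,
                                                      int id)
    {
            struct kern_ipc_perm *ipcp = ipc_obtain_object(&msg_ids(ns), id);

            if (IS_ERR(ipcp))
                    return ERR_CAST(ipcp);

            return container_of(ipcp, struct msg_queue, q_perm);
    }
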
2013-07-09 | ipc,msg: introduce msgctl_nolock | Davidlohr Bueso | 1 file, -15/+34

Similar to semctl, when calling msgctl, the *_INFO and *_STAT commands
can be performed without acquiring the ipc object. Add a msgctl_nolock()
function and move the logic for *_INFO and *_STAT out of msgctl(). This
change still takes the lock; it will become properly lockless in the next
patch.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc,msg: shorten critical region in msgctl_down | Davidlohr Bueso | 1 file, -5/+7

Instead of holding the ipc lock for the entire function, use
ipcctl_pre_down_nolock() and only acquire the lock for the commands that
actually need it: RMID and SET.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc: move locking out of ipcctl_pre_down_nolock | Davidlohr Bueso | 4 files, -39/+56

This function currently acquires both the rw_mutex and the rcu lock on
successful lookups, leaving the callers to explicitly unlock them, which
creates another two-level locking situation. Make the callers (including
those that still use ipcctl_pre_down()) explicitly lock and unlock the
rwsem and the rcu lock.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc: close open coded spin lock calls | Davidlohr Bueso | 4 files, -12/+12

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc: introduce ipc object locking helpers | Davidlohr Bueso | 1 file, -5/+15

Simple helpers around the (kern_ipc_perm *)->lock spinlock.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

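Presumably these are along the following lines; the helper names and
bodies are a sketch reconstructed from the one-line description, not
quoted from the patch.

    static inline void ipc_lock_object(struct kern_ipc_perm *perm)
    {
            spin_lock(&perm->lock);
    }

    static inline void ipc_unlock_object(struct kern_ipc_perm *perm)
    {
            spin_unlock(&perm->lock);
    }
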
2013-07-09 | ipc: move rcu lock out of ipc_addid | Davidlohr Bueso | 3 files, -7/+8

This patchset continues the work that began with the sysv ipc semaphore
scaling series; see https://lkml.org/lkml/2013/3/20/546

Just like semaphores used to be, sysv shared memory and msg queues also
abuse the ipc lock, unnecessarily holding it for operations such as
permission and security checks. This patchset mostly deals with msg
queues; shared memory can be done in a very similar way, but I want to
get these patches out in the open first. It also does some pending
cleanups, mostly focused on the two-level locking we have in the ipc
code, taking care of ipc_addid() and ipcctl_pre_down_nolock() - yes,
there are still functions that need to be updated as well.

This patch: Make all callers explicitly take and release the RCU read
lock. This addresses the two-level locking seen in newary(), newseg() and
newqueue(). For the last two, explicitly unlock the ipc object and the
rcu lock, instead of calling the custom shm_unlock and msg_unlock
functions. The next patch will deal with the open-coded locking for
->perm.lock.

Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ipc/shm.c: eliminate ugly 80-col tricks | Andrew Morton | 2 files, -4/+4

Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ptrace/x86: flush_ptrace_hw_breakpoint() should clear the virtual debug registers | Oleg Nesterov | 1 file, -0/+3

flush_ptrace_hw_breakpoint() destroys the counters set by ptrace, but
"leaks" ->debugreg6 and ->ptrace_dr7. The problem is minor, but it still
doesn't look right, and flush_thread() did this until commit 66cb59172959
("hw-breakpoints: use the new wrapper routines to access debug registers
in process/thread code"). Now that PTRACE_DETACH does the flush too, this
makes even more sense.

Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jan Kratochvil <[email protected]>
Cc: Michael Neuling <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Prasad <[email protected]>
Cc: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ptrace: PTRACE_DETACH should do flush_ptrace_hw_breakpoint(child) | Oleg Nesterov | 1 file, -0/+1

Change ptrace_detach() to call flush_ptrace_hw_breakpoint(child). This
frees the slots for non-ptrace PERF_TYPE_BREAKPOINT users, and this
ensures that the tracee won't be killed by SIGTRAP triggered by the
active breakpoints.

Test-case:

    unsigned long encode_dr7(int drnum, int enable, unsigned int type,
                             unsigned int len)
    {
            unsigned long dr7;

            dr7 = ((len | type) & 0xf) <<
                    (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE);
            if (enable)
                    dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE));

            return dr7;
    }

    int write_dr(int pid, int dr, unsigned long val)
    {
            return ptrace(PTRACE_POKEUSER, pid,
                          offsetof(struct user, u_debugreg[dr]), val);
    }

    void func(void)
    {
    }

    int main(void)
    {
            int pid, stat;
            unsigned long dr7;

            pid = fork();
            if (!pid) {
                    assert(ptrace(PTRACE_TRACEME, 0, 0, 0) == 0);
                    kill(getpid(), SIGHUP);
                    func();
                    return 0x13;
            }

            assert(pid == waitpid(-1, &stat, 0));
            assert(WSTOPSIG(stat) == SIGHUP);

            assert(write_dr(pid, 0, (long)func) == 0);
            dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1);
            assert(write_dr(pid, 7, dr7) == 0);

            assert(ptrace(PTRACE_DETACH, pid, 0, 0) == 0);

            assert(pid == waitpid(-1, &stat, 0));
            assert(stat == 0x1300);

            return 0;
    }

Before this patch the child is killed after PTRACE_DETACH.

Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jan Kratochvil <[email protected]>
Cc: Michael Neuling <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Prasad <[email protected]>
Cc: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ptrace/x86: cleanup ptrace_set_debugreg() | Oleg Nesterov | 1 file, -18/+8

ptrace_set_debugreg() is trivial but looks horrible. Kill the unnecessary
gotos and returns to clean up the code. This matches
ptrace_get_debugreg(), which also gets the trivial whitespace cleanups.

Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jan Kratochvil <[email protected]>
Cc: Michael Neuling <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Prasad <[email protected]>
Cc: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

2013-07-09 | ptrace/x86: ptrace_write_dr7() should create bp if !disabled | Oleg Nesterov | 1 file, -7/+10

Commit 24f1e32c60c4 ("hw-breakpoints: Rewrite the hw-breakpoints layer on
top of perf events") introduced a minor regression. Before this commit,

    PTRACE_POKEUSER DR7, enableDR0
    PTRACE_POKEUSER DR0, address

was perfectly valid; now PTRACE_POKEUSER(DR7) fails if DR0 was not
previously initialized by PTRACE_POKEUSER(DR0).

Change ptrace_write_dr7() to do ptrace_register_breakpoint(addr => 0) if
!bp && !disabled. This fixes watchpoint-zeroaddr from ptrace-tests; see
https://bugzilla.redhat.com/show_bug.cgi?id=660204

Signed-off-by: Oleg Nesterov <[email protected]>
Reported-by: Jan Kratochvil <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Neuling <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Prasad <[email protected]>
Cc: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>