Blaster4385/linux-IllusionX

Author	SHA1	Message	Date
Dan Williams	83544ae9f3	dmaengine, async_tx: support alignment checks Some engines have transfer size and address alignment restrictions. Add a per-operation alignment property to struct dma_device that the async routines and dmatest can use to check alignment capabilities. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:53 -07:00
Dan Williams	138f4c359d	dmaengine, async_tx: add a "no channel switch" allocator Channel switching is problematic for some dmaengine drivers as the architecture precludes separating the ->prep from ->submit. In these cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify the async_tx allocator to only return channels that support all of the required asynchronous operations. For example MD_RAID456=y selects support for asynchronous xor, xor validate, pq, pq validate, and memcpy. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to quickly locate compatible channels with the guarantee that dependency chains will remain on one channel. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select channels that lead to operation chains that need to cross channel boundaries using the async_tx channel switch capability. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:51 -07:00
Dan Williams	0403e38277	dmaengine: add fence support Some engines optimize operation by reading ahead in the descriptor chain such that descriptor2 may start execution before descriptor1 completes. If descriptor2 depends on the result from descriptor1 then a fence is required (on descriptor2) to disable this optimization. The async_tx api could implicitly identify dependencies via the 'depend_tx' parameter, but that would constrain cases where the dependency chain only specifies a completion order rather than a data dependency. So, provide an ASYNC_TX_FENCE to explicitly identify data dependencies. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:50 -07:00
Dan Williams	f9dd213437	Merge branch 'md-raid6-accel' into ioat3.2 Conflicts: include/linux/dmaengine.h	2009-09-08 17:42:29 -07:00
Dan Williams	cb3c82992f	async_tx: raid6 recovery self test Port drivers/md/raid6test/test.c to use the async raid6 recovery routines. This is meant as a unit test for raid6 acceleration drivers. In addition to the 16-drive test case this implements tests for the 4-disk and 5-disk special cases (dma devices can not generically handle less than 2 sources), and adds a test for the D+Q case. Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-08-29 19:09:28 -07:00
Dan Williams	0a82a6239b	async_tx: add support for asynchronous RAID6 recovery operations async_raid6_2data_recov() recovers two data disk failures async_raid6_datap_recov() recovers a data disk and the P disk These routines are a port of the synchronous versions found in drivers/md/raid6recov.c. The primary difference is breaking out the xor operations into separate calls to async_xor. Two helper routines are introduced to perform scalar multiplication where needed. async_sum_product() multiplies two sources by scalar coefficients and then sums (xor) the result. async_mult() simply multiplies a single source by a scalar. This implemention also includes, in contrast to the original synchronous-only code, special case handling for the 4-disk and 5-disk array cases. In these situations the default N-disk algorithm will present 0-source or 1-source operations to dma devices. To cover for dma devices where the minimum source count is 2 we implement 4-disk and 5-disk handling in the recovery code. [ Impact: asynchronous raid6 recovery routines for 2data and datap cases ] Cc: Yuri Tikhonov <yur@emcraft.com> Cc: Ilya Yanok <yanok@emcraft.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-08-29 19:09:27 -07:00
Dan Williams	b2f46fd8ef	async_tx: add support for asynchronous GF multiplication [ Based on an original patch by Yuri Tikhonov ] This adds support for doing asynchronous GF multiplication by adding two additional functions to the async_tx API: async_gen_syndrome() does simultaneous XOR and Galois field multiplication of sources. async_syndrome_val() validates the given source buffers against known P and Q values. When a request is made to run async_pq against more than the hardware maximum number of supported sources we need to reuse the previous generated P and Q values as sources into the next operation. Care must be taken to remove Q from P' and P from Q'. For example to perform a 5 source pq op with hardware that only supports 4 sources at a time the following approach is taken: p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08})) p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10})) p' = p + q + q + src4 = p + src4 q' = {00}p + {01}q + {00}q + {10}src4 = q + {10}*src4 Note: 4 is the minimum acceptable maxpq otherwise we punt to synchronous-software path. The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as sources (in the above manner) and fill the remaining slots up to maxpq with the new sources/coefficients. Note1: Some devices have native support for P+Q continuation and can skip this extra work. Devices with this capability can advertise it with dma_set_maxpq. It is up to each driver how to handle the DMA_PREP_CONTINUE flag. Note2: The api supports disabling the generation of P when generating Q, this is ignored by the synchronous path but is implemented by some dma devices to save unnecessary writes. In this case the continuation algorithm is simplified to only reuse Q as a source. Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Ilya Yanok <yanok@emcraft.com> Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-08-29 19:09:27 -07:00
Dan Williams	95475e5711	async_tx: remove walk of tx->parent chain in dma_wait_for_async_tx We currently walk the parent chain when waiting for a given tx to complete however this walk may race with the driver cleanup routine. The routines in async_raid6_recov.c may fall back to the synchronous path at any point so we need to be prepared to call async_tx_quiesce() (which calls dma_wait_for_async_tx). To remove the ->parent walk we guarantee that every time a dependency is attached ->issue_pending() is invoked, then we can simply poll the initial descriptor until completion. This also allows for a lighter weight 'issue pending' implementation as there is no longer a requirement to iterate through all the channels' ->issue_pending() routines as long as operations have been submitted in an ordered chain. async_tx_issue_pending() is added for this case. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-08-29 19:09:27 -07:00
Dan Williams	af1f951eb6	async_tx: kill needless module_{init\|exit} If module_init and module_exit are nops then neither need to be defined. [ Impact: pure cleanup ] Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-08-29 19:09:26 -07:00
Dan Williams	ad283ea4a3	async_tx: add sum check flags Replace the flat zero_sum_result with a collection of flags to contain the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed solomon syndrome) zero-sum result. Use the SUM_CHECK_ namespace instead of DMA_ since these flags will be used on non-dma-zero-sum enabled platforms. Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-08-29 19:09:26 -07:00
Dan Williams	04ce9ab385	async_xor: permit callers to pass in a 'dma/page scribble' region async_xor() needs space to perform dma and page address conversions. In most cases the code can simply reuse the struct page * array because the size of the native pointer matches the size of a dma/page address. In order to support archs where sizeof(dma_addr_t) is larger than sizeof(struct page *), or to preserve the input parameters, we utilize a memory region passed in by the caller. Since the code is now prepared to handle the case where it cannot perform address conversions on the stack, we no longer need the !HIGHMEM64G dependency in drivers/dma/Kconfig. [ Impact: don't clobber input buffers for address conversions ] Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-06-03 14:22:28 -07:00
Dan Williams	a08abd8ca8	async_tx: structify submission arguments, add scribble Prepare the api for the arrival of a new parameter, 'scribble'. This will allow callers to identify scratchpad memory for dma address or page address conversions. As this adds yet another parameter, take this opportunity to convert the common submission parameters (flags, dependency, callback, and callback argument) into an object that is passed by reference. Also, take this opportunity to fix up the kerneldoc and add notes about the relevant ASYNC_TX_* flags for each routine. [ Impact: moves api pass-by-value parameters to a pass-by-reference struct ] Signed-off-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-06-03 14:07:35 -07:00
Dan Williams	88ba2aa586	async_tx: kill ASYNC_TX_DEP_ACK flag In support of inter-channel chaining async_tx utilizes an ack flag to gate whether a dependent operation can be chained to another. While the flag is not set the chain can be considered open for appending. Setting the ack flag closes the chain and flags the descriptor for garbage collection. The ASYNC_TX_DEP_ACK flag essentially means "close the chain after adding this dependency". Since each operation can only have one child the api now implicitly sets the ack flag at dependency submission time. This removes an unnecessary management burden from clients of the api. [ Impact: clean up and enforce one dependency per operation ] Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-06-03 14:07:34 -07:00
Herbert Xu	d315a0e09f	crypto: hash - Fix handling of sg entry that crosses page boundary A quirk that we've always supported is having an sg entry that's bigger than a page, or more generally an sg entry that crosses page boundaries. Even though it would be better to explicitly have to sg entries for this, we need to support it for the existing users, in particular, IPsec. The new ahash sg walking code did try to handle this, but there was a bug where we didn't increment the page so kept on walking on the first page over an dover again. This patch fixes it. Tested-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-05-31 23:09:22 +10:00
Linus Torvalds	cd208bcc7c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: padlock - Revert aes-all alias to aes crypto: api - Fix algorithm module auto-loading crypto: eseqiv - Fix IV generation for sync algorithms crypto: ixp4xx - check firmware for crypto support	2009-05-17 15:48:05 -07:00
Herbert Xu	37fc334cc8	crypto: api - Fix algorithm module auto-loading The commit `a760a6656e` (crypto: api - Fix module load deadlock with fallback algorithms) broke the auto-loading of algorithms that require fallbacks. The problem is that the fallback mask check is missing an and which cauess bits that should be considered to interfere with the result. Reported-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-04-21 13:27:16 +08:00
Steffen Klassert	abe5fa7899	crypto: eseqiv - Fix IV generation for sync algorithms If crypto_ablkcipher_encrypt() returns synchronous, eseqiv_complete2() is called even if req->giv is already the pointer to the generated IV. The generated IV is overwritten with some random data in this case. This patch fixes this by calling eseqiv_complete2() just if the generated IV has to be copied to req->giv. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-04-15 20:45:03 +08:00
Dan Williams	099f53cb50	async_tx: rename zero_sum to val 'zero_sum' does not properly describe the operation of generating parity and checking that it validates against an existing buffer. Change the name of the operation to 'val' (for 'validate'). This is in anticipation of the p+q case where it is a requirement to identify the target parity buffers separately from the source buffers, because the target parity buffers will not have corresponding pq coefficients. Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-04-08 14:28:37 -07:00
Dan Williams	fd74ea6588	Merge branch 'dmaengine' into async-tx-raid6	2009-04-08 14:28:13 -07:00
Linus Torvalds	133e2a3164	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: dma: Add SoF and EoF debugging to ipu_idmac.c, minor cleanup dw_dmac: add cyclic API to DW DMA driver dmaengine: Add privatecnt to revert DMA_PRIVATE property dmatest: add dma interrupts and callbacks dmatest: add xor test dmaengine: allow dma support for async_tx to be toggled async_tx: provide __async_inline for HAS_DMA=n archs dmaengine: kill some unused headers dmaengine: initialize tx_list in dma_async_tx_descriptor_init dma: i.MX31 IPU DMA robustness improvements dma: improve section assignment in i.MX31 IPU DMA driver dma: ipu_idmac driver cosmetic clean-up dmaengine: fail device registration if channel registration fails	2009-04-03 12:13:45 -07:00
Linus Torvalds	c54c4dec61	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: ixp4xx - Fix handling of chained sg buffers crypto: shash - Fix unaligned calculation with short length hwrng: timeriomem - Use phys address rather than virt	2009-04-03 09:45:53 -07:00
Linus Torvalds	223cdea4c4	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: (53 commits) md/raid5 revise rules for when to update metadata during reshape md/raid5: minor code cleanups in make_request. md: remove CONFIG_MD_RAID_RESHAPE config option. md/raid5: be more careful about write ordering when reshaping. md: don't display meaningless values in sysfs files resync_start and sync_speed md/raid5: allow layout and chunksize to be changed on active array. md/raid5: reshape using largest of old and new chunk size md/raid5: prepare for allowing reshape to change layout md/raid5: prepare for allowing reshape to change chunksize. md/raid5: clearly differentiate 'before' and 'after' stripes during reshape. Documentation/md.txt update md: allow number of drives in raid5 to be reduced md/raid5: change reshape-progress measurement to cope with reshaping backwards. md: add explicit method to signal the end of a reshape. md/raid5: enhance raid5_size to work correctly with negative delta_disks md/raid5: drop qd_idx from r6_state md/raid6: move raid6 data processing to raid6_pq.ko md: raid5 run(): Fix max_degraded for raid level 4. md: 'array_size' sysfs attribute md: centralize ->array_sectors modifications ...	2009-04-03 09:08:19 -07:00
NeilBrown	bff61975b3	md: move lots of #include lines out of .h files and into .c This makes the includes more explicit, and is preparation for moving md_k.h to drivers/md/md.h Remove include/raid/md.h as its only remaining use was to #include other files. Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:33:13 +11:00
Yehuda Sadeh	f4f689933c	crypto: shash - Fix unaligned calculation with short length When the total length is shorter than the calculated number of unaligned bytes, the call to shash->update breaks. For example, calling crc32c on unaligned buffer with length of 1 can result in a system crash. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-27 13:03:51 +08:00
Linus Torvalds	562f477a54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (29 commits) crypto: sha512-s390 - Add missing block size hwrng: timeriomem - Breaks an allyesconfig build on s390: nlattr: Fix build error with NET off crypto: testmgr - add zlib test crypto: zlib - New zlib crypto module, using pcomp crypto: testmgr - Add support for the pcomp interface crypto: compress - Add pcomp interface netlink: Move netlink attribute parsing support to lib crypto: Fix dead links hwrng: timeriomem - New driver crypto: chainiv - Use kcrypto_wq instead of keventd_wq crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq crypto: api - Use dedicated workqueue for crypto subsystem crypto: testmgr - Test skciphers with no IVs crypto: aead - Avoid infinite loop when nivaead fails selftest crypto: skcipher - Avoid infinite loop when cipher fails selftest crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention crypto: api - crypto_alg_mod_lookup either tested or untested crypto: amcc - Add crypt4xx driver crypto: ansi_cprng - Add maintainer ...	2009-03-26 11:04:34 -07:00
Dan Williams	729b5d1b8e	dmaengine: allow dma support for async_tx to be toggled Provide a config option for blocking the allocation of dma channels to the async_tx api. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-03-25 09:13:25 -07:00
Dan Williams	06164f3194	async_tx: provide __async_inline for HAS_DMA=n archs To allow an async_tx routine to be compiled away on HAS_DMA=n arch it needs to be declared __always_inline otherwise the compiler may emit code and cause a link error. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-03-25 09:13:25 -07:00
Geert Uytterhoeven	0c01aed50d	crypto: testmgr - add zlib test Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:42:15 +08:00
Geert Uytterhoeven	bf68e65ec9	crypto: zlib - New zlib crypto module, using pcomp Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:16:19 +08:00
Geert Uytterhoeven	8064efb874	crypto: testmgr - Add support for the pcomp interface Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:16:18 +08:00
Geert Uytterhoeven	a1d2f09544	crypto: compress - Add pcomp interface The current "comp" crypto interface supports one-shot (de)compression only, i.e. the whole data buffer to be (de)compressed must be passed at once, and the whole (de)compressed data buffer will be received at once. In several use-cases (e.g. compressed file systems that store files in big compressed blocks), this workflow is not suitable. Furthermore, the "comp" type doesn't provide for the configuration of (de)compression parameters, and always allocates workspace memory for both compression and decompression, which may waste memory. To solve this, add a "pcomp" partial (de)compression interface that provides the following operations: - crypto_compress_{init,update,final}() for compression, - crypto_decompress_{init,update,final}() for decompression, - crypto_{,de}compress_setup(), to configure (de)compression parameters (incl. allocating workspace memory). The (de)compression methods take a struct comp_request, which was mimicked after the z_stream object in zlib, and contains buffer pointer and length pairs for input and output. The setup methods take an opaque parameter pointer and length pair. Parameters are supposed to be encoded using netlink attributes, whose meanings depend on the actual (name of the) (de)compression algorithm. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:05:33 +08:00
Adrian-Ken Rueegsegger	8c882f6413	crypto: Fix dead links Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 14:43:52 +08:00
Herbert Xu	a760a6656e	crypto: api - Fix module load deadlock with fallback algorithms With the mandatory algorithm testing at registration, we have now created a deadlock with algorithms requiring fallbacks. This can happen if the module containing the algorithm requiring fallback is loaded first, without the fallback module being loaded first. The system will then try to test the new algorithm, find that it needs to load a fallback, and then try to load that. As both algorithms share the same module alias, it can attempt to load the original algorithm again and block indefinitely. As algorithms requiring fallbacks are a special case, we can fix this by giving them a different module alias than the rest. Then it's just a matter of using the right aliases according to what algorithms we're trying to find. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-26 14:06:31 +08:00
Lee Nipper	bb402f16ec	crypto: ahash - Fix digest size in /proc/crypto crypto_ahash_show changed to use cra_ahash for digestsize reference. Signed-off-by: Lee Nipper <lee.nipper@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:46:26 +08:00
Huang Ying	0a2e821d62	crypto: chainiv - Use kcrypto_wq instead of keventd_wq keventd_wq has potential starvation problem, so use dedicated kcrypto_wq instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:44:02 +08:00
Huang Ying	254eff7714	crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq Original cryptd thread implementation has scalability issue, this patch solve the issue with a per-CPU thread implementation. struct cryptd_queue is defined to be a per-CPU queue, which holds one struct cryptd_cpu_queue for each CPU. In struct cryptd_cpu_queue, a struct crypto_queue holds all requests for the CPU, a struct work_struct is used to run all requests for the CPU. Testing based on dm-crypt on an Intel Core 2 E6400 (two cores) machine shows 19.2% performance gain. The testing script is as follow: -------------------- script begin --------------------------- #!/bin/sh dmc_create() { # Create a crypt device using dmsetup dmsetup create $2 --table "0 `blockdev --getsize $1` crypt cbc(aes-asm)?cryptd?plain:plain babebabebabebabebabebabebabebabe 0 $1 0" } dmsetup remove crypt0 dmsetup remove crypt1 dd if=/dev/zero of=/dev/ram0 bs=1M count=4 >& /dev/null dd if=/dev/zero of=/dev/ram1 bs=1M count=4 >& /dev/null dmc_create /dev/ram0 crypt0 dmc_create /dev/ram1 crypt1 cat >tr.sh <<EOF #!/bin/sh for n in \$(seq 10); do dd if=/dev/dm-0 of=/dev/null >& /dev/null & dd if=/dev/dm-1 of=/dev/null >& /dev/null & done wait EOF for n in $(seq 10); do /usr/bin/time sh tr.sh done rm tr.sh -------------------- script end --------------------------- The separator of dm-crypt parameter is changed from "-" to "?", because "-" is used in some cipher driver name too, and cryptds need to specify cipher driver name instead of cipher name. The test result on an Intel Core2 E6400 (two cores) is as follow: without patch: -----------------wo begin -------------------------- 0.04user 0.38system 0:00.39elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6566minor)pagefaults 0swaps 0.07user 0.35system 0:00.35elapsed 121%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6567minor)pagefaults 0swaps 0.06user 0.34system 0:00.30elapsed 135%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.05user 0.37system 0:00.36elapsed 119%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6607minor)pagefaults 0swaps 0.06user 0.36system 0:00.35elapsed 120%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.05user 0.37system 0:00.31elapsed 136%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6594minor)pagefaults 0swaps 0.04user 0.34system 0:00.30elapsed 126%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6597minor)pagefaults 0swaps 0.06user 0.32system 0:00.31elapsed 125%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6571minor)pagefaults 0swaps 0.06user 0.34system 0:00.31elapsed 134%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6581minor)pagefaults 0swaps 0.05user 0.38system 0:00.31elapsed 138%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6600minor)pagefaults 0swaps -----------------wo end -------------------------- with patch: ------------------w begin -------------------------- 0.02user 0.31system 0:00.24elapsed 141%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6554minor)pagefaults 0swaps 0.05user 0.34system 0:00.31elapsed 127%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6606minor)pagefaults 0swaps 0.07user 0.33system 0:00.26elapsed 155%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6559minor)pagefaults 0swaps 0.07user 0.32system 0:00.26elapsed 151%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.05user 0.34system 0:00.26elapsed 150%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6603minor)pagefaults 0swaps 0.03user 0.36system 0:00.31elapsed 124%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.04user 0.35system 0:00.26elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6586minor)pagefaults 0swaps 0.03user 0.37system 0:00.27elapsed 146%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.04user 0.36system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6594minor)pagefaults 0swaps 0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6557minor)pagefaults 0swaps ------------------w end -------------------------- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance gain is about (0.31-0.26)/0.26 = 0.192. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:42:19 +08:00
Huang Ying	25c38d3fb9	crypto: api - Use dedicated workqueue for crypto subsystem Use dedicated workqueue for crypto subsystem A dedicated workqueue named kcrypto_wq is created to be used by crypto subsystem. The system shared keventd_wq is not suitable for encryption/decryption, because of potential starvation problem. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:33:40 +08:00
Herbert Xu	6fe4a28d88	crypto: testmgr - Test skciphers with no IVs As it is an skcipher with no IV escapes testing altogether because we only test givcipher objects. This patch fixes the bypass logic to test these algorithms. Conversely, we're currently testing nivaead algorithms with IVs, which would have deadlocked had it not been for the fact that no nivaead algorithms have any test vectors. This patch also fixes that case. Both fixes are ugly as hell, but this ugliness should hopefully disappear once we move them into the per-type code (i.e., the AEAD test would live in aead.c and the skcipher stuff in ablkcipher.c). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 21:41:29 +08:00
Herbert Xu	5852ae4242	crypto: aead - Avoid infinite loop when nivaead fails selftest When an aead constructed through crypto_nivaead_default fails its selftest, we'll loop forever trying to construct new aead objects but failing because it already exists. The crux of the issue is that once an aead fails the selftest, we'll ignore it on the next run through crypto_aead_lookup and attempt to construct a new aead. We should instead return an error to the caller if we find an an that has failed the test. This bug hasn't manifested itself yet because we don't have any test vectors for the existing nivaead algorithms. They're tested through the underlying algorithms only. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 21:21:24 +08:00
Herbert Xu	b170a137f4	crypto: skcipher - Avoid infinite loop when cipher fails selftest When an skcipher constructed through crypto_givcipher_default fails its selftest, we'll loop forever trying to construct new skcipher objects but failing because it already exists. The crux of the issue is that once a givcipher fails the selftest, we'll ignore it on the next run through crypto_skcipher_lookup and attempt to construct a new givcipher. We should instead return an error to the caller if we find a givcipher that has failed the test. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 21:20:06 +08:00
Herbert Xu	3f683d6175	crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention This is based on a report and patch by Geert Uytterhoeven. The functions crypto_alloc_tfm and create_create_tfm return a pointer that needs to be adjusted by the caller when successful and otherwise an error value. This means that the caller has to check for the error and only perform the adjustment if the pointer returned is valid. Since all callers want to make the adjustment and we know how to adjust it ourselves, it's much easier to just return adjusted pointer directly. The only caveat is that we have to return a void * instead of struct crypto_tfm *. However, this isn't that bad because both of these functions are for internal use only (by types code like shash.c, not even algorithms code). This patch also moves crypto_alloc_tfm into crypto/internal.h (crypto_create_tfm is already there) to reflect this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:56:59 +08:00
Herbert Xu	ff753308d2	crypto: api - crypto_alg_mod_lookup either tested or untested As it stands crypto_alg_mod_lookup will search either tested or untested algorithms, but never both at the same time. However, we need exactly that when constructing givcipher and aead so this patch adds support for that by setting the tested bit in type but clearing it in mask. This combination is currently unused. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:49:43 +08:00
Neil Horman	c5b1e545a5	crypto: ansi_cprng - Panic on CPRNG test failure when in FIPS mode FIPS 140-2 specifies that all access to various cryptographic modules be prevented in the event that any of the provided self tests fail on the various implemented algorithms. We already panic when any of the test in testmgr.c fail when we are operating in fips mode. The continuous test in the cprng here was missed when that was implmented. This code simply checks for the fips_enabled flag if the test fails, and warns us via syslog or panics the box accordingly. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:48:07 +08:00
Neil Horman	d7992f42c6	crypto: ansi_cprng - Force reset on allocation Pseudo RNGs provide predictable outputs based on input parateters {key, V, DT}, the idea behind them is that only the user should know what the inputs are. While its nice to have default known values for testing purposes, it seems dangerous to allow the use of those default values without some sort of safety measure in place, lest an attacker easily guess the output of the cprng. This patch forces the NEED_RESET flag on when allocating a cprng context, so that any user is forced to reseed it before use. The defaults can still be used for testing, but this will prevent their inadvertent use, and be more secure. Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:48:06 +08:00
Huang Ying	54b6a1bd53	crypto: aes-ni - Add support to Intel AES-NI instructions for x86_64 platform Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD) instructions that are going to be introduced in the next generation of Intel processor, as of 2009. These instructions enable fast and secure data encryption and decryption, using the Advanced Encryption Standard (AES), defined by FIPS Publication number 197. The architecture introduces six instructions that offer full hardware support for AES. Four of them support high performance data encryption and decryption, and the other two instructions support the AES key expansion procedure. The white paper can be downloaded from: http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf AES may be used in soft_irq context, but MMX/SSE context can not be touched safely in soft_irq context. So in_interrupt() is checked, if in IRQ or soft_irq context, the general x86_64 implementation are used instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:48:06 +08:00
Huang Ying	1cac2cbc76	crypto: cryptd - Add support to access underlying blkcipher cryptd_alloc_ablkcipher() will allocate a cryptd-ed ablkcipher for specified algorithm name. The new allocated one is guaranteed to be cryptd-ed ablkcipher, so the blkcipher underlying can be gotten via cryptd_ablkcipher_child(). Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:48:05 +08:00
Herbert Xu	1693531e9e	crypto: shash - Remove superfluous check in init_tfm We're currently checking the frontend type in init_tfm. This is completely pointless because the fact that we're called at all means that the frontend is ours so the type must match as well. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:48:05 +08:00
Herbert Xu	8eb2dfac41	crypto: lrw - Fix big endian support It turns out that LRW has never worked properly on big endian. This was never discussed because nobody actually used it that way. In fact, it was only discovered when Geert Uytterhoeven loaded it through tcrypt which failed the test on it. The fix is straightforward, on big endian the to find the nth bit we should be grouping them by words instead of bytes. So setbit128_bbe should xor with 128 - BITS_PER_LONG instead of 128 - BITS_PER_BYTE == 0x78. Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-17 20:00:11 +08:00
Herbert Xu	4f3e797ad0	crypto: scatterwalk - Avoid flush_dcache_page on slab pages It's illegal to call flush_dcache_page on slab pages on a number of architectures. So this patch avoids doing so if PageSlab is true. In future we can move the flush_dcache_page call to those page cache users that actually need it. Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-09 14:30:25 +11:00
Herbert Xu	7b2cd92adc	crypto: api - Fix zeroing on free Geert Uytterhoeven pointed out that we're not zeroing all the memory when freeing a transform. This patch fixes it by calling ksize to ensure that we zero everything in sight. Reported-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-05 16:48:53 +11:00

1 2 3 4 5 ...

497 commits