aboutsummaryrefslogtreecommitdiff
path: root/lib/idr.c
AgeCommit message (Collapse)AuthorFilesLines
2013-02-27idr: remove length restriction from idr_layer->bitmapTejun Heo1-17/+17
Currently, idr->bitmap is declared as an unsigned long which restricts the number of bits an idr_layer can contain. All bitops can handle arbitrary positive integer bit number and there's no reason for this restriction. Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single unsigned long. * idr_layer->bitmap is now an array. '&' dropped from params to bitops. * Replaced "== IDR_FULL" tests with bitmap_full() and removed IDR_FULL. * Replaced find_next_bit() on ~bitmap with find_next_zero_bit(). * Replaced "bitmap = 0" with bitmap_clear(). This patch doesn't (or at least shouldn't) introduce any behavior changes. [[email protected]: checkpatch fixes] Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.cTejun Heo1-7/+17
MAX_IDR_MASK is another weirdness in the idr interface. As idr covers whole positive integer range, it's defined as 0x7fffffff or INT_MAX. Its usage in idr_find(), idr_replace() and idr_remove() is bizarre. They basically mask off the sign bit and operate on the rest, so if the caller, by accident, passes in a negative number, the sign bit will be masked off and the remaining part will be used as if that was the input, which is worse than crashing. The constant is visible in idr.h and there are several users in the kernel. * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter() Basically used to test if adap->nr is a negative number which isn't -1 and returns -EINVAL if so. idr_alloc() already has negative @start checking (w/ WARN_ON_ONCE), so this can go away. * drivers/infiniband/core/cm.c:cm_alloc_id() drivers/infiniband/hw/mlx4/cm.c:id_map_alloc() Used to wrap cyclic @start. Can be replaced with max(next, 0). Note that this type of cyclic allocation using idr is buggy. These are prone to spurious -ENOSPC failure after the first wraparound. * fs/super.c:get_anon_bdev() The ID allocated from ida is masked off before being tested whether it's inside valid range. ida allocated ID can never be a negative number and the masking is unnecessary. Update idr_*() functions to fail with -EINVAL when negative @id is specified and update other MAX_IDR_MASK users as described above. This leaves MAX_IDR_MASK without any user, remove it and relocate other MAX_IDR_* constants to lib/idr.c. Signed-off-by: Tejun Heo <[email protected]> Cc: Jean Delvare <[email protected]> Cc: Roland Dreier <[email protected]> Cc: Sean Hefty <[email protected]> Cc: Hal Rosenstock <[email protected]> Cc: "Marciniszyn, Mike" <[email protected]> Cc: Jack Morgenstein <[email protected]> Cc: Or Gerlitz <[email protected]> Cc: Al Viro <[email protected]> Acked-by: Wolfram Sang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: fix top layer handlingTejun Heo1-15/+23
Most functions in idr fail to deal with the high bits when the idr tree grows to the maximum height. * idr_get_empty_slot() stops growing idr tree once the depth reaches MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to cover the whole range. The function doesn't even notice that it didn't grow the tree enough and ends up allocating the wrong ID given sufficiently high @starting_id. For example, on 64 bit, if the starting id is 0x7fffff01, idr_get_empty_slot() will grow the tree 5 layer deep, which only covers the 30 bits and then proceed to allocate as if the bit 30 wasn't specified. It ends up allocating 0x3fffff01 without the bit 30 but still returns 0x7fffff01. * __idr_remove_all() will not remove anything if the tree is fully grown. * idr_find() can't find anything if the tree is fully grown. * idr_for_each() and idr_get_next() can't iterate anything if the tree is fully grown. Fix it by introducing idr_max() which returns the maximum possible ID given the depth of tree and replacing the id limit checks in all affected places. As the idr_layer pointer array pa[] needs to be 1 larger than the maximum depth, enlarge pa[] arrays by one. While this plugs the discovered issues, the whole code base is horrible and in desparate need of rewrite. It's fragile like hell, Signed-off-by: Tejun Heo <[email protected]> Cc: Rusty Russell <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: implement idr_preload[_end]() and idr_alloc()Tejun Heo1-8/+166
The current idr interface is very cumbersome. * For all allocations, two function calls - idr_pre_get() and idr_get_new*() - should be made. * idr_pre_get() doesn't guarantee that the following idr_get_new*() will not fail from memory shortage. If idr_get_new*() returns -EAGAIN, the caller is expected to retry pre_get and allocation. * idr_get_new*() can't enforce upper limit. Upper limit can only be enforced by allocating and then freeing if above limit. * idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping around MAX_IDR_FREE idr_layers. The memory consumed per idr is under two pages but it makes it difficult to make idr_layer larger. This patch implements the following new set of allocation functions. * idr_preload[_end]() - Similar to radix preload but doesn't fail. The first idr_alloc() inside preload section can be treated as if it were called with @gfp_mask used for idr_preload(). * idr_alloc() - Allocate an ID w/ lower and upper limits. Takes @gfp_flags and can be used w/o preloading. When used inside preloaded section, the allocation mask of preloading can be assumed. If idr_alloc() can be called from a context which allows sufficiently relaxed @gfp_mask, it can be used by itself. If, for example, idr_alloc() is called inside spinlock protected region, preloading can be used like the following. idr_preload(GFP_KERNEL); spin_lock(lock); id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT); spin_unlock(lock); idr_preload_end(); if (id < 0) error; which is much simpler and less error-prone than idr_pre_get and idr_get_new*() loop. The new interface uses per-pcu idr_layer buffer and thus the number of idr's in the system doesn't affect the amount of memory used for preloading. idr_layer_alloc() is introduced to handle idr_layer allocations for both old and new ID allocation paths. This is a bit hairy now but the new interface is expected to replace the old and the internal implementation eventually will become simpler. Signed-off-by: Tejun Heo <[email protected]> Cc: Rusty Russell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: refactor idr_get_new_above()Tejun Heo1-18/+12
Move slot filling to idr_fill_slot() from idr_get_new_above_int() and make idr_get_new_above() directly call it. idr_get_new_above_int() is no longer needed and removed. This will be used to implement a new ID allocation interface. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: remove _idr_rc_to_errno() hackTejun Heo1-12/+23
idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate exception conditions internally. The return value is later translated to errno values using _idr_rc_to_errno(). This is confusing. Drop the custom ones and consistently use -EAGAIN for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for "ran out of ID space". Due to the weird memory preloading mechanism, [ra]_get_new*() return -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/ -EAGAIN on those interface functions. They'll eventually be cleaned up and the translations will go away. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: relocate idr_for_each_entry() and reorganize id[r|a]_get_new()Tejun Heo1-49/+0
* Move idr_for_each_entry() definition next to other idr related definitions. * Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above(). This changes the implementation of idr_get_new() but the new implementation is trivial. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: deprecate idr_remove_all()Tejun Heo1-7/+3
There was only one legitimate use of idr_remove_all() and a lot more of incorrect uses (or lack of it). Now that idr_destroy() implies idr_remove_all() and all the in-kernel users updated not to use it, there's no reason to keep it around. Mark it deprecated so that we can later unexport it. idr_remove_all() is made an inline function calling __idr_remove_all() to avoid triggering deprecated warning on EXPORT_SYMBOL(). Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: make idr_destroy() imply idr_remove_all()Tejun Heo1-9/+11
idr is silly in quite a few ways, one of which is how it's supposed to be destroyed - idr_destroy() doesn't release IDs and doesn't even whine if the idr isn't empty. If the caller forgets idr_remove_all(), it simply leaks memory. Even ida gets this wrong and leaks memory on destruction. There is absoltely no reason not to call idr_remove_all() from idr_destroy(). Nobody is abusing idr_destroy() for shrinking free layer buffer and continues to use idr after idr_destroy(), so it's safe to do remove_all from destroy. In the whole kernel, there is only one place where idr_remove_all() is legitimiately used without following idr_destroy() while there are quite a few places where the caller forgets either idr_remove_all() or idr_destroy() leaking memory. This patch makes idr_destroy() call idr_destroy_all() and updates the function description accordingly. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-02-27idr: fix a subtle bug in idr_get_next()Tejun Heo1-1/+8
The iteration logic of idr_get_next() is borrowed mostly verbatim from idr_for_each(). It walks down the tree looking for the slot matching the current ID. If the matching slot is not found, the ID is incremented by the distance of single slot at the given level and repeats. The implementation assumes that during the whole iteration id is aligned to the layer boundaries of the level closest to the leaf, which is true for all iterations starting from zero or an existing element and thus is fine for idr_for_each(). However, idr_get_next() may be given any point and if the starting id hits in the middle of a non-existent layer, increment to the next layer will end up skipping the same offset into it. For example, an IDR with IDs filled between [64, 127] would look like the following. [ 0 64 ... ] /----/ | | | NULL [ 64 ... 127 ] If idr_get_next() is called with 63 as the starting point, it will try to follow down the pointer from 0. As it is NULL, it will then try to proceed to the next slot in the same level by adding the slot distance at that level which is 64 - making the next try 127. It goes around the loop and finds and returns 127 skipping [64, 126]. Note that this bug also triggers in idr_for_each_entry() loop which deletes during iteration as deletions can make layers go away leaving the iteration with unaligned ID into missing layers. Fix it by ensuring proceeding to the next slot doesn't carry over the unaligned offset - ie. use round_up(id + 1, slot_distance) instead of id += slot_distance. Signed-off-by: Tejun Heo <[email protected]> Reported-by: David Teigland <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06idr: rename MAX_LEVEL to MAX_IDR_LEVELFengguang Wu1-16/+16
To avoid name conflicts: drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined While at it, also make the other names more consistent and add parentheses. [[email protected]: repair fallout] [[email protected]: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change] Signed-off-by: Fengguang Wu <[email protected]> Cc: Bernd Petrovitsch <[email protected]> Cc: walter harms <[email protected]> Cc: Glauber Costa <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Cc: Roland Dreier <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-24Merge tag 'module-for-3.4' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull cleanup of fs/ and lib/ users of module.h from Paul Gortmaker: "Fix up files in fs/ and lib/ dirs to only use module.h if they really need it. These are trivial in scope vs the work done previously. We now have things where any few remaining cleanups can be farmed out to arch or subsystem maintainers, and I have done so when possible. What is remaining here represents the bits that don't clearly lie within a single arch/subsystem boundary, like the fs dir and the lib dir. Some duplicate includes arising from overlapping fixes from independent subsystem maintainer submissions are also quashed." Fix up trivial conflicts due to clashes with other include file cleanups (including some due to the previous bug.h cleanup pull). * tag 'module-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: lib: reduce the use of module.h wherever possible fs: reduce the use of module.h wherever possible includecheck: delete any duplicate instances of module.h
2012-03-21idr: make idr_get_next() good for rcu_read_lock()Hugh Dickins1-3/+5
Make one small adjustment to idr_get_next(): take the height from the top layer (stable under RCU) instead of from the root (unprotected by RCU), as idr_find() does: so that it can be used with RCU locking. Copied comment on RCU locking from idr_find(). Signed-off-by: Hugh Dickins <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Li Zefan <[email protected]> Cc: Eric Dumazet <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-07lib: reduce the use of module.h wherever possiblePaul Gortmaker1-1/+1
For files only using THIS_MODULE and/or EXPORT_SYMBOL, map them onto including export.h -- or if the file isn't even using those, then just delete the include. Fix up any implicit include dependencies that were being masked by module.h along the way. Signed-off-by: Paul Gortmaker <[email protected]>
2011-11-02ida: make ida_simple_get/put() IRQ safeTejun Heo1-4/+7
It's often convenient to be able to release resource from IRQ context. Make ida_simple_*() use irqsave/restore spin ops so that they are IRQ safe. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Rusty Russell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-10-31lib/idr.c: fix comment for ida_get_new_above()Wang Sheng-Hui1-2/+2
Signed-off-by: Wang Sheng-Hui <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-09-15Merge branch 'master' into for-nextJiri Kosina1-0/+67
Fast-forward merge with Linus to be able to merge patches based on more recent version of the tree.
2011-08-04Fix kernel-doc comment typo '@id'Paul Bolle1-1/+1
Signed-off-by: Paul Bolle <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2011-08-03ida: simplified functions for id allocationRusty Russell1-0/+67
The current hyper-optimized functions are overkill if you simply want to allocate an id for a device. Create versions which use an internal lock. In followup patches, numerous drivers are converted to use this interface. Thanks to Tejun for feedback. Signed-off-by: Rusty Russell <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Jonathan Cameron <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-10-26docbook: add idr/ida to kernel-api docbookRandy Dunlap1-24/+25
Add idr/ida to kernel-api docbook. Fix typos and kernel-doc notation. Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: Naohiro Aota <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-10-26idr: fix idr_pre_get() locking descriptionNaohiro Aota1-11/+13
Despite the idr_pre_get() kernel-doc, there are some cases where you can call idr_pre_get() from within locked regions. Add a description for such cases. See also: http://lkml.org/lkml/2010/9/16/462 [[email protected]: cleanups, grammatical fixes] Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-08-31idr: describe how nextidp works in idr_get_next().Naohiro Aota1-1/+2
It was unclear in original kernel-doc how nextidp worked in idr_get_next(). Let's describe it. Signed-off-by: Naohiro Aota <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2010-08-31idr: fix kernel-doc warnings.Naohiro Aota1-5/+5
Fix the following kernel-doc warnings. % perl scripts/kernel-doc lib/idr.c > /dev/null Warning(lib/idr.c:300): No description found for parameter 'starting_id' Warning(lib/idr.c:300): Excess function parameter 'start_id' description in 'idr_get_new_above' Warning(lib/idr.c:485): No description found for parameter 'idp' Warning(lib/idr.c:596): No description found for parameter 'nextidp' Warning(lib/idr.c:596): Excess function parameter 'id' description in 'idr_get_next' Warning(lib/idr.c:774): No description found for parameter 'starting_id' Warning(lib/idr.c:774): Excess function parameter 'staring_id' description in 'ida_get_new_above' Warning(lib/idr.c:918): No description found for parameter 'ida' Signed-off-by: Naohiro Aota <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2010-06-23idr: fix RCU lockdep splat in idr_get_next()Paul E. McKenney1-2/+2
Convert to rcu_dereference_raw() given that many callers may have many different locking models. Located-by: Miles Lane <[email protected]> Tested-by: Miles Lane <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2010-05-27idr: fix backtrack logic in idr_remove_allImre Deak1-1/+4
Currently idr_remove_all will fail with a use after free error if idr::layers is bigger than 2, which on 32 bit systems corresponds to items more than 1024. This is due to stepping back too many levels during backtracking. For simplicity let's assume that IDR_BITS=1 -> we have 2 nodes at each level below the root node and each leaf node stores two IDs. (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root level and 32 IDs in each leaf node). The sequence of freeing the nodes at the moment is as follows: layer 1 -> a(7) 2 -> b(3) c(5) 3 -> d(1) e(2) f(4) g(6) Until step 4 things go fine, but then node c is freed, whereas node g should be freed first. Since node c contains the pointer to node g we'll have a use after free error at step 6. How many levels we step back after visiting the leaf nodes is currently determined by the msb of the id we are currently visiting: Step 1. node d with IDs 0,1 is freed, current ID is advanced to 2. msb of the current ID bit 1. This means we need to step back 1 level to node b and take the next sibling, node e. 2-3. node e with IDs 2,3 is freed, current ID is 4, msb is bit 2. This means we need to step back 2 levels to node a, freeing node b on the way. 4-5. node f with IDs 4,5 is freed, current ID is 6, msb is still bit 2. This means we again need to step back 2 levels to node a and free c on the way. 6. We should visit node g, but its pointer is not available as node c was freed. The fix changes how we determine the number of levels to step back. Instead of deducting this merely from the msb of the current ID, we should really check if advancing the ID causes an overflow to a bit position corresponding to a given layer. In the above example overflow from bit 0 to bit 1 should mean stepping back 1 level. Overflow from bit 1 to bit 2 should mean stepping back 2 levels and so on. The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers on 32 bit systems. Signed-off-by: Imre Deak <[email protected]> Reviewed-by: Tejun Heo <[email protected]> Cc: Eric Paris <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: <[email protected]> [2.6.34.1] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-03-26Merge branch 'master' of ↵David Woodhouse1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/mtd/nand/sh_flctl.c Maxim's patch to initialise sysfs attributes depends on the patch which actually adds sysfs_attr_init().
2010-02-26Merge branch 'master' of ↵David Woodhouse1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 The SmartMedia FTL code depends on new kfifo bits from 2.6.33
2010-02-25idr: export idr_get_next()Ben Hutchings1-1/+1
idr_get_next() was accidentally not exported when added. It is about to be used by mtdcore, which may be built as a module. Signed-off-by: Ben Hutchings <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Artem Bityutskiy <[email protected]> Signed-off-by: David Woodhouse <[email protected]>
2010-02-25idr: Apply lockdep-based diagnostics to rcu_dereference() usesPaul E. McKenney1-4/+4
Because idr can be used with any of a number of locks or with any flavor of RCU, just disable the lockdep-based diagnostics. If idr needs diagnostics, the check expression will need to be passed into the relevant idr primitives as an additional argument. Signed-off-by: Paul E. McKenney <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-02-22idr: fix a critical misallocation bug, take#2Tejun Heo1-1/+3
This is retry of reverted 859ddf09743a8cc680af33f7259ccd0fd36bfe9d ("idr: fix a critical misallocation bug") which contained two bugs. * pa[idp->layers] should be cleared even if it's not used by sub_alloc() because it's used by mark idr_mark_full(). * The original condition check also assigned pa[l] to p which the new code didn't do thus leaving p pointing at the wrong layer. Both problems have been fixed and the idr code has received good amount testing using userland testing setup where simple bitmap allocator is run parallel to verify the result of idr allocation. The bug this patch fixes is caused by sub_alloc() optimization path bypassing out-of-room condition check and restarting allocation loop with starting value higher than maximum allowed value. For detailed description, please read commit message of 859ddf09. Signed-off-by: Tejun Heo <[email protected]> Based-on-patch-from: Eric Paris <[email protected]> Reported-by: Eric Paris <[email protected]> Tested-by: Stefan Lippers-Hollmann <[email protected]> Tested-by: Serge Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-02-04idr: revert misallocation bug fixTejun Heo1-3/+4
Commit 859ddf09743a8cc680af33f7259ccd0fd36bfe9d tried to fix misallocation bug but broke full bit marking by not clearing pa[idp->layers] and also is causing X failures due to lookup failure in drm code. The cause of the latter hasn't been found yet. Revert the fix for now. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-02-02idr: fix a critical misallocation bugTejun Heo1-4/+3
Eric Paris located a bug in idr. With IDR_BITS of 6, it grows to three layers when id 4096 is first allocated. When that happens, idr wraps incorrectly and searches the idr array ignoring the high bits. The following test code from Eric demonstrates the bug nicely. #include <linux/idr.h> #include <linux/kernel.h> #include <linux/module.h> static DEFINE_IDR(test_idr); int init_module(void) { int ret, forty95, forty96; void *addr; /* add 2 entries both with 4095 as the start address */ again1: if (!idr_pre_get(&test_idr, GFP_KERNEL)) return -ENOMEM; ret = idr_get_new_above(&test_idr, (void *)4095, 4095, &forty95); if (ret) { if (ret == -EAGAIN) goto again1; return ret; } if (forty95 != 4095) printk(KERN_ERR "hmmm, forty95=%d\n", forty95); again2: if (!idr_pre_get(&test_idr, GFP_KERNEL)) return -ENOMEM; ret = idr_get_new_above(&test_idr, (void *)4096, 4095, &forty96); if (ret) { if (ret == -EAGAIN) goto again2; return ret; } if (forty96 != 4096) printk(KERN_ERR "hmmm, forty96=%d\n", forty96); /* try to find the 2 entries, noticing that 4096 broke */ addr = idr_find(&test_idr, forty95); if ((int)addr != forty95) printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr); addr = idr_find(&test_idr, forty96); if ((int)addr != forty96) printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr); /* really weird, the entry which should be at 4096 is actually at 0!! */ addr = idr_find(&test_idr, 0); if ((int)addr) printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr); idr_remove(&test_idr, forty95); idr_remove(&test_idr, forty96); return 0; } void cleanup_module(void) { } MODULE_AUTHOR("Eric Paris <[email protected]>"); MODULE_DESCRIPTION("Simple idr test"); MODULE_LICENSE("GPL"); This happens because when sub_alloc() back tracks it doesn't always do it step-by-step while the over-the-limit detection assumes step-by-step backtracking. The logic in sub_alloc() looks like the following. restart: clear pa[top level + 1] for end cond detection l = top level while (true) { search for empty slot at this level if (not found) { push id to the next possible value l++ A: if (pa[l] is clear) failed, return asking caller to grow the tree if (going up 1 level gives more slots to search) continue the while loop above with the incremented l else C: goto restart } adjust id accordingly to the found slot if (l == 0) return found id; create lower level if not there yet record pa[l] and l-- } Test A is the fail exit condition but this assumes that failure is propagated upwared one level at a time but the B optimization path breaks the assumption and restarts the whole thing with a start value which is above the possible limit with the current layers. sub_alloc() assumes the start id value is inside the limit when called and test A is the only exit condition check, so it ends up searching for empty slot while ignoring high set bit. So, for 4095->4096 test, level0 search fails but pa[1] contains a valid pointer. However, going up 1 level wouldn't give any more empty slot so it takes C and when the whole thing restarts nobody notices the high bit set beyond the top level. This patch fixes the bug by changing the fail exit condition check to full id limit check. Based-on-patch-from: Eric Paris <[email protected]> Reported-by: Eric Paris <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-12-04tree-wide: fix some typos and punctuation in commentsThadeu Lima de Souza Cascardo1-2/+2
fix some typos and punctuation in comments Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2009-04-02cgroup: CSS ID supportKAMEZAWA Hiroyuki1-0/+46
Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code. This patch attaches unique ID to each css and provides following. - css_lookup(subsys, id) returns pointer to struct cgroup_subysys_state of id. - css_get_next(subsys, id, rootid, depth, foundid) returns the next css under "root" by scanning When cgroup_subsys->use_id is set, an id for css is maintained. The cgroup framework only parepares - css_id of root css for subsys - id is automatically attached at creation of css. - id is *not* freed automatically. Because the cgroup framework don't know lifetime of cgroup_subsys_state. free_css_id() function is provided. This must be called by subsys. There are several reasons to develop this. - Saving space .... For example, memcg's swap_cgroup is array of pointers to cgroup. But it is not necessary to be very fast. By replacing pointers(8bytes per ent) to ID (2byes per ent), we can reduce much amount of memory usage. - Scanning without lock. CSS_ID provides "scan id under this ROOT" function. By this, scanning css under root can be written without locks. ex) do { rcu_read_lock(); next = cgroup_get_next(subsys, id, root, &found); /* check sanity of next here */ css_tryget(); rcu_read_unlock(); id = found + 1 } while(...) Characteristics: - Each css has unique ID under subsys. - Lifetime of ID is controlled by subsys. - css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy - Allowed ID is 1-65535, ID 0 is UNUSED ID. Design Choices: - scan-by-ID v.s. scan-by-tree-walk. As /proc's pid scan does, scan-by-ID is robust when scanning is done by following kind of routine. scan -> rest a while(release a lock) -> conitunue from interrupted memcg's hierarchical reclaim does this. - When subsys->use_id is set, # of css in the system is limited to 65535. [[email protected]: remove rcu_read_lock() from css_get_next()] Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Paul Menage <[email protected]> Cc: Li Zefan <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Daisuke Nishimura <[email protected]> Signed-off-by: Bharata B Rao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-03-10idr: make idr_remove_all() do removal -before- free_layer()Paul E. McKenney1-1/+1
Fix a problem in the IDR system, where an idr_remove_all() hands a data element to call_rcu() (via free_layer()) before making that data element inaccessible to new readers. This is very bad, and results in readers still having a reference to this data element at the end of the grace period. Tests on large machines that concurrently map and unmap user-space memory within the same multithreaded process result in crashes within about five minutes. Applying this patch increases the kernel's longevity to the three-to-eight-hour range. There appear to be other similar problems in idr_get_empty_slot() and sub_remove(), but I fixed the easy one in idr_remove_all() first. It is therefore no surprise that failures still occur. Located-by: Milton Miller II <[email protected]> Tested-by: Milton Miller II <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-01-15lib/idr.c: use kmem_cache_zalloc() for the idr_layer cacheAndrew Morton1-8/+2
David points out that the idr_remove_all() function returns unused slabs to the kmem cache, but needs to zero them first or else they will be uninitialized upon next use. This causes crashes which have been observed in the firewire subsystem. He fixed this by zeroing the object before freeing it in idr_remove_all(). But we agree that simply removing the constructor and zeroing the object at allocation time is simpler than relying upon slab constructor machinery and might even be faster. This problem was introduced by "idr: make idr_remove rcu-safe" (commit cf481c20c476ad2c0febdace9ce23f5a4db19582), which was first released in 2.6.27. There are no known codesites which trigger this bug in 2.6.27 or 2.6.28. The post-2.6.28 firewire changes are the only known triggerer. There might of course be not-yet-discovered triggerers in 2.6.27 and 2.6.28, and there might be out-of-tree triggerers which are added to those kernel versions. I'll let the -stable guys decide whether they want to backport this fix. Reported-by: David Moore <[email protected]> Cc: Stefan Richter <[email protected]> Cc: Nadia Derbey <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Kristian Hgsberg <[email protected]> Acked-by: Pekka Enberg <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-01-15idr: fix wrong kernel-docLi Zefan1-2/+2
idr_get_new_above() and ida_get_new_above() return an id in the range of @staring_id ... 0x7fffffff, not 0 ... 0x7fffffff. Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-12-10lib/idr.c: Fix bug introduced by RCU fixManfred Spraul1-1/+7
The last patch to lib/idr.c caused a bug if idr_get_new_above() was called on an empty idr. Usually, nodes stay on the same layer. New layers are added to the top of the tree. The exception is idr_get_new_above() on an empty tree: In this case, the new root node is first added on layer 0, then moved upwards. p->layer was not updated. As usual: You shall never rely on the source code comments, they will only mislead you. Signed-off-by: Manfred Spraul <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-12-01lib/idr.c: fix rcu related race with idr_findManfred Spraul1-2/+12
2nd part of the fixes needed for http://bugzilla.kernel.org/show_bug.cgi?id=11796. When the idr tree is either grown or shrunk, then the update to the number of layers and the top pointer were not atomic. This race caused crashes. The attached patch fixes that by replicating the layers counter in each layer, thus idr_find doesn't need idp->layers anymore. Signed-off-by: Manfred Spraul <[email protected]> Cc: Clement Calmels <[email protected]> Cc: Nadia Derbey <[email protected]> Cc: Pierre Peiffer <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-26SL*B: drop kmem cache argument from constructorAlexey Dobriyan1-1/+1
Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <[email protected]> Acked-by: Pekka Enberg <[email protected]> Acked-by: Christoph Lameter <[email protected]> Cc: Jon Tollefson <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Matt Mackall <[email protected]> [[email protected]: fix arch/powerpc/mm/hugetlbpage.c] [[email protected]: fix mm/slab.c] [[email protected]: fix ubifs] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-25idr: make idr_remove rcu-safeNadia Derbey1-14/+43
Introduce the free_layer() routine: it is the one that actually frees memory after a grace period has elapsed. Signed-off-by: Nadia Derbey <[email protected]> Reviewed-by: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Cc: Pierre Peiffer <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-25idr: make idr_find rcu-safeNadia Derbey1-5/+6
Make idr_find rcu-safe: it can now be called inside an rcu_read critical section. Signed-off-by: Nadia Derbey <[email protected]> Reviewed-by: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Cc: Pierre Peiffer <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-25idr: make idr_get_new* rcu-safeNadia Derbey1-5/+9
Make the idr_get_new* routines rcu-safe. Signed-off-by: Nadia Derbey <[email protected]> Reviewed-by: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Cc: Pierre Peiffer <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-25idr: error checking factorizationNadia Derbey1-21/+9
Do some code factorization in the return code analysis. Signed-off-by: Nadia Derbey <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Cc: Pierre Peiffer <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-25idr: fix a printk callNadia Derbey1-1/+2
Fix the incomplete printk call. Signed-off-by: Nadia Derbey <[email protected]> Reviewed-by: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Cc: Pierre Peiffer <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-25idr: rename some of the idr APIs internal routinesNadia Derbey1-15/+16
This is a trivial patch that renames: . alloc_layer to get_from_free_list since it idr_pre_get that actually allocates memory. . free_layer to move_to_free_list since memory is not actually freed there. This makes things more clear for the next patches. Signed-off-by: Nadia Derbey <[email protected]> Reviewed-by: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Cc: Pierre Peiffer <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-05-01idr: fix idr_remove()Nadia Derbey1-1/+1
The return inside the loop makes us free only a single layer. Signed-off-by: Nadia Derbey <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Jim Houston <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29idr: create idr_layer_cache at boot timeAkinobu Mita1-6/+4
Avoid a possible kmem_cache_create() failure by creating idr_layer_cache unconditionary at boot time rather than creating it on-demand when idr_init() is called the first time. This change also enables us to eliminate the check every time idr_init() is called. [[email protected]: rename init_id_cache() to idr_init_cache()] [[email protected]: fix alpha build] Signed-off-by: Akinobu Mita <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-17Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter1-2/+1
Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [[email protected]: coupla fixes] Signed-off-by: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-14more low-hanging fruits - kernel, fs, lib signednessAl Viro1-1/+1
Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>