Age | Commit message | Author | Files | Lines
2013-09-11 | ipc: drop ipcctl_pre_down | Davidlohr Bueso | 2 | -23/+4
Now that sem, msgque and shm, through *_down(), all use the lockless variant of ipcctl_pre_down(), go ahead and delete it. [[email protected]: fix function name in kerneldoc, cleanups] Signed-off-by: Davidlohr Bueso <[email protected]> Tested-by: Sedat Dilek <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Manfred Spraul <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | ipc,shm: shorten critical region in shmctl_down | Davidlohr Bueso | 1 | -4/+6
Instead of holding the ipc lock for the entire function, use the ipcctl_pre_down_nolock and only acquire the lock for specific commands: RMID and SET. Signed-off-by: Davidlohr Bueso <[email protected]> Tested-by: Sedat Dilek <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Manfred Spraul <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | ipc,shm: introduce lockless functions to obtain the ipc object | Davidlohr Bueso | 1 | -0/+20
This is the third and final patchset that deals with reducing the amount of contention we impose on the ipc lock (kern_ipc_perm.lock). These changes mostly deal with shared memory, previous work has already been done for semaphores and message queues: http://lkml.org/lkml/2013/3/20/546 (sems) http://lkml.org/lkml/2013/5/15/584 (mqueues) With these patches applied, a custom shm microbenchmark stressing shmctl doing IPC_STAT with 4 threads a million times, reduces the execution time by 50%. A similar run, this time with IPC_SET, reduces the execution time from 3 mins and 35 secs to 27 seconds. Patches 1-8: replaces blindly taking the ipc lock for a smarter combination of rcu and ipc_obtain_object, only acquiring the spinlock when updating. Patch 9: renames the ids rw_mutex to rwsem, which is what it already was. Patch 10: is a trivial mqueue leftover cleanup Patch 11: adds a brief lock scheme description, requested by Andrew. This patch: Add shm_obtain_object() and shm_obtain_object_check(), which will allow us to get the ipc object without acquiring the lock. Just as with other forms of ipc, these functions are basically wrappers around ipc_obtain_object*(). Signed-off-by: Davidlohr Bueso <[email protected]> Tested-by: Sedat Dilek <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Manfred Spraul <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | initmpfs: use initramfs if rootfstype= or root= specified | Rob Landley | 2 | -4/+15
Command line option rootfstype=ramfs to obtain old initramfs behavior, and use ramfs instead of tmpfs for stub when root= defined (for cosmetic reasons). [[email protected]: coding-style fixes] Signed-off-by: Rob Landley <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Stephen Warren <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Jim Cromie <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | initmpfs: make rootfs use tmpfs when CONFIG_TMPFS enabled | Rob Landley | 2 | -2/+12
Conditionally call the appropriate fs_init function and fill_super functions. Add a use once guard to shmem_init() to simply succeed on a second call. (Note that IS_ENABLED() is a compile time constant so dead code elimination removes unused function calls when CONFIG_TMPFS is disabled.) Signed-off-by: Rob Landley <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Stephen Warren <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Jim Cromie <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | initmpfs: move rootfs code from fs/ramfs/ to init/ | Rob Landley | 5 | -33/+36
When the rootfs code was a wrapper around ramfs, having them in the same file made sense. Now that it can wrap another filesystem type, move it in with the init code instead. This also allows a subsequent patch to access rootfstype= command line arg. Signed-off-by: Rob Landley <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Stephen Warren <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Jim Cromie <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | initmpfs: move bdi setup from init_rootfs to init_ramfs | Rob Landley | 1 | -6/+19
Even though ramfs hasn't got a backing device, commit e0bf68ddec4f ("mm: bdi init hooks") added one anyway, and put the initialization in init_rootfs() since that's the first user, leaving it out of init_ramfs() to avoid duplication. But initmpfs uses init_tmpfs() instead, so move the init into the filesystem's init function, add a "once" guard to prevent duplicate initialization, and call the filesystem init from rootfs init. This goes part of the way to allowing ramfs to be built as a module. [[email protected]; using bit 1 was odd] Signed-off-by: Rob Landley <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Stephen Warren <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Jim Cromie <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
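The "once" guard described above can be illustrated with a small userspace sketch. This is not the kernel code: `fs_init_once()`, `do_setup()` and `setup_calls` are hypothetical names, and the sketch assumes single-threaded callers (the kernel would need an atomic test-and-set instead of a plain flag).

```c
#include <stdbool.h>

/* Hypothetical counter, only here so the effect of the guard is observable. */
static int setup_calls;

/* Hypothetical stand-in for the one-time initialization work. */
static int do_setup(void)
{
    setup_calls++;
    return 0;
}

/*
 * Returns 0 on success. A second call simply succeeds without
 * repeating the setup. Not thread-safe: assumes a single caller at a time.
 */
int fs_init_once(void)
{
    static bool done;
    int err;

    if (done)
        return 0;

    err = do_setup();
    if (err)
        return err;

    done = true;
    return 0;
}
```

With a guard like this, both the filesystem's own init path and the rootfs init path can call the same function without caring which one runs first.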
2013-09-11 | initmpfs: replace MS_NOUSER in initramfs | Rob Landley | 1 | -1/+6
Mounting MS_NOUSER prevents --bind mounts from rootfs. Prevent new rootfs mounts with a different mechanism that doesn't affect bind mounts. Signed-off-by: Rob Landley <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Stephen Warren <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Jim Cromie <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt | Jan Kara | 7 | -8/+46
With users of radix_tree_preload() run from interrupt (block/blk-ioc.c is one such possible user), the following race can happen:

    radix_tree_preload()
    ...
    radix_tree_insert()
      radix_tree_node_alloc()
        if (rtp->nr) {
          ret = rtp->nodes[rtp->nr - 1];
    <interrupt>
    ...
    radix_tree_preload()
    ...
    radix_tree_insert()
      radix_tree_node_alloc()
        if (rtp->nr) {
          ret = rtp->nodes[rtp->nr - 1];

And we give out one radix tree node twice. That clearly results in radix tree corruption with different results (usually OOPS) depending on which two users of radix tree race.

We fix the problem by making radix_tree_node_alloc() always allocate fresh radix tree nodes when in interrupt. Using preloading when in interrupt doesn't make sense since all the allocations have to be atomic anyway and we cannot steal nodes from process-context users because some users rely on radix_tree_insert() succeeding after radix_tree_preload(). in_interrupt() check is somewhat ugly but we cannot simply key off passed gfp_mask as that is acquired from root_gfp_mask() and thus the same for all preload users.

Another part of the fix is to avoid node preallocation in radix_tree_preload() when passed gfp_mask doesn't allow waiting. Again, preallocation in such case doesn't make sense and when preallocation would happen in interrupt we could possibly leak some allocated nodes. However, some users of radix_tree_preload() require following radix_tree_insert() to succeed. To avoid unexpected effects for these users, radix_tree_preload() only warns if passed gfp mask doesn't allow waiting and we provide a new function radix_tree_maybe_preload() for those users which get different gfp mask from different call sites and which are prepared to handle radix_tree_insert() failure.

Signed-off-by: Jan Kara <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | drivers/w1/masters/mxc_w1.c: remove unnecessary platform_set_drvdata() | Jingoo Han | 1 | -2/+0
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <[email protected]> Cc: Evgeniy Polyakov <[email protected]> Cc: Greg KH <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | drivers/w1/w1.c: replace strict_strtol() with kstrtol() | Jingoo Han | 1 | -4/+8
The usage of strict_strtol() is not preferred, because strict_strtol() is obsolete. Thus, kstrtol() should be used. Signed-off-by: Jingoo Han <[email protected]> Cc: Evgeniy Polyakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
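kstrtol() is only available inside the kernel, so the following is a userspace analogue that sketches the same strict-parsing contract: return 0 on success and a negative errno value otherwise, rejecting trailing garbage. `parse_long_strict()` is a hypothetical name, and the sketch is not identical to the kernel helper; for example, strtol() skips leading whitespace.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of a strict string-to-long parser.
 * Returns 0 and stores the value in *res on success,
 * -ERANGE on overflow, -EINVAL on malformed input.
 */
int parse_long_strict(const char *s, int base, long *res)
{
    char *end;
    long val;

    if (*s == '\0')
        return -EINVAL;

    errno = 0;
    val = strtol(s, &end, base);
    if (errno == ERANGE)
        return -ERANGE;
    if (end == s)               /* no digits were consumed */
        return -EINVAL;

    if (*end == '\n')           /* tolerate a single trailing newline */
        end++;
    if (*end != '\0')           /* anything else after the number is an error */
        return -EINVAL;

    *res = val;
    return 0;
}
```

The point of the stricter interface is that the caller checks one return code instead of inspecting an end pointer by hand.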
2013-09-11 | memstick: add support for legacy memorysticks | Maxim Levitsky | 5 | -1/+2693
Based partially on MS standard spec quotes from Alex Dubov. As with any code that works with user data, this driver isn't recommended for writing to cards that contain valuable data. It tries its best though to avoid data corruption and possible damage to the card. Tested on MS DUO 64 MB card on Ricoh R592 card reader. Signed-off-by: Maxim Levitsky <[email protected]> Cc: Valdis Kletnieks <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Alex Dubov <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | drivers/memstick/host/rtsx_pci_ms.c: remove unnecessary platform_set_drvdata() | Jingoo Han | 1 | -2/+0
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <[email protected]> Cc: Maxim Levitsky <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | drivers/pps/clients/pps-gpio.c: remove unnecessary platform_set_drvdata() | Jingoo Han | 1 | -1/+0
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <[email protected]> Cc: Rodolfo Giometti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: fix defective misuses of pkt_<level> | Joe Perches | 1 | -3/+3
Fix thinkos where pkt_<level> needs a valid pktcdvd_device * and the pointer is known to be NULL. Signed-off-by: Joe Perches <[email protected]> Reported-by: Dan Carpenter <[email protected]> (go smatch!) Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: add struct pktcdvd_device * to pkt_dump_sense() | Joe Perches | 1 | -15/+16
Allow the device name to be emitted with pkt_err when logging the sense data. Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: convert pr_info to pkt_info | Joe Perches | 1 | -6/+8
Add a new pkt_info macro to prefix the name to the logging output. Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: convert pr_notice to pkt_notice | Joe Perches | 1 | -10/+12
Add a new pkt_notice macro to prefix the name to the logging output. Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: add struct pktcdvd_device.name to pr_err logging where possible | Joe Perches | 1 | -20/+24
Add a new pkt_err macro to prefix the name to the logging output. Convert pr_err where there is a non-null struct pktcdvd_device. Includes improvements from Andy Shevchenko. Signed-off-by: Joe Perches <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: add struct pktcdvd_device * to pkt_dbg | Joe Perches | 1 | -48/+42
Add pd->name to output for these debugging messages. Remove normally compiled out pkt_dbg(2, ...) function entry tracing equivalents as it's better done via the function tracer. Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: consolidate DPRINTK and VPRINTK macros | Joe Perches | 1 | -54/+53
Use the more common pkt_dbg(level, fmt, ...) form. These messages are emitted at KERN_NOTICE. Always emit function name with pkt_dbg(2, ...) uses and remove the sometimes abbreviated embedded function name. This form always verifies the format and arguments. Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: convert printk to pr_<level> | Joe Perches | 1 | -61/+61
Use a more current logging style and add messages levels to the logging messages. Simplify pkt_dump_sense by using %*ph and adding a simple function to emit the sense string. Includes improvements from Andy Shevchenko and Dan Carpenter. Signed-off-by: Joe Perches <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | pktcdvd: convert ZONE macro to static function get_zone() | Joe Perches | 1 | -10/+10
Macros should be converted to functions where feasible to verify arguments and the like. Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
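As a rough illustration of why a function is preferable, the sketch below places a function-like macro next to an equivalent static function. The struct, its fields and both names are hypothetical and are not taken from the pktcdvd driver; the zone size is assumed to be a power of two.

```c
#include <stdint.h>

/* Hypothetical device settings, not the driver's real structure. */
struct dev_cfg {
    uint64_t offset;     /* sector offset applied before rounding */
    uint32_t zone_size;  /* zone size in sectors, assumed power of two */
};

/* Macro form: accepts anything that happens to compile, types unchecked. */
#define ZONE_OF(sector, cfg) \
    (((sector) + (cfg)->offset) & ~((uint64_t)(cfg)->zone_size - 1))

/* Function form: the compiler verifies both argument types. */
static uint64_t zone_of(uint64_t sector, const struct dev_cfg *cfg)
{
    return (sector + cfg->offset) & ~((uint64_t)cfg->zone_size - 1);
}
```

Both forms compute the same value; the function additionally rejects a call that passes the wrong pointer type at compile time, and it evaluates each argument exactly once.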
2013-09-11 | panic: call panic handlers before kmsg_dump | Kees Cook | 1 | -2/+6
Since the panic handlers may produce additional information (via printk) for the kernel log, it should be reported as part of the panic output saved by kmsg_dump(). Without this re-ordering, nothing that adds information to a panic will show up in pstore's view when kmsg_dump runs, and is therefore not visible to crash reporting tools that examine pstore output. Signed-off-by: Kees Cook <[email protected]> Cc: Anton Vorontsov <[email protected]> Cc: Colin Cross <[email protected]> Acked-by: Tony Luck <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Vikram Mulukutla <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rusty Russell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | affs: use loff_t in affs_truncate() | Dan Carpenter | 1 | -1/+1
It seems pretty unlikely that AFFS supports files over 4GB, but we may as well use loff_t for cleanliness' sake instead of truncating it to 32 bits. Signed-off-by: Dan Carpenter <[email protected]> Cc: Marco Stornelli <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: remove do-nothing NAME="%k" term from example udev rules | Ed Cashin | 1 | -1/+1
When the example udev rules in the documentation are used without modification, warnings like the one shown below appear in the system logs: /var/log/messages:Aug 22 11:09:11 kung udevd[445]: NAME="%k" \ is superfluous and breaks kernel supplied names, please remove \ it from /etc/udev/rules.d/60-aoe.rules:26 Removing the term does not cause any problems with the creation of the special character and block device nodes. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: do not BUG if memory pressure prevented debugfs file creation | Ed Cashin | 1 | -1/+0
If the system has trouble allocating memory for the creation of the aoe debugfs directory or of a file inside it, the debugfs member of an aoedev can be NULL. Do not treat a NULL debugfs pointer as a BUG on aoedev shutdown, avoiding the user impact of an unnecessary panic. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: suppress compiler warnings | Andy Shevchenko | 1 | -4/+0
This patch fixes the following compiler warnings:

    drivers/block/aoe/aoecmd.c: In function `aoecmd_ata_rw':
    drivers/block/aoe/aoecmd.c:383:17: warning: variable `t' set but not used [-Wunused-but-set-variable]
      struct aoetgt *t;
                     ^
    drivers/block/aoe/aoecmd.c: In function `resend':
    drivers/block/aoe/aoecmd.c:488:21: warning: variable `ah' set but not used [-Wunused-but-set-variable]
      struct aoe_atahdr *ah;
                         ^

Signed-off-by: Andy Shevchenko <[email protected]> Cc: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: remove custom implementation of kbasename() | Andy Shevchenko | 1 | -6/+3
In the kernel we have a nice helper that may be used here. This patch substitutes the custom implementation by the native function call. Signed-off-by: Andy Shevchenko <[email protected]> Cc: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: update internal version number to 85 | Ed Cashin | 1 | -1/+1
Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | [SCSI] fnic: fnic Driver Tuneables Exposed through CLI | Hiral Patel | 4 | -18/+34
Introduced module params to provide dynamic way of configuring queue depth. Added support to get max io throttle count through UCSM to configure maximum outstanding IOs supported by fnic and push that value to scsi mid-layer.

Supported IO throttle values:

    UCSM IO THROTTLE VALUE    FNIC MAX OUTSTANDING IOS
    ------------------------------------------------------
    16 (Default)              2048
    <= 256                    256
    > 256                     <ucsm value>

Signed-off-by: Hiral Patel <[email protected]> Signed-off-by: James Bottomley <[email protected]>
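Read as a lookup rule, the throttle mapping in this commit message could be sketched as below. `fnic_max_outstanding()` is a hypothetical name used only for illustration, and the logic is just a literal transcription of the commit message's throttle values, not the driver's actual code.

```c
/*
 * Hypothetical helper: maps the UCSM IO throttle value to the maximum
 * number of outstanding IOs, following the values in the commit message.
 */
unsigned int fnic_max_outstanding(unsigned int ucsm_throttle)
{
    if (ucsm_throttle == 16)    /* UCSM default */
        return 2048;
    if (ucsm_throttle <= 256)
        return 256;
    return ucsm_throttle;       /* values above 256 are used as-is */
}
```

Note that the default value 16 is checked first, since it would otherwise fall into the "<= 256" case.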
2013-09-11 | aoe: update copyright date | Ed Cashin | 1 | -1/+1
Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: fill in per-AoE-target information for debugfs file | Ed Cashin | 1 | -1/+32
This information is presented in a compact format that has evolved for easy routine scanning by expert humans, mostly developers and support technicians helping to troubleshoot or test AoE-based systems. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: provide file operations for debugfs files | Ed Cashin | 1 | -1/+24
The place holder in the file contents is filled out in the following patch. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: add AoE-target files to debugfs | Ed Cashin | 3 | -0/+38
Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | aoe: create and destroy debugfs directory for aoe | Ed Cashin | 1 | -1/+9
This series adds the debugging information that the coraid.com-distributed aoe driver exports via sysfs, but instead of sysfs, it uses debugfs. With these patches applied, even without AoE targets on the network, KEDR reports new possible memory leaks, but these are from callers outside the aoe driver that have used aoe_devnode to get the name of the character devices through the aoe_class->devnode callback, and I believe they're responsible for freeing that memory. This patch: Create and destroy the debugfs directory. Signed-off-by: Ed Cashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | mm/zswap: use postorder iteration when destroying rbtree | Cody P Schafer | 1 | -14/+2
Signed-off-by: Cody P Schafer <[email protected]> Reviewed-by: Seth Jennings <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | rbtree: allow tests to run as builtin | Cody P Schafer | 1 | -1/+1
There is no reason to require the rbtree test code to be a module; allow it to be built in (this streamlines my development process). Signed-off-by: Cody P Schafer <[email protected]> Reviewed-by: Seth Jennings <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | rbtree_test: add test for postorder iteration | Cody P Schafer | 1 | -0/+12
Just check that we examine all nodes in the tree for the postorder iteration. Signed-off-by: Cody P Schafer <[email protected]> Reviewed-by: Seth Jennings <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | rbtree: add rbtree_postorder_for_each_entry_safe() helper | Cody P Schafer | 1 | -0/+18
Because deletion (of the entire tree) is a relatively common use of the rbtree_postorder iteration, and because doing it safely means fiddling with temporary storage, provide a helper to simplify postorder rbtree iteration. Signed-off-by: Cody P Schafer <[email protected]> Reviewed-by: Seth Jennings <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | rbtree: add postorder iteration functions | Cody P Schafer | 2 | -0/+44
Postorder iteration yields all of a node's children prior to yielding the node itself, and this particular implementation also avoids examining the leaf links in a node after that node has been yielded.

In what I expect will be its most common usage, postorder iteration allows the deletion of every node in an rbtree without modifying the rbtree nodes (no _requirement_ that they be nulled) while avoiding referencing child nodes after they have been "deleted" (most commonly, freed).

I have only updated zswap to use this functionality at this point, but numerous bits of code (most notably in the filesystem drivers) use a hand rolled postorder iteration that NULLs child links as it traverses the tree. Each of those instances could be replaced with this common implementation.

  1 & 2 add rbtree postorder iteration functions.
  3 adds testing of the iteration to the rbtree runtime tests
  4 allows building the rbtree runtime tests as builtins
  5 updates zswap.

This patch: Add postorder iteration functions for rbtree. These are useful for safely freeing an entire rbtree without modifying the tree at all.

Signed-off-by: Cody P Schafer <[email protected]> Reviewed-by: Seth Jennings <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
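The idea can be sketched in userspace on a plain binary tree with parent pointers. This is a simplified illustration, not the kernel's rbtree code: `struct node`, `new_node()`, `first_postorder()`, `next_postorder()` and `free_tree()` are hypothetical names, and there is no balancing or colour handling.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *parent, *left, *right;
    int key;
};

/* Hypothetical helper for building small trees by hand. */
struct node *new_node(struct node *parent, int is_left, int key)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->key = key;
    n->parent = parent;
    if (parent) {
        if (is_left)
            parent->left = n;
        else
            parent->right = n;
    }
    return n;
}

/* Descend to the leaf that postorder visits first within this subtree. */
static struct node *left_deepest(struct node *n)
{
    for (;;) {
        if (n->left)
            n = n->left;
        else if (n->right)
            n = n->right;
        else
            return n;
    }
}

struct node *first_postorder(struct node *root)
{
    return root ? left_deepest(root) : NULL;
}

/* Uses only n's parent and the parent's links, never n's own children. */
struct node *next_postorder(const struct node *n)
{
    struct node *parent = n->parent;

    /* A left child with a right sibling: visit the sibling's subtree next. */
    if (parent && n == parent->left && parent->right)
        return left_deepest(parent->right);
    /* Otherwise both subtrees of the parent are done; yield the parent. */
    return parent;
}

/* Frees every node without modifying any links; returns the count freed. */
size_t free_tree(struct node *root)
{
    struct node *pos = first_postorder(root), *next;
    size_t freed = 0;

    while (pos) {
        next = next_postorder(pos);  /* fetch the successor before freeing */
        free(pos);
        freed++;
        pos = next;
    }
    return freed;
}
```

Because the successor is computed before the current node is freed, and only through the still-live parent, no freed node is dereferenced and no child link has to be set to NULL along the way.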
2013-09-11 | block/partitions/efi.c: consistently use pr_foo() | Andrew Morton | 1 | -26/+19
Cc: Davidlohr Bueso <[email protected]> Cc: Karel Zak <[email protected]> Cc: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: some style cleanups | Davidlohr Bueso | 1 | -11/+8
Trivial coding style cleanups - still plenty left. [[email protected]: coding-style fixes] Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: delete annoying emacs style comments | Davidlohr Bueso | 1 | -19/+0
I love emacs, but these settings for coding style are annoying when trying to open the efi.h file. More important, we already have checkpatch for that. Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: compare first and last usable LBAs | Davidlohr Bueso | 1 | -1/+6
When verifying GPT header integrity, make sure that first usable LBA is smaller than last usable LBA. Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: account for pmbr size in lba | Davidlohr Bueso | 1 | -3/+18
The partition that has the 0xEE (GPT protective), must have the size in lba field set to the lesser of the size of the disk minus one or 0xFFFFFFFF for larger disks. Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
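The size rule stated above amounts to a clamped subtraction. The sketch below shows only that arithmetic; `expected_pmbr_size()` is a hypothetical name and the function is not taken from block/partitions/efi.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the value the protective MBR's 32-bit
 * "size in LBA" field should hold for a disk with total_sectors sectors:
 * the disk size minus one, capped at 0xFFFFFFFF.
 * Returns 0 for an empty disk (an assumption made to avoid underflow).
 */
uint32_t expected_pmbr_size(uint64_t total_sectors)
{
    uint64_t last;

    if (total_sectors == 0)
        return 0;

    last = total_sectors - 1;
    return last > 0xFFFFFFFFu ? 0xFFFFFFFFu : (uint32_t)last;
}
```

A validity check would then compare the on-disk field, converted from little-endian, against this expected value.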
2013-09-11 | partitions/efi: detect hybrid MBRs | Davidlohr Bueso | 2 | -21/+56
One of the biggest problems with GPT is compatibility with older, non-GPT systems. The problem is addressed by creating hybrid mbrs, an extension, or variant, of the traditional protective mbr. This contains, apart from the 0xEE partition, up to three additional primary partitions that point to the same space marked by up to three GPT partitions. The result is that legacy OSs can see the three required MBR partitions and at the same time ignore the GPT-aware partitions that protect the GPT structures. While hybrid MBRs are hacks, workarounds and simply not part of the GPT standard, they do exist and we have no way around them. For instance, by default, OSX creates a hybrid scheme when using multi-OS booting. In order for Linux to properly discover protective MBRs, it must be made aware of devices that have hybrid MBRs. No functionality is changed by this patch, just a debug message informing the user of the MBR scheme that is being used. [[email protected]: coding-style fixes] Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: do not require gpt partition to begin at sector 1 | Davidlohr Bueso | 1 | -3/+0
When detecting a valid protective MBR, the Linux kernel isn't picky about the partition (1-4) the 0xEE is at, but, unlike other operating systems, it does require it to begin at the second sector (sector 1). This check, apart from it not being enforced by UEFI, and causing Linux to potentially fail to detect any *valid* partitions on the disk, can present problems when dealing with hybrid MBRs[1]. For compatibility reasons, if the first partition is hybridized, the 0xEE partition must be small enough to ensure that it only protects the GPT data structures - as opposed to the whole disk in a protective MBR. This problem is very well described by Rod Smith[1]: where MBR-only partitioning programs (such as older versions of fdisk) can see some of the disk space as unallocated, thus losing the purpose of the 0xEE partition's protection of GPT data structures. By dropping this check, this patch enables Linux to be more flexible when probing for GPT disklabels. [1] http://www.rodsbooks.com/gdisk/hybrid.html#reactions Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: check pmbr record's starting lba | Davidlohr Bueso | 1 | -4/+13
Per the UEFI Specs 2.4, June 2013, the starting lba of the partition that has the EFI GPT (0xEE) must be set to 0x00000001 - this is obviously the LBA of the GPT Partition Header. [[email protected]: coding-style fixes] Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11 | partitions/efi: use lba-aware partition records | Davidlohr Bueso | 2 | -6/+19
The kernel's GPT implementation currently uses the generic 'struct partition' type for dealing with legacy MBR partition records. While this is useful for disklabels that were designed for CHS addressing, such as msdos, it doesn't adapt well to newer standards that use LBA instead, such as GUID partition tables. Furthermore, these generic partition structures do not have all the required fields to properly follow the UEFI specs. While a CHS address can be translated to LBA, it's much simpler and cleaner to just replace the partition type. This patch adds a new 'gpt_record' type that is fully compliant with EFI and will allow, in the next patches, to add more checks to properly verify a protective MBR, which is paramount to probing a device that makes use of GPT. [[email protected]: coding-style fixes] Signed-off-by: Davidlohr Bueso <[email protected]> Reviewed-by: Karel Zak <[email protected]> Acked-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>