aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2007-10-18VFS: allow filesystems to implement atomic open+truncateMiklos Szeredi1-0/+1
Add a new attribute flag ATTR_OPEN, with the meaning: "truncation was initiated by open() due to the O_TRUNC flag". This way filesystems wanting to implement truncation within their ->open() method can ignore such truncate requests. This is a quick & dirty hack, but it comes for free. Signed-off-by: Miklos Szeredi <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Andreas Dilger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18fuse: add file handle to getattr operationMiklos Szeredi1-2/+21
Add necessary protocol changes for supplying a file handle with the getattr operation. Step the API version to 7.9. This patch doesn't actually supply the file handle, because that needs some kind of VFS support, which we haven't yet been able to agree upon. [[email protected]: coding-style fixes] Signed-off-by: Miklos Szeredi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18ext3: support large blocksize up to PAGESIZETakashi Sato1-2/+2
This patch set supports large block size(>4k, <=64k) in ext3 just enlarging the block size limit. But it is NOT possible to have 64kB blocksize on ext3 without some changes to the directory handling code. The reason is that an empty 64kB directory block would have a rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in the filesystem. The proposed solution is treat 64k rec_len with a an impossible value like rec_len = 0xffff to handle this. The Patch-set consists of the following 2 patches. [1/2] ext3: enlarge blocksize - Allow blocksize up to pagesize [2/2] ext3: fix rec_len overflow - prevent rec_len from overflow with 64KB blocksize Now on 64k page ppc64 box runs with this patch set we could create a 64k block size ext3, and able to handle empty directory block. Signed-off-by: Takashi Sato <[email protected]> Signed-off-by: Mingming Cao <[email protected]> Cc: <[email protected]> Acked-by: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18bit_spin_lock: use lock bitopsNick Piggin1-4/+22
Convert bit_spin_lock to new locking bitops. Slub can use the non-atomic store version to clear (Christoph?) Signed-off-by: Nick Piggin <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Redefine {un}register_hotcpu_notifier() !HOTPLUG_CPU stubsSatyam Sharma1-2/+3
The return of the present "do {} while" based stub definition of register_hotcpu_notifier() cannot be checked. This makes the stub asymmetric w.r.t. the real HOTPLUG_CPU=y implementation that is int-returning. So let us redefine this to be consistent with the full version. Also do the same for unregister_hotcpu_notifier(). We cannot define these as static inline functions due to an existing GCC bug (#33172). So define as macros that return appropriately instead (int '0' for the register_hotcpu_notifier case and void for unregister_hotcpu_notifier). Signed-off-by: Satyam Sharma <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18add-scaled-time-to-taskstats-based-process-accounting fixMichael Neuling1-8/+6
This moves the new items to the end of the taskstats struct as requested by Balbir and yourself. Cc: Balbir Singh <[email protected]> Cc: Jay Lan <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Add scaled time to taskstats based process accountingMichael Neuling3-3/+12
This adds items to the taststats struct to account for user and system time based on scaling the CPU frequency and instruction issue rates. Adds account_(user|system)_time_scaled callbacks which architectures can use to account for time using this mechanism. Signed-off-by: Michael Neuling <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Jay Lan <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Char: cyclades, fix some -W warningsJiri Slaby1-6/+6
Most of them are signedness, the rest unused function parameters. Signed-off-by: Jiri Slaby <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Char: cyclades, remove bottom half processingJiri Slaby1-15/+0
The work done in bottom half doesn't cost much cpu time (e.g. tty_hangup itself schedules its own bottom half), it's possible to do the work in isr directly and save hence some .text. Signed-off-by: Jiri Slaby <[email protected]> Cc: Alan Cox <[email protected]> Cc: Paul Fulghum <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18V3 file capabilities: alter behavior of cap_setpcapAndrew Morgan2-4/+7
The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1, can change the capabilities of another process, p2. This is not the meaning that was intended for this capability at all, and this implementation came about purely because, without filesystem capabilities, there was no way to use capabilities without one process bestowing them on another. Since we now have a filesystem support for capabilities we can fix the implementation of CAP_SETPCAP. The most significant thing about this change is that, with it in effect, no process can set the capabilities of another process. The capabilities of a program are set via the capability convolution rules: pI(post-exec) = pI(pre-exec) pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI) pE(post-exec) = fE ? pP(post-exec) : 0 at exec() time. As such, the only influence the pre-exec() program can have on the post-exec() program's capabilities are through the pI capability set. The correct implementation for CAP_SETPCAP (and that enabled by this patch) is that it can be used to add extra pI capabilities to the current process - to be picked up by subsequent exec()s when the above convolution rules are applied. Here is how it works: Let's say we have a process, p. It has capability sets, pE, pP and pI. Generally, p, can change the value of its own pI to pI' where (pI' & ~pI) & ~pP = 0. That is, the only new things in pI' that were not present in pI need to be present in pP. The role of CAP_SETPCAP is basically to permit changes to pI beyond the above: if (pE & CAP_SETPCAP) { pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0 */ } This capability is useful for things like login, which (say, via pam_cap) might want to raise certain inheritable capabilities for use by the children of the logged-in user's shell, but those capabilities are not useful to or needed by the login program itself. One such use might be to limit who can run ping. You set the capabilities of the 'ping' program to be "= cap_net_raw+i", and then only shells that have (pI & CAP_NET_RAW) will be able to run it. Without CAP_SETPCAP implemented as described above, login(pam_cap) would have to also have (pP & CAP_NET_RAW) in order to raise this capability and pass it on through the inheritable set. Signed-off-by: Andrew Morgan <[email protected]> Signed-off-by: Serge E. Hallyn <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: James Morris <[email protected]> Cc: Casey Schaufler <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18sysctl: Error on bad sysctl tablesEric W. Biederman1-0/+1
After going through the kernels sysctl tables several times it has become clear that code review and testing is just not effective in prevent problematic sysctl tables from being used in the stable kernel. I certainly can't seem to fix the problems as fast as they are introduced. Therefore this patch adds sysctl_check_table which is called when a sysctl table is registered and checks to see if we have a problematic sysctl table. The biggest part of the code is the table of valid binary sysctl entries, but since we have frozen our set of binary sysctls this table should not need to change, and it makes it much easier to detect when someone unintentionally adds a new binary sysctl value. As best as I can determine all of the several hundred errors spewed on boot up now are legitimate. [[email protected]: kernel/sysctl_check.c must #include <linux/string.h>] Signed-off-by: Eric W. Biederman <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18sysctl: properly register the irda binary sysctl numbersEric W. Biederman1-0/+20
Grumble. These numbers should have been in sysctl.h from the beginning if we ever expected anyone to use them. Oh well put them there now so we can find them and make maintenance easier. Signed-off-by: Eric W. Biederman <[email protected]> Acked-by: Samuel Ortiz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18sysctl: parport remove binary pathsEric W. Biederman1-1/+0
The sysctl binary paths don't look as if they even code work, .data is not filled in, and all of the proc_handlers look at extra1 and there is not strategy routine. So just kill the binary paths. In addition this patch removes the setting of extra1 on directories. It doesn't look like the parport code ever examines it, and it's bad sysctl form. [[email protected]: remove parport_device_num()] Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18sysctl: Factor out sysctl_data.Eric W. Biederman1-0/+1
There as been no easy way to wrap the default sysctl strategy routine except for returning 0. Which is not always what we want. The few instances I have seen that want different behaviour have written their own version of sysctl_data. While not too hard it is unnecessary code and has the potential for extra bugs. So to make these situations easier and make that part of sysctl more symetric I have factord sysctl_data out of do_sysctl_strategy and exported as a function everyone can use. Further having sysctl_data be an explicit function makes checking for badly formed sysctl tables much easier. Signed-off-by: Eric W. Biederman <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18sysctl core: Stop using the unnecessary ctl_table typedefEric W. Biederman1-18/+18
In sysctl.h the typedef struct ctl_table ctl_table violates coding style isn't needed and is a bit of a nuisance because it makes it harder to recognize ctl_table is a type name. So this patch removes it from the generic sysctl code. Hopefully I will have enough energy to send the rest of my patches will follow and to remove it from the rest of the kernel. Signed-off-by: Eric W. Biederman <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18stop using DMA_xxBIT_MASKAndrew Morton1-2/+8
Now that we have DMA_BIT_MASK(), these macros are pointless. Cc: Jeremy Fitzhardinge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18unify DMA_..BIT_MASK definitions: v3.1Borislav Petkov1-10/+14
Remove redundant DMA_..BIT_MASK definitions across two drivers. The computation of the majority of the bitmasks is done by the compiler. The initial split of the patch touching each a different file got removed due to possible git bisect breakage. Signed-off-by: Borislav Petkov <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Cc: Muli Ben-Yehuda <[email protected]> Cc: Jeff Garzik <[email protected]> Cc: James Bottomley <[email protected]> Reviewed-by: Satyam Sharma <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday().Tony Breeds1-0/+5
On platforms that copy sys_tz into the vdso (currently only x86_64, soon to include powerpc), it is possible for the vdso to get out of sync if a user calls (admittedly unusual) settimeofday(NULL, ptr). This patch adds a hook for architectures that set CONFIG_GENERIC_TIME_VSYSCALL to ensure when sys_tz is updated they can also updatee their copy in the vdso. Signed-off-by: Tony Breeds <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Tony Luck <[email protected]> Acked-by: John Stultz <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Remove struct task_struct::io_waitAlexey Dobriyan2-19/+0
Hell knows what happened in commit 63b05203af57e7de4f3bb63b8b81d43bc196d32b during 2.6.9 development. Commit introduced io_wait field which remained write-only than and still remains write-only. Also garbage collect macros which "use" io_wait. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18Hibernation: Check if ACPI is enabled during restore in the right placeRafael J. Wysocki1-0/+7
The following scenario leads to total confusion of the platform firmware on some boxes (eg. HPC nx6325): * Hibernate with ACPI enabled * Resume passing "acpi=off" to the boot kernel To prevent this from happening it's necessary to check if ACPI is enabled (and enable it if that's not the case) _right_ _after_ control has been transfered from the boot kernel to the image kernel, before device_power_up() is called (ie. with interrupts disabled).  Enabling ACPI after calling device_power_up() turns out to be insufficient. For this reason, introduce new hibernation callback ->leave() that will be executed before device_power_up() by the restored image kernel.  To make it work, it also is necessary to move swsusp_suspend() from swsusp.c to disk.c (it's name is changed to "create_image", which is more up to the point). Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18serial: turn serial console suspend a boot rather than compile time optionAndres Salomon1-5/+2
Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop the serial console from being suspended when the rest of the machine goes to sleep. This is incredibly useful for debugging power management-related things; however, having it as a compile-time option has proved to be incredibly inconvenient for us (OLPC). There are plenty of times that we want serial console to not suspend, but for the most part we'd like serial console to be suspended. This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel boot parameter (no_console_suspend). By default, the serial console will be suspended along with the rest of the system; by passing 'no_console_suspend' to the kernel during boot, serial console will remain alive during suspend. For now, this is pretty serial console specific; further fixes could be applied to make this work for things like netconsole. Signed-off-by: Andres Salomon <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: Nigel Cunningham <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18freezer: introduce freezer-friendly waiting macrosRafael J. Wysocki1-0/+38
Introduce freezer-friendly wrappers around wait_event_interruptible() and wait_event_interruptible_timeout(), originally defined in <linux/wait.h>, to be used in freezable kernel threads. Make some of the freezable kernel threads use them. This is necessary for the freezer to stop sending signals to kernel threads, which is implemented in the next patch. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: Nigel Cunningham <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18PM: Rename hibernation_ops to platform_hibernation_opsRafael J. Wysocki1-4/+4
Rename 'struct hibernation_ops' to 'struct platform_hibernation_ops' in analogy with 'struct platform_suspend_ops'. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: Len Brown <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18PM: Rework struct hibernation_opsRafael J. Wysocki1-6/+32
During hibernation we also need to tell the ACPI core that we're going to put the system into the S4 sleep state. For this reason, an additional method in 'struct hibernation_ops' is needed, playing the role of set_target() in 'struct platform_suspend_operations'. Moreover, the role of the .prepare() method is now different, so it's better to introduce another method, that in general may be different from .prepare(), that will be used to prepare the platform for creating the hibernation image (.prepare() is used anyway to notify the platform that we're going to enter the low power state after the image has been saved). Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18PM: Make suspend_ops staticRafael J. Wysocki1-2/+0
The variable suspend_ops representing the set of global platform-specific suspend-related operations, used by the PM core, need not be exported outside of kernel/power/main.c .  Make it static. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18PM: Rework struct platform_suspend_opsRafael J. Wysocki1-8/+5
There is no reason why the .prepare() and .finish() methods in 'struct platform_suspend_ops' should take any arguments, since architectures don't use these methods' argument in any practically meaningful way (ie. either the target system sleep state is conveyed to the platform by .set_target(), or there is only one suspend state supported and it is indicated to the PM core by .valid(), or .prepare() and .finish() aren't defined at all).  There also is no reason why .finish() should return any result. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: Len Brown <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18PM: Rename struct pm_ops and related thingsRafael J. Wysocki1-11/+11
The name of 'struct pm_ops' suggests that it is related to the power management in general, but in fact it is only related to suspend.  Moreover, its name should indicate what this structure is used for, so it seems reasonable to change it to 'struct platform_suspend_ops'.  In that case, the name of the global variable of this type used by the PM core and the names of related functions should be changed accordingly. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: Len Brown <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18PM: Move definition of struct pm_ops to suspend.hRafael J. Wysocki2-109/+110
Move the definition of 'struct pm_ops' and related functions from <linux/pm.h> to <linux/suspend.h> . There are, at least, the following reasons to do that: * 'struct pm_ops' is specifically related to suspend and not to the power management in general. * As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it causes the entire kernel to be recompiled, which is unnecessary and annoying. * Some suspend-related features are already defined in <linux/suspend.h>, so it is logical to move the definition of 'struct pm_ops' into there. * 'struct hibernation_ops', being the hibernation-related counterpart of 'struct pm_ops', is defined in <linux/suspend.h> . Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Pavel Machek <[email protected]> Cc: Len Brown <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easierAnton Blanchard1-1/+1
Pull the copy_to_user out of hrtimer_nanosleep and into the callers (common_nsleep, sys_nanosleep) in preparation for converting compat_sys_nanosleep to use hrtimers. Signed-off-by: Anton Blanchard <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2007-10-18[libata] kill ata_sg_is_last()Jeff Garzik1-12/+0
Short term, this works around a bug introduced by early sg-chaining work. Long term, removing this function eliminates a branch from a hot path loop in each scatter/gather table build. Also, as this code demonstrates, we don't need to _track_ the end of the s/g list, as long as we mark it in some way. And doing so programatically is nice. So its a useful cleanup, regardless of its short term effects. Based conceptually on a quick patch by Jens Axboe. Signed-off-by: Jeff Garzik <[email protected]>
2007-10-18sched: reduce schedstat variable overhead a bitKen Chen1-21/+21
schedstat is useful in investigating CPU scheduler behavior. Ideally, I think it is beneficial to have it on all the time. However, the cost of turning it on in production system is quite high, largely due to number of events it collects and also due to its large memory footprint. Most of the fields probably don't need to be full 64-bit on 64-bit arch. Rolling over 4 billion events will most like take a long time and user space tool can be made to accommodate that. I'm proposing kernel to cut back most of variable width on 64-bit system. (note, the following patch doesn't affect 32-bit system). Signed-off-by: Ken Chen <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2007-10-18[NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is requiredLi Zefan1-8/+5
Macros like SCTP_CHUNKMAP_XXX(chukmap) require chukmap to be an array, but match_packet() passes a pointer to these macros. Also remove the ELEMCOUNT macro and fix a bug in SCTP_CHUNKMAP_COPY. Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-10-18[NET]: Fix OOPS due to missing check in dev_parse_header().Patrick McHardy1-1/+1
[ This is kernel bugzilla 9174 "linux-2.6.23-git11 kernel panic" ] The device in question is an IPv6-over-IPv4 tunnel, which doesn't have any header_ops, so the crash happens in dev_parse_header when dereferencing them. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-10-17[NET]: Introduce the sk_detach_filter() callPavel Emelyanov1-0/+1
Filter is attached in a separate function, so do the same for filter detaching. This also removes one variable sock_setsockopt(). Signed-off-by: Pavel Emelyanov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-10-17[SPARC/64]: Consolidate of_register_driverStephen Rothwell1-0/+4
Also of_unregister_driver. These will be shortly also used by the PowerPC code. Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-10-17napi_synchronize: waiting for NAPIStephen Hemminger1-0/+18
Some drivers with shared NAPI need a synchronization barrier. Also suggested by Benjamin Herrenschmidt for EMAC. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: Jeff Garzik <[email protected]>
2007-10-17ext4: Convert ext4_extent_idx.ei_leaf to ext4_extent_idx.ei_leaf_loAneesh Kumar K.V1-1/+1
Convert ext4_extent_idx.ei_leaf ext4_extent_idx.ei_leaf_lo This helps in finding BUGs due to direct partial access of these split 48 bit values. Signed-off-by: Aneesh Kumar K.V <[email protected]>
2007-10-17ext4: Convert ext4_extent.ee_start to ext4_extent.ee_start_loAneesh Kumar K.V1-1/+1
Convert ext4_extent.ee_start to ext4_extent.ee_start_lo This helps in finding BUGs due to direct partial access of these split 48 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <[email protected]>
2007-10-17ext4: Convert s_r_blocks_count and s_free_blocks_countAneesh Kumar K.V1-6/+6
Convert s_r_blocks_count and s_free_blocks_count to s_r_blocks_count_lo and s_free_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2007-10-17ext4: Convert s_blocks_count to s_blocks_count_loAneesh Kumar K.V1-3/+3
Convert s_blocks_count to s_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <[email protected]>
2007-10-17ext4: Convert bg_inode_bitmap and bg_inode_tableAneesh Kumar K.V1-2/+2
Convert bg_inode_bitmap and bg_inode_table to bg_inode_bitmap_lo and bg_inode_table_lo. This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix one direct partial access Signed-off-by: Aneesh Kumar K.V <[email protected]>
2007-10-17ext4: Convert bg_block_bitmap to bg_block_bitmap_loAneesh Kumar K.V1-1/+1
Convert bg_block_bitmap to bg_block_bitmap_lo This helps in catching some BUGS due to direct partial access of these split fields. Signed-off-by: Aneesh Kumar K.V <[email protected]>
2007-10-17ext4: FLEX_BG Kernel support v2.Jose R. Santos1-1/+3
This feature relaxes check restrictions on where each block groups meta data is located within the storage media. This allows for the allocation of bitmaps or inode tables outside the block group boundaries in cases where bad blocks forces us to look for new blocks which the owning block group can not satisfy. This will also allow for new meta-data allocation schemes to improve performance and scalability. Signed-off-by: Jose R. Santos <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2007-10-17ext4: Fix sparse warningsAneesh Kumar K.V1-7/+7
Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2007-10-17Ext4: Uninitialized Block GroupsAndreas Dilger1-4/+12
In pass1 of e2fsck, every inode table in the fileystem is scanned and checked, regardless of whether it is in use. This is this the most time consuming part of the filesystem check. The unintialized block group feature can greatly reduce e2fsck time by eliminating checking of uninitialized inodes. With this feature, there is a a high water mark of used inodes for each block group. Block and inode bitmaps can be uninitialized on disk via a flag in the group descriptor to avoid reading or scanning them at e2fsck time. A checksum of each group descriptor is used to ensure that corruption in the group descriptor's bit flags does not cause incorrect operation. The feature is enabled through a mkfs option mke2fs /dev/ -O uninit_groups A patch adding support for uninitialized block groups to e2fsprogs tools has been posted to the linux-ext4 mailing list. The patches have been stress tested with fsstress and fsx. In performance tests testing e2fsck time, we have seen that e2fsck time on ext3 grows linearly with the total number of inodes in the filesytem. In ext4 with the uninitialized block groups feature, the e2fsck time is constant, based solely on the number of used inodes rather than the total inode count. Since typical ext4 filesystems only use 1-10% of their inodes, this feature can greatly reduce e2fsck time for users. With performance improvement of 2-20 times, depending on how full the filesystem is. The attached graph shows the major improvements in e2fsck times in filesystems with a large total inode count, but few inodes in use. In each group descriptor if we have EXT4_BG_INODE_UNINIT set in bg_flags: Inode table is not initialized/used in this group. So we can skip the consistency check during fsck. EXT4_BG_BLOCK_UNINIT set in bg_flags: No block in the group is used. So we can skip the block bitmap verification for this group. We also add two new fields to group descriptor as a part of uninitialized group patch. __le16 bg_itable_unused; /* Unused inodes count */ __le16 bg_checksum; /* crc16(sb_uuid+group+desc) */ bg_itable_unused: If we have EXT4_BG_INODE_UNINIT not set in bg_flags then bg_itable_unused will give the offset within the inode table till the inodes are used. This can be used by fsck to skip list of inodes that are marked unused. bg_checksum: Now that we depend on bg_flags and bg_itable_unused to determine the block and inode usage, we need to make sure group descriptor is not corrupt. We add checksum to group descriptor to detect corruption. If the descriptor is found to be corrupt, we mark all the blocks and inodes in the group used. Signed-off-by: Avantika Mathur <[email protected]> Signed-off-by: Andreas Dilger <[email protected]> Signed-off-by: Mingming Cao <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]>
2007-10-17ext4: remove #ifdef CONFIG_EXT4_INDEXEric Sandeen1-12/+2
CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext4_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: Mingming Cao <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2007-10-17ext4: Remove (partial, never completed) fragment supportColy Li3-37/+6
Fragment support in ext2/3/4 was never implemented, and it probably will never be implemented. So remove it from ext4. Signed-off-by: Coly Li <[email protected]> Acked-by: Andreas Dilger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2007-10-17jbd2: JBD_XXX to JBD2_XXX naming cleanupMingming Cao3-19/+20
change JBD_XXX macros to JBD2_XXX in JBD2/Ext4 Signed-off-by: Mingming Cao <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2007-10-17JBD2: replace jbd_kmalloc with kmalloc directly.Mingming Cao1-7/+0
This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: Mingming Cao <[email protected]>
2007-10-17JBD: replace jbd_kmalloc with kmalloc directlyMingming Cao1-6/+0
This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: Mingming Cao <[email protected]>