aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2012-10-04Btrfs: kill obsolete arguments in btrfs_wait_ordered_extentsLiu Bo6-18/+7
nocow_only is now an obsolete argument. Signed-off-by: Liu Bo <[email protected]>
2012-10-04Btrfs: cleanup fs_info->hashersLiu Bo2-2/+0
fs_info->hashers is now an obsolete one. Signed-off-by: Liu Bo <[email protected]>
2012-10-04Btrfs: cleanup for duplicated code in find_free_extentLiu Bo1-4/+0
There is already an 'add free space' phrase in front of this one, we needn't to redo it. Signed-off-by: Liu Bo <[email protected]>
2012-10-04Btrfs: fix race in sync and freeze againJosef Bacik3-10/+18
I screwed this up, there is a race between checking if there is a running transaction and actually starting a transaction in sync where we could race with a freezer and get ourselves into trouble. To fix this we need to make a new join type to only do the try lock on the freeze stuff. If it fails we'll return EPERM and just return from sync. This fixes a hang Liu Bo reported when running xfstest 68 in a loop. Thanks, Reported-by: Liu Bo <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2012-10-04btrfs: return EPERM upon rmdir on a subvolumeDavid Sterba1-2/+3
A subvolume cannot be deleted via rmdir, but the error code ENOTEMPTY is confusing. Return EPERM instead, as this is not permitted. Signed-off-by: David Sterba <[email protected]>
2012-10-04Btrfs: using for_each_set_bit_from to simplify the codeWei Yongjun1-6/+2
Using for_each_set_bit_from() to simplify the code. spatch with a semantic match is used to found this. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <[email protected]>
2012-10-04Btrfs: write_buf is now callable outside send.cAnand Jain2-5/+7
Developing service cmds needs it. Signed-off-by: Anand Jain <[email protected]>
2012-10-04Btrfs: remove unnecessary code in btree_get_extent()Tsutomu Itoh1-7/+1
Unnecessary lookup_extent_mapping() is removed because an error is returned to the caller. This patch was made based on the advice from Stefan Behrens, thanks. Signed-off-by: Tsutomu Itoh <[email protected]>
2012-10-04Btrfs: cleanup of error processing in btree_get_extent()Tsutomu Itoh1-9/+5
This patch simplifies a little complex error processing in btree_get_extent(). Signed-off-by: Tsutomu Itoh <[email protected]>
2012-10-03ore: signedness bug in _sp2d_min_pg()Dan Carpenter1-1/+1
This for loop doesn't work correctly when "p" is unsigned. Signed-off-by: Dan Carpenter <[email protected]>
2012-10-03ceph: avoid 32-bit page index overflowAlex Elder1-6/+5
A pgoff_t is defined (by default) to have type (unsigned long). On architectures such as i686 that's a 32-bit type. The ceph address space code was attempting to produce 64 bit offsets by shifting a page's index by PAGE_CACHE_SHIFT, but the result was not what was desired because the shift occurred before the result got promoted to 64 bits. Fix this by converting all uses of page->index used in this way to use the page_offset() macro, which ensures the 64-bit result has the intended value. This fixes http://tracker.newdream.net/issues/3112 Reported-by: Mohamed Pakkeer <[email protected]> Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2012-10-03ceph: return EIO on invalid layout on GET_DATALOC ioctlSage Weil1-2/+6
If the user calls GET_DATALOC on a file with an invalid (e.g., zeroed) layout, return EIO to userland. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-10-03Merge tag 'jfs-3.7' of git://github.com/kleikamp/linux-shaggyLinus Torvalds10-27/+373
Pull JFS update from Dave Kleikamp: "JFS TRIM support and some minor fixes" * tag 'jfs-3.7' of git://github.com/kleikamp/linux-shaggy: jfs: Fix do_div precision in commit b40c2e66 JFS: use list_move instead of list_del/list_add jfs: Remove obsolete email address fs/jfs: TRIM support for JFS Filesystem
2012-10-02Merge branch 'next' of ↵Linus Torvalds3-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - Integrity: add local fs integrity verification to detect offline attacks - Integrity: add digital signature verification - Simple stacking of Yama with other LSMs (per LSS discussions) - IBM vTPM support on ppc64 - Add new driver for Infineon I2C TIS TPM - Smack: add rule revocation for subject labels" Fixed conflicts with the user namespace support in kernel/auditsc.c and security/integrity/ima/ima_policy.c. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits) Documentation: Update git repository URL for Smack userland tools ima: change flags container data type Smack: setprocattr memory leak fix Smack: implement revoking all rules for a subject label Smack: remove task_wait() hook. ima: audit log hashes ima: generic IMA action flag handling ima: rename ima_must_appraise_or_measure audit: export audit_log_task_info tpm: fix tpm_acpi sparse warning on different address spaces samples/seccomp: fix 31 bit build on s390 ima: digital signature verification support ima: add support for different security.ima data types ima: add ima_inode_setxattr/removexattr function and calls ima: add inode_post_setattr call ima: replace iint spinblock with rwlock/read_lock ima: allocating iint improvements ima: add appraise action keywords and default rules ima: integrity appraisal extension vfs: move ima_file_free before releasing the file ...
2012-10-02Merge tag 'upstream-3.7-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds20-578/+451
Pull ubifs changes from Artem Bityutskiy: "No big changes for 3.7 in UBIFS: - Error reporting and debug printing improvements - Power cut emulation fixes - Minor cleanups" Fix trivial conflict in fs/ubifs/debug.c due to the user namespace changes. * tag 'upstream-3.7-rc1' of git://git.infradead.org/linux-ubifs: UBIFS: print less UBIFS: use pr_ helper instead of printk UBIFS: comply with coding style UBIFS: use __aligned() attribute UBIFS: remove __DATE__ and __TIME__ UBIFS: fix power cut emulation for mtdram UBIFS: improve scanning debug output UBIFS: always print full error reports UBIFS: print PID in debug messages
2012-10-02Merge tag 'for-linus-v3.7-rc1' of git://oss.sgi.com/xfs/xfsLinus Torvalds7-80/+447
Pull xfs update from Ben Myers: "Several enhancements and cleanups: - make inode32 and inode64 remountable options - SEEK_HOLE/SEEK_DATA enhancements - cleanup struct declarations in xfs_mount.h" * tag 'for-linus-v3.7-rc1' of git://oss.sgi.com/xfs/xfs: xfs: Make inode32 a remountable option xfs: add inode64->inode32 transition into xfs_set_inode32() xfs: Fix mp->m_maxagi update during inode64 remount xfs: reduce code duplication handling inode32/64 options xfs: make inode64 as the default allocation mode xfs: Fix m_agirotor reset during AG selection Make inode64 a remountable option xfs: stop the sync worker before xfs_unmountfs xfs: xfs_seek_hole() refinement with hole searching from page cache for unwritten extents xfs: xfs_seek_data() refinement with unwritten extents check up from page cache xfs: Introduce a helper routine to probe data or hole offset from page cache xfs: Remove type argument from xfs_seek_data()/xfs_seek_hole() xfs: fix race while discarding buffers [V4] xfs: check for possible overflow in xfs_ioc_trim xfs: unlock the AGI buffer when looping in xfs_dialloc xfs: kill struct declarations in xfs_mount.h xfs: fix uninitialised variable in xfs_rtbuf_get()
2012-10-02Merge branch 'for-linus' of ↵Linus Torvalds94-2092/+2486
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - big one - consolidation of descriptor-related logics; almost all of that is moved to fs/file.c (BTW, I'm seriously tempted to rename the result to fd.c. As it is, we have a situation when file_table.c is about handling of struct file and file.c is about handling of descriptor tables; the reasons are historical - file_table.c used to be about a static array of struct file we used to have way back). A lot of stray ends got cleaned up and converted to saner primitives, disgusting mess in android/binder.c is still disgusting, but at least doesn't poke so much in descriptor table guts anymore. A bunch of relatively minor races got fixed in process, plus an ext4 struct file leak. - related thing - fget_light() partially unuglified; see fdget() in there (and yes, it generates the code as good as we used to have). - also related - bits of Cyrill's procfs stuff that got entangled into that work; _not_ all of it, just the initial move to fs/proc/fd.c and switch of fdinfo to seq_file. - Alex's fs/coredump.c spiltoff - the same story, had been easier to take that commit than mess with conflicts. The rest is a separate pile, this was just a mechanical code movement. - a few misc patches all over the place. Not all for this cycle, there'll be more (and quite a few currently sit in akpm's tree)." Fix up trivial conflicts in the android binder driver, and some fairly simple conflicts due to two different changes to the sock_alloc_file() interface ("take descriptor handling from sock_alloc_file() to callers" vs "net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries" adding a dentry name to the socket) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits) MAX_LFS_FILESIZE should be a loff_t compat: fs: Generic compat_sys_sendfile implementation fs: push rcu_barrier() from deactivate_locked_super() to filesystems btrfs: reada_extent doesn't need kref for refcount coredump: move core dump functionality into its own file coredump: prevent double-free on an error path in core dumper usb/gadget: fix misannotations fcntl: fix misannotations ceph: don't abuse d_delete() on failure exits hypfs: ->d_parent is never NULL or negative vfs: delete surplus inode NULL check switch simple cases of fget_light to fdget new helpers: fdget()/fdput() switch o2hb_region_dev_write() to fget_light() proc_map_files_readdir(): don't bother with grabbing files make get_file() return its argument vhost_set_vring(): turn pollstart/pollstop into bool switch prctl_set_mm_exe_file() to fget_light() switch xfs_find_handle() to fget_light() switch xfs_swapext() to fget_light() ...
2012-10-02compat: fs: Generic compat_sys_sendfile implementationCatalin Marinas3-2/+26
This function is used by sparc, powerpc and arm64 for compat support. The patch adds a generic implementation which calls do_sendfile() directly and avoids set_fs(). The sparc architecture has wrappers for the sign extensions while powerpc relies on the compiler to do the this. The patch adds wrappers for powerpc to handle the u32->int type conversion. compat_sys_sendfile64() can be replaced by a sys_sendfile() call since compat_loff_t has the same size as off_t on a 64-bit system. On powerpc, the patch also changes the 64-bit sendfile call from sys_sendile64 to sys_sendfile. Signed-off-by: Catalin Marinas <[email protected]> Acked-by: David S. Miller <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Al Viro <[email protected]>
2012-10-02fs: push rcu_barrier() from deactivate_locked_super() to filesystemsKirill A. Shutemov47-6/+240
There's no reason to call rcu_barrier() on every deactivate_locked_super(). We only need to make sure that all delayed rcu free inodes are flushed before we destroy related cache. Removing rcu_barrier() from deactivate_locked_super() affects some fast paths. E.g. on my machine exit_group() of a last process in IPC namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time. Signed-off-by: Kirill A. Shutemov <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Al Viro <[email protected]>
2012-10-02btrfs: reada_extent doesn't need kref for refcountAl Viro1-11/+7
All increments and decrements are under the same spinlock - have to be, since they need to protect the radix_tree it's found in. Just use int, no need to wank with kref... Signed-off-by: Al Viro <[email protected]>
2012-10-02coredump: move core dump functionality into its own fileAlex Kelly3-645/+688
This prepares for making core dump functionality optional. The variable "suid_dumpable" and associated functions are left in fs/exec.c because they're used elsewhere, such as in ptrace. Signed-off-by: Alex Kelly <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Al Viro <[email protected]>
2012-10-02Merge branch 'for-linus' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "A few drivers were updated with device tree bindings and others got a few small cleanups and fixes." Fix trivial conflict in drivers/input/keyboard/omap-keypad.c due to changes clashing with a whitespace cleanup. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (28 commits) Input: wacom - mark Intuos5 pad as in-prox when touching buttons Input: synaptics - adjust threshold for treating position values as negative Input: hgpk - use %*ph to dump small buffer Input: gpio_keys_polled - fix dt pdata->nbuttons Input: Add KD[GS]KBDIACRUC ioctls to the compatible list Input: omap-keypad - fixed formatting Input: tegra - move platform data header Input: wacom - add support for EMR on Cintiq 24HD touch Input: s3c2410_ts - make s3c_ts_pmops const Input: samsung-keypad - use of_get_child_count() helper Input: samsung-keypad - use of_match_ptr() Input: uinput - fix formatting Input: uinput - specify exact bit sizes on userspace APIs Input: uinput - mark failed submission requests as free Input: uinput - fix race that can block nonblocking read Input: uinput - return -EINVAL when read buffer size is too small Input: uinput - take event lock when fetching events from buffer Input: get rid of MATCH_BIT() macro Input: rotary-encoder - add DT bindings Input: rotary-encoder - constify platform data pointers ...
2012-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-4/+4
Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...
2012-10-02Merge branch 'for-linus' of ↵Linus Torvalds108-528/+1011
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...
2012-10-02Merge branch 'for-3.7' of ↵Linus Torvalds1-0/+180
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - xattr support added. The implementation is shared with tmpfs. The usage is restricted and intended to be used to manage per-cgroup metadata by system software. tmpfs changes are routed through this branch with Hugh's permission. - cgroup subsystem ID handling simplified. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Define CGROUP_SUBSYS_COUNT according the configuration cgroup: Assign subsystem IDs during compile time cgroup: Do not depend on a given order when populating the subsys array cgroup: Wrap subsystem selection macro cgroup: Remove CGROUP_BUILTIN_SUBSYS_COUNT cgroup: net_prio: Do not define task_netpioidx() when not selected cgroup: net_cls: Do not define task_cls_classid() when not selected cgroup: net_cls: Move sock_update_classid() declaration to cls_cgroup.h cgroup: trivial fixes for Documentation/cgroups/cgroups.txt xattr: mark variable as uninitialized to make both gcc and smatch happy fs: add missing documentation to simple_xattr functions cgroup: add documentation on extended attributes usage cgroup: rename subsys_bits to subsys_mask cgroup: add xattr support cgroup: revise how we re-populate root directory xattr: extract simple_xattr code from tmpfs
2012-10-02Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds12-34/+17
Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...
2012-10-01Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds32-1547/+4884
Pull CIFS updates from Steve French: "This patchset is the final section of the SMB2.1 support merge for cifs.ko. It also includes improvements to the cifs socket handling from Jeff, and also fixes a few cifs bug fixes. It adds SMB2 support for file and inode operations as well as moves some existing cifs code to use ops server struct of protocol specific callbacks. Most of this code is SMB2 specific. When enabled SMB2.1 does pass various functional tests including most of the connectathon test suite, For SMB2.1, Connectathon test 4 and some related tests fail due to not updating mode bits remotely (cifsacl support where mode bits are approximated with the cifs acl is not enable for smb2), and test8 (symlink) support is not completed for SMB2 yet (note that we will likely have a "Unix Extensions" eventually, at least for Samba, so in the long run posix locks won't have to be emulated when mounting Linux to Linux, but for most NAS and for Windows mounts posix lock emulation will still used for SMB2 in a similar fashion as we do for cifs). SMB2.1 dialect is supported. Although additional fixes to enable smb2 (the original smb2.02) dialect and to add various optional features of the smb3 dialect are expected to be added in the future as testing progresses, currently mounting with the "vers=2.1" is supported (in order to mount using SMB2.1 to servers like Samba 4, and Windows 7, Windows 2008R2)." * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: (82 commits) [CIFS] Fix indentation of fs/cifs/Kconfig entries [CIFS] Fix SMB2 negotiation support to select only one dialect (based on vers=) cifs: obtain file access during backup intent lookup (resend) CIFS: Fix possible freed pointer dereference in CIFS_SessSetup CIFS: Fix possible freed pointer dereference in SMB2_sess_setup CIFS: Make ops->close return void cifs: change DOS/NT/POSIX mapping of ERRnoresource cifs: remove support for deprecated "forcedirectio" and "strictcache" mount options cifs: remove support for CIFS_IOC_CHECKUMOUNT ioctl CIFS: Fix possible memory leaks in SMB2 code CIFS: Fix endian conversion of IndexNumber Trivial endian fixes MARK SMB2 support EXPERIMENTAL Update cifs version number cifs: add FL_CLOSE to fl_flags mask in cifs_read_flock cifs: Mangle string used for unc in /proc/mounts cifs: cleanups for cifs_mkdir_qinfo CIFS: Fix fast lease break after open problem CIFS: Add SMB2.1 lease break support CIFS: Fix cache coherency for read oplock case ...
2012-10-01ceph: propagate layout error on osd request creationSage Weil2-6/+6
If we are creating an osd request and get an invalid layout, return an EINVAL to the caller. We switch up the return to have an error code instead of NULL implying -ENOMEM. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-10-01Merge branch 'next' into for-linusDmitry Torokhov1-0/+2
Prepare first set of updates for 3.7 merge window.
2012-10-01Merge tag 'dlm-3.7' of ↵Linus Torvalds13-130/+289
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "There are two main patches in this set, both related to the userland dlm_controld daemon. The first fixes a deadlock between dlm_controld and the dlm_send workqueue when both access configfs data simultaneously. The second reworks some code to get around a long standing, but intentional, unlock balance warning. The userland daemon no longer takes a lock that is later released from the kernel. The other commits are minor fixes and changes." * tag 'dlm-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: check the maximum size of a request from user dlm: cleanup send_to_sock routine dlm: convert add_sock routine return value type to void dlm: remove redundant variable assignments dlm: fix unlock balance warnings dlm: fix uninitialized spinlock dlm: fix deadlock between dlm_send and dlm_controld
2012-10-01ceph: convert to use le32_add_cpu()Wei Yongjun1-1/+1
Convert cpu_to_le32(le32_to_cpu(E1) + E2) to use le32_add_cpu(). dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2012-10-01ceph: Fix oops when handling mdsmap that decreases max_mdsYan, Zheng1-1/+2
When i >= newmap->m_max_mds, ceph_mdsmap_get_addr(newmap, i) return NULL. Passing NULL to memcmp() triggers oops. Signed-off-by: Yan, Zheng <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2012-10-01ceph: let path portion of mount "device" be optionalAlex Elder1-11/+26
A recent change to /sbin/mountall causes any trailing '/' character in the "device" (or fs_spec) field in /etc/fstab to be stripped. As a result, an entry for a ceph mount that intends to mount the root of the name space ends up with now path portion, and the ceph mount option processing code rejects this. That is, an entry in /etc/fstab like: cephserver:port:/ /mnt ceph defaults 0 0 provides to the ceph code just "cephserver:port:" as the "device," and that gets rejected. Although this is a bug in /sbin/mountall, we can have the ceph mount code support an empty/nonexistent path, interpreting it to mean the root of the name space. RFC 5952 offers recommendations for how to express IPv6 addresses, and recommends the usage found in RFC 3986 (which specifies the format for URI's) for representing both IPv4 and IPv6 addresses that include port numbers. (See in particular the definition of "authority" found in the Appendix of RFC 3986.) According to those standards, no host specification will ever contain a '/' character. As a result, it is sufficient to scan a provided "device" from an /etc/fstab entry for the first '/' character, and if it's found, treat that as the beginning of the path. If no '/' character is present, we can treat the entire string as the monitor host specification(s), and assume the path to be the root of the name space. We'll still require a ':' to separate the host portion from the (possibly empty) path portion. This means that we can more formally define how ceph will interpret the "device" it's provided when processing a mount request: "device" will look like: <server_spec>[,<server_spec>...]:[<path>] where <server_spec> is <ip>[:<port>] <path> is optional, but if present must begin with '/' This addresses http://tracker.newdream.net/issues/2919 Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Dan Mick <[email protected]>
2012-10-01Merge tag 'tty-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds1-0/+6
Pull TTY changes from Greg Kroah-Hartman: "As we skipped the merge window for 3.6-rc1 for the tty tree, everything is now settled down and working properly, so we are ready for 3.7-rc1. Here's the patchset, it's big, but the large changes are removing a firmware file and adding a staging tty driver (it depended on the tty core changes, so it's going through this tree instead of the staging tree.) All of these patches have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <[email protected]>" Fix up more-or-less trivial conflicts in - drivers/char/pcmcia/synclink_cs.c: tty NULL dereference fix vs tty_port_cts_enabled() helper function - drivers/staging/{Kconfig,Makefile}: add-add conflict (dgrp driver added close to other staging drivers) - drivers/staging/ipack/devices/ipoctal.c: "split ipoctal_channel from iopctal" vs "TTY: use tty_port_register_device" * tag 'tty-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (235 commits) tty/serial: Add kgdb_nmi driver tty/serial/amba-pl011: Quiesce interrupts in poll_get_char tty/serial/amba-pl011: Implement poll_init callback tty/serial/core: Introduce poll_init callback kdb: Turn KGDB_KDB=n stubs into static inlines kdb: Implement disable_nmi command kernel/debug: Mask KGDB NMI upon entry serial: pl011: handle corruption at high clock speeds serial: sccnxp: Make 'default' choice in switch last serial: sccnxp: Remove mask termios caps for SW flow control serial: sccnxp: Report actual baudrate back to core serial: samsung: Add poll_get_char & poll_put_char Powerpc 8xx CPM_UART setting MAXIDL register proportionaly to baud rate Powerpc 8xx CPM_UART maxidl should not depend on fifo size Powerpc 8xx CPM_UART too many interrupts Powerpc 8xx CPM_UART desynchronisation serial: set correct baud_base for EXSYS EX-41092 Dual 16950 serial: omap: fix the reciever line error case 8250: blacklist Winbond CIR port 8250_pnp: do pnp probe before legacy probe ...
2012-10-01Revert "Btrfs: do not do filemap_write_and_wait_range in fsync"Miao Xie1-3/+11
This reverts commit 0885ef5b5601e9b007c383e77c172769b1f214fd After applying the above patch, the performance slowed down because the dirty page flush can only be done by one task, so revert it. The following is the test result of sysbench: Before After 24MB/s 39MB/s Signed-off-by: Miao Xie <[email protected]>
2012-10-01Btrfs: remove bytes argument from do_chunk_allocJosef Bacik1-15/+10
Everybody is just making stuff up, and it's just used to see if we really do need to alloc a chunk, and since we do this when we already know we really do it's just a waste of space. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-01Btrfs: delay block group item insertionJosef Bacik4-67/+79
So we have lots of places where we try to preallocate chunks in order to make sure we have enough space as we make our allocations. This has historically meant that we're constantly tweaking when we should allocate a new chunk, and historically we have gotten this horribly wrong so we way over allocate either metadata or data. To try and keep this from happening we are going to make it so that the block group item insertion is done out of band at the end of a transaction. This will allow us to create chunks even if we are trying to make an allocation for the extent tree. With this patch my enospc tests run faster (didn't expect this) and more efficiently use the disk space (this is what I wanted). Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-01btrfs: Kill some bi_idx referencesKent Overstreet2-17/+2
For immutable bio vecs, I've been auditing and removing bi_idx references. These were harmless, but removing them will make auditing easier. scrub_bio_end_io_worker() was open coding a bio_reset() - but this doesn't appear to have been needed for anything as right after it does a bio_put(), and perusing the code it doesn't appear anything else was holding a reference to the bio. The other use end_bio_extent_readpage() was just for a pr_debug() - changed it to something that might be a bit more useful. Signed-off-by: Kent Overstreet <[email protected]> CC: Chris Mason <[email protected]> CC: Stefan Behrens <[email protected]>
2012-10-01Btrfs: fix unnecessary warning when the fragments make the space alloc failMiao Xie1-1/+1
When we wrote some data by compress mode into a btrfs filesystem which was full of the fragments, the kernel will report: BTRFS warning (device xxx): Aborting unused transaction. The reason is: We can not find a long enough free space to store the compressed data because of the fragmentary free space, and the compressed data can not be splited, so the kernel outputed the above message. In fact, btrfs can deal with this problem very well: it fall back to uncompressed IO, split the uncompressed data into small ones, and then store them into to the fragmentary free space. So we shouldn't output the above warning message. Signed-off-by: Miao Xie <[email protected]>
2012-10-01Btrfs: create a pinned em when writing to a prealloc range in DIOJosef Bacik1-0/+55
Wade Cline reported a problem where he was getting garbage and warnings when writing to a preallocated range via O_DIRECT. This is because we weren't creating our normal pinned extent_map for the range we were writing to, which was causing all sorts of issues. This patch fixes the problem and makes his testcase much happier. Thanks, Reported-by: Wade Cline <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2012-10-01Btrfs: move the sb_end_intwrite until after the throttle logicJosef Bacik1-2/+2
Sage reported the following lockdep backtrace ===================================== [ BUG: bad unlock balance detected! ] 3.6.0-rc2-ceph-00171-gc7ed62d #1 Not tainted ------------------------------------- btrfs-cleaner/7607 is trying to release lock (sb_internal) at: [<ffffffffa00422ae>] btrfs_commit_transaction+0xa6e/0xb20 [btrfs] but there are no more locks to release! other info that might help us debug this: 1 lock held by btrfs-cleaner/7607: #0: (&fs_info->cleaner_mutex){+.+...}, at: [<ffffffffa003b405>] cleaner_kthread+0x95/0x120 [btrfs] stack backtrace: Pid: 7607, comm: btrfs-cleaner Not tainted 3.6.0-rc2-ceph-00171-gc7ed62d #1 Call Trace: [<ffffffffa00422ae>] ? btrfs_commit_transaction+0xa6e/0xb20 [btrfs] [<ffffffff810afa9e>] print_unlock_inbalance_bug+0xfe/0x110 [<ffffffff810b289e>] lock_release_non_nested+0x1ee/0x310 [<ffffffff81172f9b>] ? kmem_cache_free+0x7b/0x160 [<ffffffffa004106c>] ? put_transaction+0x8c/0x130 [btrfs] [<ffffffffa00422ae>] ? btrfs_commit_transaction+0xa6e/0xb20 [btrfs] [<ffffffff810b2a95>] lock_release+0xd5/0x220 [<ffffffff81173071>] ? kmem_cache_free+0x151/0x160 [<ffffffff8117d9ed>] __sb_end_write+0x7d/0x90 [<ffffffffa00422ae>] btrfs_commit_transaction+0xa6e/0xb20 [btrfs] [<ffffffff81079850>] ? __init_waitqueue_head+0x60/0x60 [<ffffffff81634c6b>] ? _raw_spin_unlock+0x2b/0x40 [<ffffffffa0042758>] __btrfs_end_transaction+0x368/0x3c0 [btrfs] [<ffffffffa0042808>] btrfs_end_transaction_throttle+0x18/0x20 [btrfs] [<ffffffffa00318f0>] btrfs_drop_snapshot+0x410/0x600 [btrfs] [<ffffffff8132babd>] ? do_raw_spin_unlock+0x5d/0xb0 [<ffffffffa00430ef>] btrfs_clean_old_snapshots+0xaf/0x150 [btrfs] [<ffffffffa003b405>] ? cleaner_kthread+0x95/0x120 [btrfs] [<ffffffffa003b419>] cleaner_kthread+0xa9/0x120 [btrfs] [<ffffffffa003b370>] ? btrfs_destroy_delayed_refs.isra.102+0x220/0x220 [btrfs] [<ffffffff810791ee>] kthread+0xae/0xc0 [<ffffffff810b379d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8163e744>] kernel_thread_helper+0x4/0x10 [<ffffffff81635430>] ? retint_restore_args+0x13/0x13 [<ffffffff81079140>] ? flush_kthread_work+0x1a0/0x1a0 [<ffffffff8163e740>] ? gs_change+0x13/0x13 This is because the throttle stuff can commit the transaction, which expects to be the one stopping the intwrite stuff, but we've already done it in the __btrfs_end_transaction. Moving the sb_end_intewrite after this logic makes the lockdep go away. Thanks, Tested-by: Sage Weil <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2012-10-01Btrfs: use larger limit for translation of logical to inodeLiu Bo2-4/+5
This is the change of the kernel side. Translation of logical to inode used to have an upper limit 4k on inode container's size, but the limit is not large enough for a data with a great many of refs, so when resolving logical address, we can end up with "ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493" This changes to regard 64k as the upper limit and use vmalloc instead of kmalloc to get memory more easily. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Liu Bo <[email protected]>
2012-10-01Btrfs: use helper for logical resolveLiu Bo1-16/+3
We already have a helper, iterate_inodes_from_logical(), for logical resolve, so just use it. Signed-off-by: Liu Bo <[email protected]>
2012-10-01Btrfs: fix a bug in parsing return value in logical resolveLiu Bo5-20/+34
In logical resolve, we parse extent_from_logical()'s 'ret' as a kind of flag. It is possible to lose our errors because (-EXXXX & BTRFS_EXTENT_FLAG_TREE_BLOCK) is true. I'm not sure if it is on purpose, it just looks too hacky if it is. I'd rather use a real flag and a 'ret' to catch errors. Acked-by: Jan Schmidt <[email protected]> Signed-off-by: Liu Bo <[email protected]>
2012-10-01Btrfs: cleanup for unused ref cache stuffliubo2-8/+0
As ref cache has been removed from btrfs, there is no user on its lock and its check. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Liu Bo <[email protected]>
2012-10-01Btrfs: fix corrupted metadata in the snapshotMiao Xie3-18/+32
When we delete a inode, we will remove all the delayed items including delayed inode update, and then truncate all the relative metadata. If there is lots of metadata, we will end the current transaction, and start a new transaction to truncate the left metadata. In this way, we will leave a inode item that its link counter is > 0, and also may leave some directory index items in fs/file tree after the current transaction ends. In other words, the metadata in this fs/file tree is inconsistent. If we create a snapshot for this tree now, we will find a inode with corrupted metadata in the new snapshot, and we won't continue to drop the left metadata, because its link counter is not 0. We fix this problem by updating the inode item before the current transaction ends. Signed-off-by: Miao Xie <[email protected]>
2012-10-01btrfs: polish names of kmem cachesDavid Sterba4-9/+9
Usecase: watch 'grep btrfs < /proc/slabinfo' easy to watch all caches in one go. Signed-off-by: David Sterba <[email protected]>
2012-10-01Btrfs: fix our overcommit mathJosef Bacik1-29/+42
I noticed I was seeing large lags when running my torrent test in a vm on my laptop. While trying to make it lag less I noticed that our overcommit math was taking into account the number of bytes we wanted to reclaim, not the number of bytes we actually wanted to allocate, which means we wouldn't overcommit as often. This patch fixes the overcommit math and makes shrink_delalloc() use that logic so that it will stop looping faster. We still have pretty high spikes of latency, but the test now takes 3 minutes less time (about 5% faster). Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-01Btrfs: wait on async pages when shrinking delallocJosef Bacik1-0/+7
Mitch reported a problem where you could get an ENOSPC error when untarring a kernel git tree onto a 16gb file system with compress-force=zlib. This is because compression is a huge pain, it will return from ->writepages() without having actually created any ordered extents. To get around this we check to see if the async submit counter is up, and if it is wait until it drops to 0 before doing our normal ordered wait dance. With this patch I can now untar a kernel git tree onto a 16gb file system without getting ENOSPC errors. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-01Btrfs: use flag EXTENT_DEFRAG for snapshot-aware defragLiu Bo5-14/+28
We're going to use this flag EXTENT_DEFRAG to indicate which range belongs to defragment so that we can implement snapshow-aware defrag: We set the EXTENT_DEFRAG flag when dirtying the extents that need defragmented, so later on writeback thread can differentiate between normal writeback and writeback started by defragmentation. Original-Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Liu Bo <[email protected]>