aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2011-03-23userns: make has_capability* into real functionsSerge E. Hallyn1-30/+4
So we can let type safety keep things sane, and as a bonus we can remove the declaration of init_user_ns in capability.h. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23userns: allow ptrace from non-init user namespacesSerge E. Hallyn1-0/+2
ptrace is allowed to tasks in the same user namespace according to the usual rules (i.e. the same rules as for two tasks in the init user namespace). ptrace is also allowed to a user namespace to which the current task the has CAP_SYS_PTRACE capability. Changelog: Dec 31: Address feedback by Eric: . Correct ptrace uid check . Rename may_ptrace_ns to ptrace_capable . Also fix the cap_ptrace checks. Jan 1: Use const cred struct Jan 11: use task_ns_capable() in place of ptrace_capable(). Feb 23: same_or_ancestore_user_ns() was not an appropriate check to constrain cap_issubset. Rather, cap_issubset() only is meaningful when both capsets are in the same user_ns. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23userns: allow sethostname in a containerSerge E. Hallyn1-3/+3
Changelog: Feb 23: let clone_uts_ns() handle setting uts->user_ns To do so we need to pass in the task_struct who'll get the utsname, so we can get its user_ns. Feb 23: As per Oleg's coment, just pass in tsk, instead of two of its members. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23userns: security: make capabilities relative to the user namespaceSerge E. Hallyn3-20/+48
- Introduce ns_capable to test for a capability in a non-default user namespace. - Teach cap_capable to handle capabilities in a non-default user namespace. The motivation is to get to the unprivileged creation of new namespaces. It looks like this gets us 90% of the way there, with only potential uid confusion issues left. I still need to handle getting all caps after creation but otherwise I think I have a good starter patch that achieves all of your goals. Changelog: 11/05/2010: [serge] add apparmor 12/14/2010: [serge] fix capabilities to created user namespaces Without this, if user serge creates a user_ns, he won't have capabilities to the user_ns he created. THis is because we were first checking whether his effective caps had the caps he needed and returning -EPERM if not, and THEN checking whether he was the creator. Reverse those checks. 12/16/2010: [serge] security_real_capable needs ns argument in !security case 01/11/2011: [serge] add task_ns_capable helper 01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion 02/16/2011: [serge] fix a logic bug: the root user is always creator of init_user_ns, but should not always have capabilities to it! Fix the check in cap_capable(). 02/21/2011: Add the required user_ns parameter to security_capable, fixing a compile failure. 02/23/2011: Convert some macros to functions as per akpm comments. Some couldn't be converted because we can't easily forward-declare them (they are inline if !SECURITY, extern if SECURITY). Add a current_user_ns function so we can use it in capability.h without #including cred.h. Move all forward declarations together to the top of the #ifdef __KERNEL__ section, and use kernel-doc format. 02/23/2011: Per dhowells, clean up comment in cap_capable(). 02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable. (Original written and signed off by Eric; latest, modified version acked by him) [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: export current_user_ns() for ecryptfs] [serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23userns: add a user_namespace as creator/owner of uts_namespaceSerge E. Hallyn1-0/+4
The expected course of development for user namespaces targeted capabilities is laid out at https://wiki.ubuntu.com/UserNamespace. Goals: - Make it safe for an unprivileged user to unshare namespaces. They will be privileged with respect to the new namespace, but this should only include resources which the unprivileged user already owns. - Provide separate limits and accounting for userids in different namespaces. Status: Currently (as of 2.6.38) you can clone with the CLONE_NEWUSER flag to get a new user namespace if you have the CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID capabilities. What this gets you is a whole new set of userids, meaning that user 500 will have a different 'struct user' in your namespace than in other namespaces. So any accounting information stored in struct user will be unique to your namespace. However, throughout the kernel there are checks which - simply check for a capability. Since root in a child namespace has all capabilities, this means that a child namespace is not constrained. - simply compare uid1 == uid2. Since these are the integer uids, uid 500 in namespace 1 will be said to be equal to uid 500 in namespace 2. As a result, the lxc implementation at lxc.sf.net does not use user namespaces. This is actually helpful because it leaves us free to develop user namespaces in such a way that, for some time, user namespaces may be unuseful. Bugs aside, this patchset is supposed to not at all affect systems which are not actively using user namespaces, and only restrict what tasks in child user namespace can do. They begin to limit privilege to a user namespace, so that root in a container cannot kill or ptrace tasks in the parent user namespace, and can only get world access rights to files. Since all files currently belong to the initila user namespace, that means that child user namespaces can only get world access rights to *all* files. While this temporarily makes user namespaces bad for system containers, it starts to get useful for some sandboxing. I've run the 'runltplite.sh' with and without this patchset and found no difference. This patch: copy_process() handles CLONE_NEWUSER before the rest of the namespaces. So in the case of clone(CLONE_NEWUSER|CLONE_NEWUTS) the new uts namespace will have the new user namespace as its owner. That is what we want, since we want root in that new userns to be able to have privilege over it. Changelog: Feb 15: don't set uts_ns->user_ns if we didn't create a new uts_ns. Feb 23: Move extern init_user_ns declaration from init/version.c to utsname.h. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23pid: remove the child_reaper special case in init/main.cEric W. Biederman1-0/+11
This patchset is a cleanup and a preparation to unshare the pid namespace. These prerequisites prepare for Eric's patchset to give a file descriptor to a namespace and join an existing namespace. This patch: It turns out that the existing assignment in copy_process of the child_reaper can handle the initial assignment of child_reaper we just need to generalize the test in kernel/fork.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23rapidio: modify mport ID assignmentAlexandre Bounine1-0/+1
Changes mport ID and host destination ID assignment to implement unified method common to all mport drivers. Makes "riohdid=" kernel command line parameter common for all architectures with support for more that one host destination ID assignment. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Thomas Moll <thomas.moll@sysgo.com> Cc: Micha Nelissen <micha@neli.hopto.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23rapidio: modify subsystem and driver initialization sequenceAlexandre Bounine1-1/+0
Subsystem initialization sequence modified to support presence of multiple RapidIO controllers in the system. The new sequence is compatible with initialization of PCI devices. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Thomas Moll <thomas.moll@sysgo.com> Cc: Micha Nelissen <micha@neli.hopto.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23rapidio: add architecture specific callbacksAlexandre Bounine2-8/+22
This set of patches eliminates RapidIO dependency on PowerPC architecture and makes it available to other architectures (x86 and MIPS). It also enables support of new platform independent RapidIO controllers such as PCI-to-SRIO and PCI Express-to-SRIO. This patch: Extend number of mport callback functions to eliminate direct linking of architecture specific mport operations. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Thomas Moll <thomas.moll@sysgo.com> Cc: Micha Nelissen <micha@neli.hopto.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23proc: make struct proc_dir_entry::namelen unsigned intAlexey Dobriyan1-1/+1
1. namelen is declared "unsigned short" which hints for "maybe space savings". Indeed in 2.4 struct proc_dir_entry looked like: struct proc_dir_entry { unsigned short low_ino; unsigned short namelen; Now, low_ino is "unsigned int", all savings were gone for a long time. "struct proc_dir_entry" is not that countless to worry about it's size, anyway. 2. converting from unsigned short to int/unsigned int can only create problems, we better play it safe. Space is not really conserved, because of natural alignment for the next field. sizeof(struct proc_dir_entry) remains the same. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: convert uncharge batching from bytes to page granularityJohannes Weiner1-2/+2
We never uncharge subpage quantities. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: remove direct page_cgroup-to-page pointerJohannes Weiner1-17/+58
In struct page_cgroup, we have a full word for flags but only a few are reserved. Use the remaining upper bits to encode, depending on configuration, the node or the section, to enable page_cgroup-to-page lookups without a direct pointer. This saves a full word for every page in a system with memory cgroups enabled. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: fold __mem_cgroup_move_account into callerJohannes Weiner1-5/+0
It is one logical function, no need to have it split up. Also, get rid of some checks from the inner function that ensured the sanity of the outer function. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: change page_cgroup_zoneinfo signatureJohannes Weiner1-10/+0
Instead of passing a whole struct page_cgroup to this function, let it take only what it really needs from it: the struct mem_cgroup and the page. This has the advantage that reading pc->mem_cgroup is now done at the same place where the ordering rules for this pointer are enforced and explained. It is also in preparation for removing the pc->page backpointer. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: add memcg sanity checks at allocating and freeing pagesDaisuke Nishimura1-0/+17
Add checks at allocating or freeing a page whether the page is used (iow, charged) from the view point of memcg. This check may be useful in debugging a problem and we did similar checks before the commit 52d4b9ac(memcg: allocate all page_cgroup at boot). This patch adds some overheads at allocating or freeing memory, so it's enabled only when CONFIG_DEBUG_VM is enabled. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: simplify the way memory limits are checkedJohannes Weiner1-58/+14
Since transparent huge pages, checking whether memory cgroups are below their limits is no longer enough, but the actual amount of chargeable space is important. To not have more than one limit-checking interface, replace memory_cgroup_check_under_limit() and memory_cgroup_check_margin() with a single memory_cgroup_margin() that returns the chargeable space and leaves the comparison to the callsite. Soft limits are now checked the other way round, by using the already existing function that returns the amount by which soft limits are exceeded: res_counter_soft_limit_excess(). Also remove all the corresponding functions on the res_counter side that are now no longer used. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23memcg: soft limit reclaim should end at limit not belowJohannes Weiner1-2/+2
Soft limit reclaim continues until the usage is below the current soft limit, but the documented semantics are actually that soft limit reclaim will push usage back until the soft limits are met again. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23bitops: remove minix bitops from asm/bitops.hAkinobu Mita3-31/+0
minix bit operations are only used by minix filesystem and useless by other modules. Because byte order of inode and block bitmaps is different on each architecture like below: m68k: big-endian 16bit indexed bitmaps h8300, microblaze, s390, sparc, m68knommu: big-endian 32 or 64bit indexed bitmaps m32r, mips, sh, xtensa: big-endian 32 or 64bit indexed bitmaps for big-endian mode little-endian bitmaps for little-endian mode Others: little-endian bitmaps In order to move minix bit operations from asm/bitops.h to architecture independent code in minix filesystem, this provides two config options. CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k. CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu, m32r, mips, sh, xtensa). The architectures which always use little-endian bitmaps do not select these options. Finally, we can remove minix bit operations from asm/bitops.h for all architectures. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Greg Ungerer <gerg@uclinux.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Hirokazu Takata <takata@linux-m32r.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23bitops: remove ext2 non-atomic bitops from asm/bitops.hAkinobu Mita2-19/+0
As the result of conversions, there are no users of ext2 non-atomic bit operations except for ext2 filesystem itself. Now we can put them into architecture independent code in ext2 filesystem, and remove from asm/bitops.h for all architectures. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23reiserfs: use little-endian bitopsAkinobu Mita1-14/+13
As a preparation for removing ext2 non-atomic bit operations from asm/bitops.h. This converts ext2 non-atomic bit operations to little-endian bit operations. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23ext3: use little-endian bitopsAkinobu Mita1-5/+5
As a preparation for removing ext2 non-atomic bit operations from asm/bitops.h. This converts ext2 non-atomic bit operations to little-endian bit operations. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23asm-generic: use little-endian bitopsAkinobu Mita1-2/+2
As a preparation for removing ext2 non-atomic bit operations from asm/bitops.h. This converts ext2 non-atomic bit operations to little-endian bit operations. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23bitops: introduce little-endian bitops for most architecturesAkinobu Mita3-4/+1
Introduce little-endian bit operations to the big-endian architectures which do not have native little-endian bit operations and the little-endian architectures. (alpha, avr32, blackfin, cris, frv, h8300, ia64, m32r, mips, mn10300, parisc, sh, sparc, tile, x86, xtensa) These architectures can just include generic implementation (asm-generic/bitops/le.h). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23asm-generic: change little-endian bitops to take any pointer typesAkinobu Mita1-24/+53
This makes the little-endian bitops take any pointer types by changing the prototypes and adding casts in the preprocessor macros. That would seem to at least make all the filesystem code happier, and they can continue to do just something like #define ext2_set_bit __test_and_set_bit_le (or whatever the exact sequence ends up being). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23asm-generic: rename generic little-endian bitops functionsAkinobu Mita3-25/+25
As a preparation for providing little-endian bitops for all architectures, This renames generic implementation of little-endian bitops. (remove "generic_" prefix and postfix "_le") s/generic_find_next_le_bit/find_next_bit_le/ s/generic_find_next_zero_le_bit/find_next_zero_bit_le/ s/generic_find_first_zero_le_bit/find_first_zero_bit_le/ s/generic___test_and_set_le_bit/__test_and_set_bit_le/ s/generic___test_and_clear_le_bit/__test_and_clear_bit_le/ s/generic_test_le_bit/test_bit_le/ s/generic___set_le_bit/__set_bit_le/ s/generic___clear_le_bit/__clear_bit_le/ s/generic_test_and_set_le_bit/test_and_set_bit_le/ s/generic_test_and_clear_le_bit/test_and_clear_bit_le/ Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23bitops: merge little and big endian definisions in asm-generic/bitops/le.hAkinobu Mita1-26/+20
This patch series introduces little-endian bit operations in asm/bitops.h for all architectures and converts all ext2 non-atomic and minix bit operations to use little-endian bit operations. It enables us to remove ext2 non-atomic and minix bit operations from asm/bitops.h. The reason they should be removed from asm/bitops.h is as follows: For ext2 non-atomic bit operations, they are used for little-endian byte order bitmap access by some filesystems and modules. But using ext2_*() functions on a module other than ext2 filesystem makes some feel strange. For minix bit operations, they are only used by minix filesystem and are useless by other modules. Because byte order of inode and block bitmap is This patch: In order to make the forthcoming changes smaller, this merges macro definisions in asm-generic/bitops/le.h for big-endian and little-endian as much as possible. This also removes unused BITOP_WORD macro. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23Introduce ARCH_NO_SYSDEV_OPS config option (v2)Rafael J. Wysocki3-4/+17
Introduce Kconfig option allowing architectures where sysdev operations used during system suspend, resume and shutdown have been completely replaced with struct sycore_ops operations to avoid building sysdev code that will never be used. Make callbacks in struct sys_device and struct sysdev_driver depend on ARCH_NO_SYSDEV_OPS to allows us to verify if all of the references have been actually removed from the code the given architecture depends on. Make x86 select ARCH_NO_SYSDEV_OPS. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-23dt: eliminate OF_NO_DEEP_PROBE and test for NULL match tableGrant Likely1-3/+0
There are no users of OF_NO_DEEP_PROBE, and of_match_node() now gracefully handles being passed a NULL pointer, so the checks at the top of of_platform_bus_probe can be dropped. While at it, consolidate the root node pointer check to be easier to read and tidy up related comments. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-03-23mm: implement access_remote_vmStephen Wilson1-0/+2
Provide an alternative to access_process_vm that allows the caller to obtain a reference to the supplied mm_struct. Signed-off-by: Stephen Wilson <wilsons@start.ca> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23mm: arch: rename in_gate_area_no_task to in_gate_area_no_mmStephen Wilson1-3/+3
Now that gate vma's are referenced with respect to a particular mm and not a particular task it only makes sense to propagate the change to this predicate as well. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23mm: arch: make in_gate_area take an mm_struct instead of a task_structStephen Wilson1-2/+2
Morally, the question of whether an address lies in a gate vma should be asked with respect to an mm, not a particular task. Moreover, dropping the dependency on task_struct will help make existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23mm: arch: make get_gate_vma take an mm_struct instead of a task_structStephen Wilson1-1/+1
Morally, the presence of a gate vma is more an attribute of a particular mm than a particular task. Moreover, dropping the dependency on task_struct will help make both existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23NFSv4.1: layoutcommitAndy Adamson3-0/+25
The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to the data server (as opposed to the MDS) and the data server WRITE is not NFS_FILE_SYNC. Only whole file layout support means that there is only one IOMODE_RW layout segment. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn> Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn> Tested-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23NFSv4.1: filelayout driver specific code for COMMITFred Isaman2-0/+2
Implement all the hooks created in the previous patches. This requires exporting quite a few functions and adding a few structure fields. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23NFSv4.1: add generic layer hooks for pnfs COMMITFred Isaman2-1/+6
We create three major hooks for the pnfs code. pnfs_mark_request_commit() is called during writeback_done from nfs_mark_request_commit, which gives the driver an opportunity to claim it wants control over commiting a particular req. pnfs_choose_commit_list() is called from nfs_scan_list to choose which list a given req should be added to, based on where we intend to send it for COMMIT. It is up to the driver to have preallocated list headers for each destination it may need. pnfs_commit_list() is how the driver actually takes control, it is used instead of nfs_commit_list(). In order to pass information between the above functions, we create a union in nfs_page to hold a lseg (which is possible because the req is not on any list while in transition), and add some flags to indicate if we need to use the pnfs code. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23mlx4: Add blue flame support for kernel consumersEli Cohen2-0/+14
Using blue flame can improve latency by allowing the HW to more efficiently access the WQE. This patch presents two functions that are used to allocate or release HW resources for using blue flame; the caller need to supply a struct mlx4_bf object when allocating resources. Consumers that make use of this API should post doorbells to the UAR object pointed by the initialized struct mlx4_bf; Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23mlx4_en: Enabling new steeringYevgeny Petrilin1-3/+9
The mlx4_en module now uses the new steering mechanism. The RX packets are now steered through the MCG table instead of Mac table for unicast, and default entry for multicast. The feature is enabled through INIT_HCA Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23mlx4: generalization of multicast steering.Yevgeny Petrilin1-2/+12
The same packet steering mechanism would be used both for IB and Ethernet, Both multicasts and unicasts. This commit prepares the general infrastructure for this. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23mlx4_en: Reporting HW revision in ethtool -iYevgeny Petrilin1-1/+1
HW revision is derived from device ID and rev id. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23mlx4: Wake on LAN supportYevgeny Petrilin1-0/+4
The driver queries the FW for WOL support. Ethtool get/set_wol is implemented accordingly. Only magic packets are supported at the time. Signed-off-by: Igor Yarovinsky <igory@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23mlx4: Changing interrupt schemeYevgeny Petrilin1-0/+8
Adding a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core customers (mlx4_ib, mlx4_en). The consumers will assign their own names to the interrupt vectors. Those vectors are not opened at mlx4 device initialization, opened by demand. Changed the max number of possible EQs according to the new scheme, no longer relies on on number of cores. The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq(). Customers that do not use the new API will get completion vectors as before. Signed-off-by: Markuze Alex <markuze@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23genirq: Provide locked setter for chip, handler, nameThomas Gleixner1-5/+24
Some irq_set_type() callbacks need to change the chip and the handler when the trigger mode changes. We have already a (misnomed) setter function for the handler which can be called from irq_set_type(). Provide one which allows to set chip and name as well. Put the misnomed function under the COMPAT switch and provide a replacement. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-23genirq: Provide a lockdep helperThomas Gleixner1-0/+9
Some irq chips need to call genirq functions for nested chips from their callbacks. That upsets lockdep. So they need to set a different lock class for those nested chips. Provide a helper function to avoid open access to irq_desc. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-23genirq; Remove the last leftovers of the old sparse irq codeThomas Gleixner1-7/+0
All users converted. Get rid of it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-23NFS: Detect loops in a readdir due to bad cookiesBryan Schumaker1-0/+2
Some filesystems (such as ext4) can return the same cookie value for multiple files. If we try to start a readdir with one of these cookies, the server will return the first file found with a cookie of the same value. This can cause the client to enter an infinite loop. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23NFS: Create nfs_open_dir_contextBryan Schumaker1-0/+3
nfs_opendir() created a context that held much more information than we need for a readdir. This patch introduces a slimmed-down nfs_open_dir_context that contains only the cookie and the cred used for RPC operations. The new context will eventually be used to help detect readdir loops. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23[SCSI] libiscsi_tcp: use kmap in xmit pathMike Christie1-0/+1
The xmit path can sleep with a page kmapped in the network xmit code while it waits for space to open up, so we have to use kmap instead of kmap atomic in that path. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-23[SCSI] target: add initial statisticsNicholas Bellinger2-2/+37
This patch adds a target_core_mib.c statistics conversion for backend context struct se_subsystem_dev + struct se_device config_group based statistics in target_core_device.c using CONFIGFS_EATTR() based struct config_item_types from target_core_stat.c code. The conversion from backend /proc/scsi_target/mib/ context output to configfs default groups+attributes include scsi_dev, scsi_lu, and scsi_tgt_dev output from within individual: /sys/kernel/config/target/core/$HBA/DEV/ The legacy procfs output now appear as individual configfs attributes under: *) $HBA/$DEV/statistics/scsi_dev: |-- indx |-- inst |-- ports `-- role *) $HBA/$DEV/statistics/scsi_lu: |-- creation_time |-- dev |-- dev_type |-- full_stat |-- hs_num_cmds |-- indx |-- inst |-- lu_name |-- lun |-- num_cmds |-- prod |-- read_mbytes |-- resets |-- rev |-- state_bit |-- status |-- vend `-- write_mbytes *) $HBA/$DEV/statistics/scsi_tgt_dev: |-- indx |-- inst |-- non_access_lus |-- num_lus |-- resets `-- status The conversion from backend /proc/scsi_target/mib/ context output to configfs default groups+attributes include scsi_port, scsi_tgt_port and scsi_transport output from within individual: /sys/kernel/config/target/fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/ The legacy procfs output now appear as individual configfs attributes under: *) fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/scsi_port |-- busy_count |-- dev |-- indx |-- inst `-- role *) fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/scsi_tgt_port |-- dev |-- hs_in_cmds |-- in_cmds |-- indx |-- inst |-- name |-- port_index |-- read_mbytes `-- write_mbytes *) fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/scsi_transport |-- dev_name |-- device |-- indx `-- inst The conversion from backend /proc/scsi_target/mib/ context output to configfs default groups+attributes include scsi_att_intr_port and scsi_auth_intr output from within individual: /sys/kernel/config/target/fabric/$WWN/tpgt_$TPGT/acls/$INITIATOR_WWN/lun_$LUN_ID/statistics/ The legacy procfs output now appear as individual configfs attributes under: *) acls/$INITIATOR_WWN/lun_$LUN_ID/statistics/scsi_att_intr_port |-- dev |-- indx |-- inst |-- port |-- port_auth_indx `-- port_ident *) acls/$INITIATOR_WWN/lun_$LUN_ID/statistics/scsi_auth_intr |-- att_count |-- creation_time |-- dev |-- dev_or_port |-- hs_num_cmds |-- indx |-- inst |-- intr_name |-- map_indx |-- num_cmds |-- port |-- read_mbytes |-- row_status `-- write_mbytes Also, this includes adding struct target_fabric_configfs_template-> tfc_wwn_fabric_stats_cit and ->tfc_tpg_nacl_stat_cit respectively for use during target_core_fabric_configfs.c:target_fabric_setup_cits() Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-23[SCSI] target: update version to v4.0.0-rc7-mlNicholas Bellinger1-1/+1
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-03-23[SCSI] target: Convert TMR REQ/RSP definitions to target namespaceNicholas Bellinger1-30/+22
This patch changes include/target/target_core_tmr.h code to use target specific 'TMR_*' prefixed definitions for fabric independent SCSI Task Management Request/Request naming in include/scsi/scsi.h definitions for mainline target code. Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>