aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2010-10-21SELinux: drop useless (and incorrect) AVTAB_MAX_SIZEEric Paris2-3/+2
AVTAB_MAX_SIZE was a define which was supposed to be used in userspace to define a maximally sized avtab when userspace wasn't sure how big of a table it needed. It doesn't make sense in the kernel since we always know our table sizes. The only place it is used we have a more appropiately named define called AVTAB_MAX_HASH_BUCKETS, use that instead. Signed-off-by: Eric Paris <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21SELinux: deterministic ordering of range transition rulesEric Paris1-3/+13
Range transition rules are placed in the hash table in an (almost) arbitrary order. This patch inserts them in a fixed order to make policy retrival more predictable. Signed-off-by: Eric Paris <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21kernel: roundup should only reference arguments onceEric Paris1-1/+6
Currently the roundup macro references it's arguments more than one time. This patch changes it so it will only use its arguments once. Suggested-by: Andrew Morton <[email protected]> Signed-off-by: Eric Paris <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21kernel: rounddown helper functionEric Paris1-0/+6
The roundup() helper function will round a given value up to a multiple of another given value. aka roundup(11, 7) would give 14 = 7 * 2. This new function does the opposite. It will round a given number down to the nearest multiple of the second number: rounddown(11, 7) would give 7. I need this in some future SELinux code and can carry the macro myself, but figured I would put it in the core kernel so others might find and use it if need be. Signed-off-by: Eric Paris <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21secmark: export secctx, drop secmark in procfsEric Paris2-6/+50
The current secmark code exports a secmark= field which just indicates if there is special labeling on a packet or not. We drop this field as it isn't particularly useful and instead export a new field secctx= which is the actual human readable text label. Signed-off-by: Eric Paris <[email protected]> Acked-by: Patrick McHardy <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21conntrack: export lsm context rather than internal secid via netlinkEric Paris2-11/+45
The conntrack code can export the internal secid to userspace. These are dynamic, can change on lsm changes, and have no meaning in userspace. We should instead be sending lsm contexts to userspace instead. This patch sends the secctx (rather than secid) to userspace over the netlink socket. We use a new field CTA_SECCTX and stop using the the old CTA_SECMARK field since it did not send particularly useful information. Signed-off-by: Eric Paris <[email protected]> Reviewed-by: Paul Moore <[email protected]> Acked-by: Patrick McHardy <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21security: secid_to_secctx returns len when data is NULLEric Paris3-4/+16
With the (long ago) interface change to have the secid_to_secctx functions do the string allocation instead of having the caller do the allocation we lost the ability to query the security server for the length of the upcoming string. The SECMARK code would like to allocate a netlink skb with enough length to hold the string but it is just too unclean to do the string allocation twice or to do the allocation the first time and hold onto the string and slen. This patch adds the ability to call security_secid_to_secctx() with a NULL data pointer and it will just set the slen pointer. Signed-off-by: Eric Paris <[email protected]> Reviewed-by: Paul Moore <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21secmark: make secmark object handling genericEric Paris10-141/+104
Right now secmark has lots of direct selinux calls. Use all LSM calls and remove all SELinux specific knowledge. The only SELinux specific knowledge we leave is the mode. The only point is to make sure that other LSMs at least test this generic code before they assume it works. (They may also have to make changes if they do not represent labels as strings) Signed-off-by: Eric Paris <[email protected]> Acked-by: Paul Moore <[email protected]> Acked-by: Patrick McHardy <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21secmark: do not return early if there was no errorEric Paris1-1/+1
Commit 4a5a5c73 attempted to pass decent error messages back to userspace for netfilter errors. In xt_SECMARK.c however the patch screwed up and returned on 0 (aka no error) early and didn't finish setting up secmark. This results in a kernel BUG if you use SECMARK. Signed-off-by: Eric Paris <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21AppArmor: Ensure the size of the copy is < the buffer allocated to hold itJohn Johansen1-1/+3
Actually I think in this case the appropriate thing to do is to BUG as there is currently a case (remove) where the alloc_size needs to be larger than the copy_size, and if copy_size is ever greater than alloc_size there is a mistake in the caller code. Signed-off-by: John Johansen <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21TOMOYO: Print URL information before panic().Tetsuo Handa1-1/+10
Configuration files for TOMOYO 2.3 are not compatible with TOMOYO 2.2. But current panic() message is too unfriendly and is confusing users. Signed-off-by: Tetsuo Handa <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21security: remove unused parameter from security_task_setscheduler()KOSAKI Motohiro8-26/+17
All security modules shouldn't change sched_param parameter of security_task_setscheduler(). This is not only meaningless, but also make a harmful result if caller pass a static variable. This patch remove policy and sched_param parameter from security_task_setscheduler() becuase none of security module is using it. Cc: James Morris <[email protected]> Signed-off-by: KOSAKI Motohiro <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21tpm: change 'tpm_suspend_pcr' to be module parameterDmitry Torokhov1-12/+10
Fix the following warning: drivers/char/tpm/tpm.c:1085: warning: `tpm_suspend_setup' defined but not used and make the workaround operable in case when TPM is compiled as a module. As a side-effect the option will be called tpm.suspend_pcr. Signed-off-by: Dmitry Torokhov <[email protected]> Cc: Rajiv Andrade <[email protected]> Cc: David Safford <[email protected]> Cc: James Morris <[email protected]> Cc: Debora Velarde <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21selinux: fix up style problem on /selinux/statusKaiGai Kohei2-11/+7
This patch fixes up coding-style problem at this commit: 4f27a7d49789b04404eca26ccde5f527231d01d5 selinux: fast status update interface (/selinux/status) Signed-off-by: KaiGai Kohei <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21selinux: change to new flag variablematt mooney1-1/+1
Replace EXTRA_CFLAGS with ccflags-y. Signed-off-by: matt mooney <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21selinux: really fix dependency causing parallel compile failure.Paul Gortmaker2-20/+6
While the previous change to the selinux Makefile reduced the window significantly for this failure, it is still possible to see a compile failure where cpp starts processing selinux files before the auto generated flask.h file is completed. This is easily reproduced by adding the following temporary change to expose the issue everytime: - cmd_flask = scripts/selinux/genheaders/genheaders ... + cmd_flask = sleep 30 ; scripts/selinux/genheaders/genheaders ... This failure happens because the creation of the object files in the ss subdir also depends on flask.h. So simply incorporate them into the parent Makefile, as the ss/Makefile really doesn't do anything unique. With this change, compiling of all selinux files is dependent on completion of the header file generation, and this test case with the "sleep 30" now confirms it is functioning as expected. Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21selinux: fix parallel compile errorPaul Gortmaker1-1/+1
Selinux has an autogenerated file, "flask.h" which is included by two other selinux files. The current makefile has a single dependency on the first object file in the selinux-y list, assuming that will get flask.h generated before anyone looks for it, but that assumption breaks down in a "make -jN" situation and you get: selinux/selinuxfs.c:35: fatal error: flask.h: No such file or directory compilation terminated. remake[9]: *** [security/selinux/selinuxfs.o] Error 1 Since flask.h is included by security.h which in turn is included nearly everywhere, make the dependency apply to all of the selinux-y list of objs. Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21selinux: fast status update interface (/selinux/status)KaiGai Kohei5-1/+210
This patch provides a new /selinux/status entry which allows applications read-only mmap(2). This region reflects selinux_kernel_status structure in kernel space. struct selinux_kernel_status { u32 length; /* length of this structure */ u32 sequence; /* sequence number of seqlock logic */ u32 enforcing; /* current setting of enforcing mode */ u32 policyload; /* times of policy reloaded */ u32 deny_unknown; /* current setting of deny_unknown */ }; When userspace object manager caches access control decisions provided by SELinux, it needs to invalidate the cache on policy reload and setenforce to keep consistency. However, the applications need to check the kernel state for each accesses on userspace avc, or launch a background worker process. In heuristic, frequency of invalidation is much less than frequency of making access control decision, so it is annoying to invoke a system call to check we don't need to invalidate the userspace cache. If we can use a background worker thread, it allows to receive invalidation messages from the kernel. But it requires us an invasive coding toward the base application in some cases; E.g, when we provide a feature performing with SELinux as a plugin module, it is unwelcome manner to launch its own worker thread from the module. If we could map /selinux/status to process memory space, application can know updates of selinux status; policy reload or setenforce. A typical application checks selinux_kernel_status::sequence when it tries to reference userspace avc. If it was changed from the last time when it checked userspace avc, it means something was updated in the kernel space. Then, the application can reset userspace avc or update current enforcing mode, without any system call invocations. This sequence number is updated according to the seqlock logic, so we need to wait for a while if it is odd number. Signed-off-by: KaiGai Kohei <[email protected]> Acked-by: Eric Paris <[email protected]> -- security/selinux/include/security.h | 21 ++++++ security/selinux/selinuxfs.c | 56 +++++++++++++++ security/selinux/ss/Makefile | 2 +- security/selinux/ss/services.c | 3 + security/selinux/ss/status.c | 129 +++++++++++++++++++++++++++++++++++ 5 files changed, 210 insertions(+), 1 deletions(-) Signed-off-by: James Morris <[email protected]>
2010-10-21.gitignore: ignore apparmor/rlim_names.hYong Zhang1-0/+1
Signed-off-by: Yong Zhang <[email protected]> Signed-off-by: John Johansen <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21LSM: Fix security_module_enable() error.Tetsuo Handa1-10/+2
We can set default LSM module to DAC (which means "enable no LSM module"). If default LSM module was set to DAC, security_module_enable() must return 0 unless overridden via boot time parameter. Signed-off-by: Tetsuo Handa <[email protected]> Acked-by: Serge E. Hallyn <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21selinux: type_bounds_sanity_check has a meaningless variable declarationEric Paris1-2/+2
type is not used at all, stop declaring and assigning it. Signed-off-by: Eric Paris <[email protected]> Acked-by: Stephen Smalley <[email protected]> Signed-off-by: James Morris <[email protected]>
2010-10-21tomoyo: cleanup. don't store bogus pointerDan Carpenter1-2/+4
If domain is NULL then &domain->list is a bogus address. Let's leave head->r.domain NULL instead of saving an unusable pointer. This is just a cleanup. The current code always checks head->r.eof before dereferencing head->r.domain. Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Tetsuo Handa <[email protected]>
2010-10-20x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64GFUJITA Tomonori1-0/+3
Set CONFIG_ARCH_DMA_ADDR_T_64BIT when we set dma_addr_t to 64 bits in <asm/types.h>; this allows Kconfig decisions based on this property. Signed-off-by: FUJITA Tomonori <[email protected]> LKML-Reference: <[email protected]> Acked-by: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-10-20ceph: do not carry i_lock for readdir from dcacheSage Weil2-26/+18
We were taking dcache_lock inside of i_lock, which introduces a dependency not found elsewhere in the kernel, complicationg the vfs locking scalability work. Since we don't actually need it here anyway, remove it. We only need i_lock to test for the I_COMPLETE flag, so be careful to do so without dcache_lock held. Signed-off-by: Sage Weil <[email protected]>
2010-10-20fs/ceph/xattr.c: Use kmemdupJulia Lawall1-2/+1
Convert a sequence of kmalloc and memcpy to use kmemdup. The semantic patch that performs this transformation is: (http://coccinelle.lip6.fr/) // <smpl> @@ expression a,flag,len; expression arg,e1,e2; statement S; @@ a = - \(kmalloc\|kzalloc\)(len,flag) + kmemdup(arg,len,flag) <... when != a if (a == NULL || ...) S ...> - memcpy(a,arg,len+1); // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20rbd: passing wrong variable to bvec_kunmap_irq()Dan Carpenter1-1/+1
We should be passing "buf" here insead of "bv". This is tricky because it's not the same as kmap() and kunmap(). GCC does warn about it if you compile on i386 with CONFIG_HIGHMEM. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20rbd: null vs ERR_PTRDan Carpenter1-2/+2
ceph_alloc_page_vector() returns ERR_PTR(-ENOMEM) on errors. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: fix num_pages_free accounting in pagelistSage Weil1-0/+1
Decrement the free page counter when removing a page from the free_list. Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: add CEPH_MDS_OP_SETDIRLAYOUT and associated ioctl.Greg Farnum3-1/+70
Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: don't crash when passed bad mount optionsYehuda Sadeh1-1/+1
This only happened when parse_extra_token was not passed to ceph_parse_option() (hence, only happened in rbd). Signed-off-by: Yehuda Sadeh <[email protected]>
2010-10-20ceph: fix debugfs warningsRandy Dunlap1-1/+2
Include "super.h" outside of CONFIG_DEBUG_FS to eliminate a compiler warning: fs/ceph/debugfs.c:266: warning: 'struct ceph_fs_client' declared inside parameter list fs/ceph/debugfs.c:266: warning: its scope is only this definition or declaration, which is probably not what you want fs/ceph/debugfs.c:271: warning: 'struct ceph_fs_client' declared inside parameter list Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Yehuda Sadeh <[email protected]>
2010-10-20block: rbd: removing unnecessary testYehuda Sadeh1-4/+0
rbd_get_segment() can't return a negative value, we don't need to check the return output. Signed-off-by: Yehuda Sadeh <[email protected]>
2010-10-20block: rbd: fixed may leaksVasiliy Kulikov1-6/+8
rbd_client_create() doesn't free rbdc, this leads to many leaks. seg_len in rbd_do_op() is unsigned, so (seg_len < 0) makes no sense. Also if fixed check fails then seg_name is leaked. Signed-off-by: Vasiliy Kulikov <[email protected]> Signed-off-by: Yehuda Sadeh <[email protected]>
2010-10-20ceph: switch from BKL to lock_flocks()Sage Weil1-5/+6
Switch from using the BKL explicitly to the new lock_flocks() interface. Eventually this will turn into a spinlock. Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: preallocate flock state without locks heldGreg Farnum2-15/+44
When the lock_kernel() turns into lock_flocks() and a spinlock, we won't be able to do allocations with the lock held. Preallocate space without the lock, and retry if the lock state changes out from underneath us. Signed-off-by: Greg Farnum <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: add pagelist_reserve, pagelist_truncate, pagelist_set_cursorGreg Farnum2-9/+118
These facilitate preallocation of pages so that we can encode into the pagelist in an atomic context. Signed-off-by: Greg Farnum <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: use mapping->nrpages to determine if mapping is emptySage Weil1-12/+1
This is simpler and faster. Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: only invalidate on check_caps if we actually have pagesSage Weil1-1/+1
The i_rdcache_gen value only implies we MAY have cached pages; actually check the mapping to see if it's worth bothering with an invalidate. Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: do not hide .snap in root directorySage Weil1-1/+0
Snaps in the root directory are now supported by the MDS, and harmless on older versions. Signed-off-by: Sage Weil <[email protected]>
2010-10-20rbd: introduce rados block device (rbd), based on libcephYehuda Sadeh6-2/+1944
The rados block device (rbd), based on osdblk, creates a block device that is backed by objects stored in the Ceph distributed object storage cluster. Each device consists of a single metadata object and data striped over many data objects. The rbd driver supports read-only snapshots. Signed-off-by: Yehuda Sadeh <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: factor out libceph from Ceph file systemYehuda Sadeh73-1838/+2566
This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well: - ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces. - Mount option parsing and debugfs setup is correspondingly broken into two pieces. - The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case). - The basic supported/required feature bits can be expanded (and are by ceph_fs_client). No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process. Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph-rbd: osdc support for osd call and rollback operationsYehuda Sadeh3-0/+29
This will be used for rbd snapshots administration. Signed-off-by: Yehuda Sadeh <[email protected]>
2010-10-20ceph: messenger and osdc changes for rbdYehuda Sadeh6-101/+436
Allow the messenger to send/receive data in a bio. This is added so that we wouldn't need to copy the data into pages or some other buffer when doing IO for an rbd block device. We can now have trailing variable sized data for osd ops. Also osd ops encoding is more modular. Signed-off-by: Yehuda Sadeh <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: refactor osdc requests creation functionsYehuda Sadeh2-57/+155
The osd requests creation are being decoupled from the vino parameter, allowing clients using the osd to use other arbitrary object names that are not necessarily vino based. Also, calc_raw_layout now takes a snap id. Signed-off-by: Yehuda Sadeh <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20ceph: lookup pool in osdmap by nameYehuda Sadeh2-0/+15
Implement a pool lookup by name. This will be used by rbd. Signed-off-by: Yehuda Sadeh <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-10-20x86: Spread tlb flush vector between nodesShaohua Li1-1/+47
Currently flush tlb vector allocation is based on below equation: sender = smp_processor_id() % 8 This isn't optimal, CPUs from different node can have the same vector, this causes a lot of lock contention. Instead, we can assign the same vectors to CPUs from the same node, while different node has different vectors. This has below advantages: a. if there is lock contention, the lock contention is between CPUs from one node. This should be much cheaper than the contention between nodes. b. completely avoid lock contention between nodes. This especially benefits kswapd, which is the biggest user of tlb flush, since kswapd sets its affinity to specific node. In my test, this could reduce > 20% CPU overhead in extreme case.The test machine has 4 nodes and each node has 16 CPUs. I then bind each node's kswapd to the first CPU of the node. I run a workload with 4 sequential mmap file read thread. The files are empty sparse file. This workload will trigger a lot of page reclaim and tlbflush. The kswapd bind is to easy trigger the extreme tlb flush lock contention because otherwise kswapd keeps migrating between CPUs of a node and I can't get stable result. Sure in real workload, we can't always see so big tlb flush lock contention, but it's possible. [ hpa: folded in fix from Eric Dumazet to use this_cpu_read() ] Signed-off-by: Shaohua Li <[email protected]> LKML-Reference: <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-10-20percpu: Introduce a read-mostly percpu APIShaohua Li2-0/+13
Add a new readmostly percpu section and API. This can be used to avoid dirtying data lines which are generally not written to, which is especially important for data which may be accessed by processors other than the one for which the percpu area belongs to. [ hpa: moved it *after* the page-aligned section, for obvious reasons. ] Signed-off-by: Shaohua Li <[email protected]> LKML-Reference: <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-10-20Linux 2.6.36Linus Torvalds1-1/+1
2010-10-20Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linusLinus Torvalds7-9/+14
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: MIPS: O32 compat/N32: Fix to use compat syscall wrappers for AIO syscalls. MAINTAINERS: Change list for ioc_serial to linux-serial. SERIAL: ioc3_serial: Return -ENOMEM on memory allocation failure MIPS: jz4740: Fix Kbuild Platform file. MIPS: Repair Kbuild make clean breakage.
2010-10-20virtio: console: Don't block entire guest if host doesn't read dataAmit Shah1-3/+14
If the host is slow in reading data or doesn't read data at all, blocking write calls not only blocked the program that called write() but the entire guest itself. To overcome this, let's not block till the host signals it has given back the virtio ring element we passed it. Instead, send the buffer to the host and return to userspace. This operation then becomes similar to how non-blocking writes work, so let's use the existing code for this path as well. This code change also ensures blocking write calls do get blocked if there's not enough room in the virtio ring as well as they don't return -EAGAIN to userspace. Signed-off-by: Amit Shah <[email protected]> Acked-by: Hans de Goede <[email protected]> CC: [email protected] Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>