aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2013-05-01take cgroup_open() and cpuset_open() to fs/proc/base.cAl Viro5-31/+35
Signed-off-by: Al Viro <[email protected]>
2013-05-01ppc: Clean up scanlogDavid Howells1-18/+11
Clean up the pseries scanlog driver's use of procfs: (1) Don't need to save the proc_dir_entry pointer as we have the filename to remove with. (2) Save the scan log buffer pointer in a static variable (there is only one of it) and don't save it in the PDE (which doesn't have a destructor). Signed-off-by: David Howells <[email protected]> cc: Benjamin Herrenschmidt <[email protected]> cc: Paul Mackerras <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01ppc: Clean up rtas_flash driver somewhatDavid Howells1-248/+204
Clean up some of the problems with the rtas_flash driver: (1) It shouldn't fiddle with the internals of the procfs filesystem (altering pde->count). (2) If pid namespaces are in effect, then you can get multiple inodes connected to a single pde, thereby rendering the pde->count > 2 test useless. (3) The pde->count fudging doesn't work for forked, dup'd or cloned file descriptors, so add static mutexes and use them to wrap access to the driver through read, write and release methods. (4) The driver can only handle one device, so allocate most of the data previously attached to the pde->data as static variables instead (though allocate the validation data buffer with kmalloc). (5) We don't need to save the pde pointers as long as we have the filenames available for removal. (6) Don't try to multiplex what the update file read method does based on the filename. Instead provide separate file ops and split the function. Whilst we're at it, tabulate the procfile information and loop through it when creating or destroying them rather than manually coding each one. [Folded fixes from Vasant Hegde <[email protected]>] Signed-off-by: David Howells <[email protected]> cc: Benjamin Herrenschmidt <[email protected]> cc: Paul Mackerras <[email protected]> cc: Anton Blanchard <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01hostap: proc: Use remove_proc_subtree()David Howells1-19/+1
Use remove_proc_subtree() rather than remove_proc_entry() to remove a device-specific proc directory and all its children. Signed-off-by: David Howells <[email protected]> cc: Jouni Malinen <[email protected]> cc: Johannes Berg <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01drm: proc: Use remove_proc_subtree()David Howells1-15/+7
Use remove_proc_subtree() rather than remove_proc_entry() to remove a minor-specific drm proc directory and all its children. Things could theoretically be improved by storing the drm_minor pointer in the minor-specific dir proc_dir_entry struct data and then scrapping the list of proc files - but that's shared with the debugfs interface where you can't do that, so I don't see an easy way of doing it. Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01drm: proc: Use minor->index to label things, not PDE->nameDavid Howells3-11/+7
Use minor->index to label things, not the name field from the proc_dir_entry of the /proc/dwm/<minor>/ directory. Also, use "%u" not "%d" to render the value and use a 12-byte buffer in which to render the integer, not a 16-byte buffer. The longest string an unsigned int can give you is 10 chars (4294967295) plus a NUL, so round up to 12 as the stack is likely to be 4- or 8-byte aligned. Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01drm: Constify drm_proc_list[]David Howells2-4/+4
Constify drm_proc_list[] and related pointers. Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01zoran: Don't print proc_dir_entry data in debugDavid Howells1-1/+1
Don't print proc_dir_entry data in debug as we're soon to have no direct access to the contents of the PDE. Print what was put in there instead. Signed-off-by: David Howells <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()David Howells1-11/+18
Don't access the proc_dir_entry in ReiserFS's r_open(), r_start() r_show() procfs interface functions. ReiserFS stores the ->show() method pointer in PDE->data and the super_block pointer in PDE->parent->data. This isn't changing. Currently, ReiserFS passes the PDE pointer into seq_file::private from r_open() so that r_start() and r_show() can then access it. Instead, use seq_open_private() to allocate a two-pointer struct that's passed through seq_file::private and put the ->show() method and the sb pointers in there. Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Supply an accessor for getting the data from a PDE's parentDavid Howells5-3/+11
Supply an accessor function for getting the private data from the parent proc_dir_entry struct of the proc_dir_entry struct associated with an inode. ReiserFS, for instance, stores the super_block pointer in the proc directory it makes for that super_block, and a pointer to the respective seq_file show function in each of the proc files in that directory. This allows a reduction in the number of file_operations structs, open functions and seq_operations structs required. The problem otherwise is that each show function requires two pieces of data but only has storage for one per PDE (and this has no release function). Signed-off-by: David Howells <[email protected]> Acked-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> cc: Jerry Chuang <[email protected]> cc: Maxim Mikityanskiy <[email protected]> cc: YAMANE Toshiaki <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01airo: Use remove_proc_subtree()David Howells1-36/+13
Use remove_proc_subtree() to remove the airo device subdir and all its children instead of doing it manually. Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01rtl8192u: Don't need to save device proc dir PDEDavid Howells2-13/+6
Don't need to save the PDE of a directory created under /proc/net/rtl8192/ as we can use proc subtree deletion to get rid of it and all its children. Signed-off-by: David Howells <[email protected]> Acked-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> cc: Jerry Chuang <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01rtl8187se: Use a dir under /proc/net/r8180/David Howells2-19/+8
Create a dir under /proc/net/r8180/ named for the device and create that device's files under there. This means that there won't be a problem for multiple devices in the system (if such is possible) and it means we don't need to save the 'device directory' PDE any more as we can just do a proc subtree removal. Signed-off-by: David Howells <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> cc: Maxim Mikityanskiy <[email protected]> cc: YAMANE Toshiaki <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Add proc_mkdir_data()David Howells6-33/+28
Add proc_mkdir_data() to allow procfs directories to be created that are annotated at the time of creation with private data rather than doing this post-creation. This means no access is then required to the proc_dir_entry struct to set this. Signed-off-by: David Howells <[email protected]> Acked-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> cc: Neela Syam Kolli <[email protected]> cc: Jerry Chuang <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}David Howells5-37/+38
Move some bits from linux/proc_fs.h to linux/of.h, signal.h and tty.h. Also move proc_tty_init() and proc_device_tree_init() to fs/proc/internal.h as they're internal to procfs. Signed-off-by: David Howells <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Acked-by: Grant Likely <[email protected]> cc: [email protected] cc: [email protected] cc: Greg Kroah-Hartman <[email protected]> cc: Jri Slaby <[email protected]> Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Move PDE_NET() to fs/proc/proc_net.cDavid Howells2-5/+4
Move PDE_NET() to fs/proc/proc_net.c as that's where the only user is. Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Split the namespace stuff out into linux/proc_ns.hDavid Howells15-92/+109
Split the proc namespace stuff out into linux/proc_ns.h. Signed-off-by: David Howells <[email protected]> cc: [email protected] cc: Serge E. Hallyn <[email protected]> cc: Eric W. Biederman <[email protected]> Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Move proc_fd() to fs/proc/fd.hDavid Howells2-5/+5
Move proc_fd() to fs/proc/fd.h. Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Uninline pid_delete_dentry()David Howells2-9/+14
Uninline pid_delete_dentry() as it's only used by three function pointers. Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2013-05-01proc: Supply PDE attribute setting accessor functionsDavid Howells14-35/+40
Supply accessor functions to set attributes in proc_dir_entry structs. The following are supplied: proc_set_size() and proc_set_user(). Signed-off-by: David Howells <[email protected]> Acked-by: Mauro Carvalho Chehab <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] cc: [email protected] cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2013-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1506-37021/+86321
Pull networking updates from David Miller: "Highlights (1721 non-merge commits, this has to be a record of some sort): 1) Add 'random' mode to team driver, from Jiri Pirko and Eric Dumazet. 2) Make it so that any driver that supports configuration of multiple MAC addresses can provide the forwarding database add and del calls by providing a default implementation and hooking that up if the driver doesn't have an explicit set of handlers. From Vlad Yasevich. 3) Support GSO segmentation over tunnels and other encapsulating devices such as VXLAN, from Pravin B Shelar. 4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton. 5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita Dukkipati. 6) In the PHY layer, allow supporting wake-on-lan in situations where the PHY registers have to be written for it to be configured. Use it to support wake-on-lan in mv643xx_eth. From Michael Stapelberg. 7) Significantly improve firewire IPV6 support, from YOSHIFUJI Hideaki. 8) Allow multiple packets to be sent in a single transmission using network coding in batman-adv, from Martin Hundebøll. 9) Add support for T5 cxgb4 chips, from Santosh Rastapur. 10) Generalize the VXLAN forwarding tables so that there is more flexibility in configurating various aspects of the endpoints. From David Stevens. 11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver, from Dmitry Kravkov. 12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo Neira Ayuso. 13) Start adding networking selftests. 14) In situations of overload on the same AF_PACKET fanout socket, or per-cpu packet receive queue, minimize drop by distributing the load to other cpus/fanouts. From Willem de Bruijn and Eric Dumazet. 15) Add support for new payload offset BPF instruction, from Daniel Borkmann. 16) Convert several drivers over to mdoule_platform_driver(), from Sachin Kamat. 17) Provide a minimal BPF JIT image disassembler userspace tool, from Daniel Borkmann. 18) Rewrite F-RTO implementation in TCP to match the final specification of it in RFC4138 and RFC5682. From Yuchung Cheng. 19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear you like netlink, so I implemented netlink dumping of netlink sockets.") From Andrey Vagin. 20) Remove ugly passing of rtnetlink attributes into rtnl_doit functions, from Thomas Graf. 21) Allow userspace to be able to see if a configuration change occurs in the middle of an address or device list dump, from Nicolas Dichtel. 22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes Frederic Sowa. 23) Increase accuracy of packet length used by packet scheduler, from Jason Wang. 24) Beginning set of changes to make ipv4/ipv6 fragment handling more scalable and less susceptible to overload and locking contention, from Jesper Dangaard Brouer. 25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*() instead. From Hong Zhiguo. 26) Optimize route usage in IPVS by avoiding reference counting where possible, from Julian Anastasov. 27) Convert IPVS schedulers to RCU, also from Julian Anastasov. 28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger Eitzenberger. 29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG, nfnetlink_log, and nfnetlink_queue. From Gao feng. 30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa. 31) Support several new r8169 chips, from Hayes Wang. 32) Support tokenized interface identifiers in ipv6, from Daniel Borkmann. 33) Use usbnet_link_change() helper in USB net driver, from Ming Lei. 34) Add 802.1ad vlan offload support, from Patrick McHardy. 35) Support mmap() based netlink communication, also from Patrick McHardy. 36) Support HW timestamping in mlx4 driver, from Amir Vadai. 37) Rationalize AF_PACKET packet timestamping when transmitting, from Willem de Bruijn and Daniel Borkmann. 38) Bring parity to what's provided by /proc/net/packet socket dumping and the info provided by netlink socket dumping of AF_PACKET sockets. From Nicolas Dichtel. 39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin Poirier" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits) filter: fix va_list build error af_unix: fix a fatal race with bit fields bnx2x: Prevent memory leak when cnic is absent bnx2x: correct reading of speed capabilities net: sctp: attribute printl with __printf for gcc fmt checks netlink: kconfig: move mmap i/o into netlink kconfig netpoll: convert mutex into a semaphore netlink: Fix skb ref counting. net_sched: act_ipt forward compat with xtables mlx4_en: fix a build error on 32bit arches Revert "bnx2x: allow nvram test to run when device is down" bridge: avoid OOPS if root port not found drivers: net: cpsw: fix kernel warn on cpsw irq enable sh_eth: use random MAC address if no valid one supplied 3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA) tg3: fix to append hardware time stamping flags unix/stream: fix peeking with an offset larger than data in queue unix/dgram: fix peeking with an offset larger than data in queue unix/dgram: peek beyond 0-sized skbs openvswitch: Remove unneeded ovs_netdev_get_ifindex() ...
2013-05-01filter: fix va_list build errorXi Wang1-0/+1
This patch fixes the following build error. In file included from include/linux/filter.h:52:0, from arch/arm/net/bpf_jit_32.c:14: include/linux/printk.h:54:2: error: unknown type name ‘va_list’ include/linux/printk.h:105:21: error: unknown type name ‘va_list’ include/linux/printk.h:108:30: error: unknown type name ‘va_list’ Signed-off-by: Xi Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds49-444/+2917
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "Assorted fixes and cleanups to the existing drivers plus a new driver for IMS Passenger Control Unit device they use for ther in-flight entertainment system." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (44 commits) Input: trackpoint - Optimize trackpoint init to use power-on reset Input: apbps2 - convert to devm_ioremap_resource() Input: ALPS - use %ph to print buffers ARM - shmobile: Armadillo800EVA: Move st1232 reset pin handling Input: st1232 - add reset pin handling Input: st1232 - convert to devm_* infrastructure Input: MT - handle semi-mt devices in core Input: adxl34x - use spi_get_drvdata() Input: ad7877 - use spi_get_drvdata() and spi_set_drvdata() Input: ads7846 - use spi_get_drvdata() and spi_set_drvdata() Input: ims-pcu - fix a memory leak on error Input: sysrq - supplement reset sequence with timeout functionality Input: tegra-kbc - support for defining row/columns based on SoC Input: imx_keypad - switch to using managed resources Input: arc_ps2 - add support for device tree Input: mma8450 - fix signed 12bits to 32bits conversion Input: eeti_ts - remove redundant null check Input: edt-ft5x06 - remove redundant null check before kfree Input: ad714x - add CONFIG_PM_SLEEP to suspend/resume functions Input: adxl34x - add CONFIG_PM_SLEEP to suspend/resume functions ...
2013-05-01xfs: fix da node magic number mismatchesDave Chinner2-3/+2
Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Ben Myers <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-05-01xfs: Remote attr validation fixes and optimisationsDave Chinner1-14/+5
- optimise the calcuation for the number of blocks in a remote xattr. - check attribute length against MAX_XATTR_SIZE, not MAXPATHLEN - whitespace fixes Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Ben Myers <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-05-01af_unix: fix a fatal race with bit fieldsEric Dumazet2-8/+9
Using bit fields is dangerous on ppc64/sparc64, as the compiler [1] uses 64bit instructions to manipulate them. If the 64bit word includes any atomic_t or spinlock_t, we can lose critical concurrent changes. This is happening in af_unix, where unix_sk(sk)->gc_candidate/ gc_maybe_cycle/lock share the same 64bit word. This leads to fatal deadlock, as one/several cpus spin forever on a spinlock that will never be available again. A safer way would be to use a long to store flags. This way we are sure compiler/arch wont do bad things. As we own unix_gc_lock spinlock when clearing or setting bits, we can use the non atomic __set_bit()/__clear_bit(). recursion_level can share the same 64bit location with the spinlock, as it is set only with this spinlock held. [1] bug fixed in gcc-4.8.0 : http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080 Reported-by: Ambrose Feinstein <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01Merge branch 'bnx2x'David S. Miller1-2/+6
Yuval Mintz says: ==================== This fixes 2 small bugs - one which may cause an unnecessary link flap, and the other is a small memory leak when unloading while cnic is not loaded. ==================== Signed-off-by: David S. Miller <[email protected]>
2013-05-01bnx2x: Prevent memory leak when cnic is absentYuval Mintz1-0/+2
bnx2x driver allocates searcher T2 tables, but it releases that memory during unload only released if the cnic is loaded. Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Eilon Greenstein <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01bnx2x: correct reading of speed capabilitiesYaniv Rosner1-2/+4
When the bnx2x driver reads the port configuration - mask irrelevant bits. Without this change, the unintended bits may cause the driver to needlessly toggle the link, as a comparison in the link flap avoidance flow will show that the old link did not advertise the same capabilities and thus cannot be retained. Signed-off-by: Yaniv Rosner <[email protected]> Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Eilon Greenstein <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01net: sctp: attribute printl with __printf for gcc fmt checksDaniel Borkmann1-1/+1
Let GCC check for format string errors in sctp's probe printl function. This patch fixes the warning when compiled with W=1: net/sctp/probe.c:73:2: warning: function might be possible candidate for 'gnu_printf' format attribute [-Wmissing-format-attribute] Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01netlink: kconfig: move mmap i/o into netlink kconfigDaniel Borkmann2-9/+9
Currently, in menuconfig, Netlink's new mmaped IO is the very first entry under the ``Networking support'' item and comes even before ``Networking options'': [ ] Netlink: mmaped IO Networking options ---> ... Lets move this into ``Networking options'' under netlink's Kconfig, since this might be more appropriate. Introduced by commit ccdfcc398 (``netlink: mmaped netlink: ring setup''). Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01netpoll: convert mutex into a semaphoreNeil Horman2-8/+8
Bart Van Assche recently reported a warning to me: <IRQ> [<ffffffff8103d79f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8103d7fa>] warn_slowpath_null+0x1a/0x20 [<ffffffff814761dd>] mutex_trylock+0x16d/0x180 [<ffffffff813968c9>] netpoll_poll_dev+0x49/0xc30 [<ffffffff8136a2d2>] ? __alloc_skb+0x82/0x2a0 [<ffffffff81397715>] netpoll_send_skb_on_dev+0x265/0x410 [<ffffffff81397c5a>] netpoll_send_udp+0x28a/0x3a0 [<ffffffffa0541843>] ? write_msg+0x53/0x110 [netconsole] [<ffffffffa05418bf>] write_msg+0xcf/0x110 [netconsole] [<ffffffff8103eba1>] call_console_drivers.constprop.17+0xa1/0x1c0 [<ffffffff8103fb76>] console_unlock+0x2d6/0x450 [<ffffffff8104011e>] vprintk_emit+0x1ee/0x510 [<ffffffff8146f9f6>] printk+0x4d/0x4f [<ffffffffa0004f1d>] scsi_print_command+0x7d/0xe0 [scsi_mod] This resulted from my commit ca99ca14c which introduced a mutex_trylock operation in a path that could execute in interrupt context. When mutex debugging is enabled, the above warns the user when we are in fact exectuting in interrupt context interrupt context. After some discussion, It seems that a semaphore is the proper mechanism to use here. While mutexes are defined to be unusable in interrupt context, no such condition exists for semaphores (save for the fact that the non blocking api calls, like up and down_trylock must be used when in irq context). Signed-off-by: Neil Horman <[email protected]> Reported-by: Bart Van Assche <[email protected]> CC: Bart Van Assche <[email protected]> CC: David Miller <[email protected]> CC: [email protected] Signed-off-by: David S. Miller <[email protected]>
2013-05-01netlink: Fix skb ref counting.Pravin B Shelar1-1/+0
Commit f9c2288837ba072b21dba955f04a4c97eaa77b1e (netlink: implement memory mapped recvmsg) increamented skb->users ref count twice for a dump op which does not look right. Following patch fixes that. CC: Patrick McHardy <[email protected]> Signed-off-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds11-49/+93
Pull tile arch changes from Chris Metcalf: "These are some minor new feature work and other changes that didn't merit getting pushed up after the 3.9 merge window closed. There should be a lot more activity in the 3.11 merge window" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: Fix syscall return value passed to tracepoint tile: comment assumption about __insn_mtspr for <asm/irqflags.h> tile: ns2cycles should use __raw_get_cpu_var arch: remove KCORE_ELF again [tile] tile: remove two outdated Kconfig entries tile: support atomic64_dec_if_positive() tile: support TIF_SYSCALL_TRACEPOINT; select HAVE_SYSCALL_TRACEPOINTS tile: Add definition of NR_syscalls tile: move declaration of sys_call_table to <asm/syscall.h> arch/tile: Enable HAVE_ARCH_TRACEHOOK arch/tile: Call tracehook_report_syscall_{entry,exit} in syscall trace
2013-05-01init: Do not warn on non-zero initcall returnSteven Rostedt1-4/+1
Commit f91eb62f71b3 ("init: scream bloody murder if interrupts are enabled too early") added three new warnings. The first two seemed reasonable, but the third included a warning when an initcall returned non-zero. Although, the third WARN() does include an imbalanced preempt disabled, or irqs disable, it shouldn't warn if it only had an initcall that just returns non-zero. In fact, according to Linus, it shouldn't print at all. As it only prints with initcall_debug set, and that already shows enough information to fix things. Link: http://lkml.kernel.org/r/CA+55aFzaBC5SFi7=F2mfm+KWY5qTsBmOqgbbs8E+LUS8JK-sBg@mail.gmail.com Suggested-by: Linus Torvalds <[email protected]> Reported-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01net_sched: act_ipt forward compat with xtablesJamal Hadi Salim1-3/+30
Deal with changes in newer xtables while maintaining backward compatibility. Thanks to Jan Engelhardt for suggestions. Signed-off-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-05-01Merge branch 'topic/omap3isp' of ↵Linus Torvalds3-84/+225
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull omap3isp clk support from Mauro Carvalho Chehab: "This patch were sent in separate as it depends on a merge from clock framework, that you merged in commit 362ed48dee50" * 'topic/omap3isp' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] omap3isp: Use the common clock framework
2013-05-01jfs: fix a couple racesDave Kleikamp2-2/+3
This patch fixes races uncovered by xfstests testcase 068. One race is the result of jfs_sync() trying to write a sync point to the journal after it has been frozen (or possibly in the process). Since freezing sync's the journal, there is no need to write a sync point so we simply want to return. The second involves jfs_write_inode() being called on a deleted inode. It calls jfs_flush_journal which is held up by the jfs_commit thread doing the final iput on the same deleted inode, which itself is waiting for the I_SYNC flag to be cleared. jfs_write_inode need not do anything when i_nlink is zero, which is the easy fix. Reported-by: Michael L. Semon <[email protected]> Signed-off-by: Dave Kleikamp <[email protected]>
2013-05-01Merge branch 'next' into for-linusDmitry Torokhov17369-589942/+1081675
Prepare first set of updates for 3.10 merge window.
2013-05-01Merge branch 'ipc-scalability'Linus Torvalds7-341/+540
Merge IPC cleanup and scalability patches from Andrew Morton. This cleans up many of the oddities in the IPC code, uses the list iterator helpers, splits out locking and adds per-semaphore locks for greater scalability of the IPC semaphore code. Most normal user-level locking by now uses futexes (ie pthreads, but also a lot of specialized locks), but SysV IPC semaphores are apparently still used in some big applications, either for portability reasons, or because they offer tracking and undo (and you don't need to have a special shared memory area for them). Our IPC semaphore scalability was pitiful. We used to lock much too big ranges, and we used to have a single ipc lock per ipc semaphore array. Most loads never cared, but some do. There are some numbers in the individual commits. * ipc-scalability: ipc: sysv shared memory limited to 8TiB ipc/msg.c: use list_for_each_entry_[safe] for list traversing ipc,sem: fine grained locking for semtimedop ipc,sem: have only one list in struct sem_queue ipc,sem: open code and rename sem_lock ipc,sem: do not hold ipc lock more than necessary ipc: introduce lockless pre_down ipcctl ipc: introduce obtaining a lockless ipc object ipc: remove bogus lock comment for ipc_checkid ipc/msgutil.c: use linux/uaccess.h ipc: refactor msg list search into separate function ipc: simplify msg list search ipc: implement MSG_COPY as a new receive mode ipc: remove msg handling from queue scan ipc: set EFAULT as default error in load_msg() ipc: tighten msg copy loops ipc: separate msg allocation from userspace copy ipc: clamp with min()
2013-05-01ipc: sysv shared memory limited to 8TiBRobin Holt2-2/+2
Trying to run an application which was trying to put data into half of memory using shmget(), we found that having a shmall value below 8EiB-8TiB would prevent us from using anything more than 8TiB. By setting kernel.shmall greater than 8EiB-8TiB would make the job work. In the newseg() function, ns->shm_tot which, at 8TiB is INT_MAX. ipc/shm.c: 458 static int newseg(struct ipc_namespace *ns, struct ipc_params *params) 459 { ... 465 int numpages = (size + PAGE_SIZE -1) >> PAGE_SHIFT; ... 474 if (ns->shm_tot + numpages > ns->shm_ctlall) 475 return -ENOSPC; [[email protected]: make ipc/shm.c:newseg()'s numpages size_t, not int] Signed-off-by: Robin Holt <[email protected]> Reported-by: Alex Thorlton <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc/msg.c: use list_for_each_entry_[safe] for list traversingNikola Pajkovsky1-27/+8
The ipc/msg.c code does its list operations by hand and it open-codes the accesses, instead of using for_each_entry_[safe]. Signed-off-by: Nikola Pajkovsky <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Peter Hurley <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc,sem: fine grained locking for semtimedopRik van Riel4-125/+203
Introduce finer grained locking for semtimedop, to handle the common case of a program wanting to manipulate one semaphore from an array with multiple semaphores. If the call is a semop manipulating just one semaphore in an array with multiple semaphores, only take the lock for that semaphore itself. If the call needs to manipulate multiple semaphores, or another caller is in a transaction that manipulates multiple semaphores, the sem_array lock is taken, as well as all the locks for the individual semaphores. On a 24 CPU system, performance numbers with the semop-multi test with N threads and N semaphores, look like this: vanilla Davidlohr's Davidlohr's + Davidlohr's + threads patches rwlock patches v3 patches 10 610652 726325 1783589 2142206 20 341570 365699 1520453 1977878 30 288102 307037 1498167 2037995 40 290714 305955 1612665 2256484 50 288620 312890 1733453 2650292 60 289987 306043 1649360 2388008 70 291298 306347 1723167 2717486 80 290948 305662 1729545 2763582 90 290996 306680 1736021 2757524 100 292243 306700 1773700 3059159 [[email protected]: do not call sem_lock when bogus sma] [[email protected]: make refcounter atomic] Signed-off-by: Rik van Riel <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Chegu Vinod <[email protected]> Cc: Jason Low <[email protected]> Reviewed-by: Michel Lespinasse <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Emmanuel Benisty <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc,sem: have only one list in struct sem_queueRik van Riel1-31/+34
Having only one list in struct sem_queue, and only queueing simple semaphore operations on the list for the semaphore involved, allows us to introduce finer grained locking for semtimedop. Signed-off-by: Rik van Riel <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Chegu Vinod <[email protected]> Cc: Emmanuel Benisty <[email protected]> Cc: Jason Low <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc,sem: open code and rename sem_lockRik van Riel1-6/+23
Rename sem_lock() to sem_obtain_lock(), so we can introduce a sem_lock() later that only locks the sem_array and does nothing else. Open code the locking from ipc_lock() in sem_obtain_lock() so we can introduce finer grained locking for the sem_array in the next patch. [[email protected]: propagate the ipc_obtain_object() errno out of sem_obtain_lock()] Signed-off-by: Rik van Riel <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Chegu Vinod <[email protected]> Cc: Emmanuel Benisty <[email protected]> Cc: Jason Low <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc,sem: do not hold ipc lock more than necessaryDavidlohr Bueso2-48/+118
Instead of holding the ipc lock for permissions and security checks, among others, only acquire it when necessary. Some numbers.... 1) With Rik's semop-multi.c microbenchmark we can see the following results: Baseline (3.9-rc1): cpus 4, threads: 256, semaphores: 128, test duration: 30 secs total operations: 151452270, ops/sec 5048409 + 59.40% a.out [kernel.kallsyms] [k] _raw_spin_lock + 6.14% a.out [kernel.kallsyms] [k] sys_semtimedop + 3.84% a.out [kernel.kallsyms] [k] avc_has_perm_flags + 3.64% a.out [kernel.kallsyms] [k] __audit_syscall_exit + 2.06% a.out [kernel.kallsyms] [k] copy_user_enhanced_fast_string + 1.86% a.out [kernel.kallsyms] [k] ipc_lock With this patchset: cpus 4, threads: 256, semaphores: 128, test duration: 30 secs total operations: 273156400, ops/sec 9105213 + 18.54% a.out [kernel.kallsyms] [k] _raw_spin_lock + 11.72% a.out [kernel.kallsyms] [k] sys_semtimedop + 7.70% a.out [kernel.kallsyms] [k] ipc_has_perm.isra.21 + 6.58% a.out [kernel.kallsyms] [k] avc_has_perm_flags + 6.54% a.out [kernel.kallsyms] [k] __audit_syscall_exit + 4.71% a.out [kernel.kallsyms] [k] ipc_obtain_object_check 2) While on an Oracle swingbench DSS (data mining) workload the improvements are not as exciting as with Rik's benchmark, we can see some positive numbers. For an 8 socket machine the following are the percentages of %sys time incurred in the ipc lock: Baseline (3.9-rc1): 100 swingbench users: 8,74% 400 swingbench users: 21,86% 800 swingbench users: 84,35% With this patchset: 100 swingbench users: 8,11% 400 swingbench users: 19,93% 800 swingbench users: 77,69% [[email protected]: fix two locking bugs] [[email protected]: prevent releasing RCU read lock twice in semctl_main] [[email protected]: coding-style fixes] Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Reviewed-by: Chegu Vinod <[email protected]> Acked-by: Michel Lespinasse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Jason Low <[email protected]> Cc: Emmanuel Benisty <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc: introduce lockless pre_down ipcctlDavidlohr Bueso2-5/+29
Various forms of ipc use ipcctl_pre_down() to retrieve an ipc object and check permissions, mostly for IPC_RMID and IPC_SET commands. Introduce ipcctl_pre_down_nolock(), a lockless version of this function. The locking version is retained, yet modified to call the nolock version without affecting its semantics, thus transparent to all ipc callers. Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Cc: Chegu Vinod <[email protected]> Cc: Emmanuel Benisty <[email protected]> Cc: Jason Low <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc: introduce obtaining a lockless ipc objectDavidlohr Bueso2-14/+59
Through ipc_lock() and therefore ipc_lock_check() we currently return the locked ipc object. This is not necessary for all situations and can, therefore, cause unnecessary ipc lock contention. Introduce analogous ipc_obtain_object() and ipc_obtain_object_check() functions that only lookup and return the ipc object. Both these functions must be called within the RCU read critical section. [[email protected]: propagate the ipc_obtain_object() errno from ipc_lock()] Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Reviewed-by: Chegu Vinod <[email protected]> Acked-by: Michel Lespinasse <[email protected]> Cc: Emmanuel Benisty <[email protected]> Cc: Jason Low <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc: remove bogus lock comment for ipc_checkidDavidlohr Bueso1-6/+1
This series makes the sysv semaphore code more scalable, by reducing the time the semaphore lock is held, and making the locking more scalable for semaphore arrays with multiple semaphores. The first four patches were written by Davidlohr Buesso, and reduce the hold time of the semaphore lock. The last three patches change the sysv semaphore code locking to be more fine grained, providing a performance boost when multiple semaphores in a semaphore array are being manipulated simultaneously. On a 24 CPU system, performance numbers with the semop-multi test with N threads and N semaphores, look like this: vanilla Davidlohr's Davidlohr's + Davidlohr's + threads patches rwlock patches v3 patches 10 610652 726325 1783589 2142206 20 341570 365699 1520453 1977878 30 288102 307037 1498167 2037995 40 290714 305955 1612665 2256484 50 288620 312890 1733453 2650292 60 289987 306043 1649360 2388008 70 291298 306347 1723167 2717486 80 290948 305662 1729545 2763582 90 290996 306680 1736021 2757524 100 292243 306700 1773700 3059159 This patch: There is no reason to be holding the ipc lock while reading ipcp->seq, hence remove misleading comment. Also simplify the return value for the function. Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Cc: Chegu Vinod <[email protected]> Cc: Emmanuel Benisty <[email protected]> Cc: Jason Low <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Stanislav Kinsbursky <[email protected]> Tested-by: Sedat Dilek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-05-01ipc/msgutil.c: use linux/uaccess.hHoSung Jung1-2/+2
Signed-off-by: HoSung Jung <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>