blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2009-08-09	SUNRPC: Eliminate PROC macro from rpcb_clnt	Chuck Lever	1	-22/+91
	Clean up: Replace PROC macro with open coded C99 structure initializers to improve readability. The rpcbind v4 GETVERSADDR procedure is never sent by the current implementation, so it is not copied to the new structures. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Clean up: Remove unused XDR decoder functions from rpcb_clnt.c	Chuck Lever	1	-81/+0
	Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Introduce new xdr_stream-based decoders to rpcb_clnt.c	Chuck Lever	1	-3/+77
	Replace the open-coded decode logic for PMAP_GETPORT/RPCB_GETADDR with an xdr_stream-based implementation, similar to what NFSv4 uses, to protect against buffer overflows. The new implementation also checks that the incoming port number is reasonable. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Introduce xdr_stream-based decoders for RPCB_UNSET	Chuck Lever	1	-0/+22
	Replace the open-coded decode logic for rpcbind UNSET results with an xdr_stream-based implementation, similar to what NFSv4 uses, to protect against buffer overflows. The new function is unused for the moment. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Clean up: Remove unused XDR encoder functions from rpcb_clnt.c	Chuck Lever	1	-34/+0
	Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Introduce new xdr_stream-based encoders to rpcb_clnt.c	Chuck Lever	1	-1/+75
	Replace the open-coded encode logic for rpcbind arguments with an xdr_stream-based implementation, similar to what NFSv4 uses, to better protect against buffer overflows. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFSD: Support IPv6 addresses in write_failover_ip()	Chuck Lever	1	-14/+7
	In write_failover_ip(), replace the sscanf() with a call to the common sunrpc.ko presentation address parser. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	lockd: Replace nsm_display_address() with rpc_ntop()	Chuck Lever	1	-39/+5
	Clean up. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	lockd: Replace nlm_clear_port()	Chuck Lever	1	-13/+1
	Clean up: Use shared rpc_set_port() function instead of nlm_clear_port(). Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFS: Replace nfs_set_port() with rpc_set_port()	Chuck Lever	3	-23/+4
	Clean up. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFS: Replace nfs_parse_ip_address() with rpc_pton()	Chuck Lever	3	-133/+22
	Clean up: Use the common routine now provided in sunrpc.ko for parsing mount addresses. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Use rpc_ntop() for constructing transport address strings	Chuck Lever	2	-102/+52
	Clean up: In addition to using the new generic rpc_ntop() and rpc_get_port() functions, have the RPC client compute the presentation address buffer sizes dynamically using kstrdup(). Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Remove duplicate universal address generation	Chuck Lever	4	-46/+29
	RPC universal address generation is currently done in several places: rpcb_clnt.c, nfs4proc.c xprtsock.c, and xprtrdma.c. Remove the redundant cases that convert a socket address to a universal address. The nfs4proc.c case takes a pre-formatted presentation address string, not a socket address, so we'll leave that one. Because the new uaddr constructor uses the recently introduced rpc_ntop(), it now supports proper "::" shorthanding for IPv6 addresses. This allows the kernel to register properly formed universal addresses with the local rpcbind service, in _all_ cases. The kernel can now also send properly formed universal addresses in RPCB_GETADDR requests, and support link-local properly when encoding and decoding IPv6 addresses. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Provide functions for managing universal addresses	Chuck Lever	4	-3/+403
	Introduce a set of functions in the kernel's RPC implementation for converting between a socket address and either a standard presentation address string or an RPC universal address. The universal address functions will be used to encode and decode RPCB_FOO and NFSv4 SETCLIENTID arguments. The other functions are part of a previous promise to deliver shared functions that can be used by upper-layer protocols to display and manipulate IP addresses. The kernel's current address printf formatters were designed specifically for kernel to user-space APIs that require a particular string format for socket addresses, thus are somewhat limited for the purposes of sunrpc.ko. The formatter for IPv6 addresses, %pI6, does not support short-handing or scope IDs. Also, these printf formatters are unique per address family, so a separate formatter string is required for printing AF_INET and AF_INET6 addresses. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Move XDR data type size macros	Chuck Lever	1	-25/+31
	Clean up: To make subsequent patches cleaner, move the XDR data type size macros to the top of the file (similar to nfs4xdr.c) first. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: Clean up RPCBIND_MAXUADDRLEN definitions	Chuck Lever	1	-1/+16
	Clean up: Replace the single-integer definition of RPCBIND_MAXUADDRLEN with a definition that is based on previously defined address string sizes, and document the way this maximum is calculated. Also provide a separate macro for the size of the port number extension. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFS: Use the authentication flavor list returned by mountd	Chuck Lever	1	-7/+51
	Commit a14017db added support in the kernel's NFS mount client to decode the authentication flavor list returned by mountd. The NFS client can now use this list to determine whether the authentication flavor requested by the user is actually supported by the server. Note we don't actually negotiate the security flavor if none was specified by the user. Instead, we try to use AUTH_SYS, and fail if the server does not support it. This prevents us from negotiating an inappropriate security flavor (some servers list AUTH_NULL first). If the server does not support AUTH_SYS, the user must provide an appropriate security flavor by specifying the "sec=" mount option. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFS: Fix auth flavor len accounting	Chuck Lever	1	-14/+3
	Previous logic in the NFS mount parsing code path assumed auth_flavor_len was set to zero for simple authentication flavors (like AUTH_UNIX), and 1 for compound flavors (like AUTH_GSS). At some earlier point (maybe even before the option parsers were merged?) specific checks for auth_flavor_len being zero were removed from the functions that validate the mount option that sets the mount point's authentication flavor. Since we are populating an array for authentication flavors, the auth_flavor_len should always be set to the number of flavors. Let's eliminate some cleverness here, and prepare for new logic that needs to know the number of flavors in the auth_flavors[] array. (auth_flavors[] is an array because at some point we want to allow a list of acceptable authentication flavors to be specified via the sec= mount option. For now it remains a single element array). Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFS: Add ability to send MOUNTPROC_UMNT to the kernel's mountd client	Chuck Lever	2	-0/+80
	After certain failure modes of an NFS mount, an NFS client should send a MOUNTPROC_UMNT request to remove the just-added mount entry from the server's mount table. While no-one should rely on the accuracy of the server's mount table, sending a UMNT is simply being a good internet neighbor. Since NFS mount processing is handled in the kernel now, we will need a function in the kernel's mountd client that can post a MOUNTRPC_UMNT request, in order to handle these failure modes. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFS: Fix up new minorversion= option	Chuck Lever	1	-7/+11
	The new minorversion= mount option (commit 3fd5be9e) was merged at the same time as the recent sloppy parser fixes (commit a5a16bae), so minorversion= still uses the old value parsing logic. If the minorversion= option specifies a bogus value, it should fail with "bad value" not "bad option." Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFSv4: Clean up the nfs.callback_tcpport option	Trond Myklebust	1	-9/+17
	Tighten up the validity checking in param_set_port: check for NULL pointers. Ensure that the option shows up on 'modinfo' output. Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	SUNRPC: convert some sysctls into module parameters	Trond Myklebust	2	-0/+73
	Parameters like the minimum reserved port, and the number of slot entries should really be module parameters rather than sysctls. Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFSv4: Don't do idmapper upcalls for asynchronous RPC calls	Trond Myklebust	1	-28/+58
	We don't want to cause rpciod to hang... Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFSv4: Add 'server capability' flags for NFSv4 recommended attributes	Trond Myklebust	4	-16/+114
	If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code needs to know that, so that it don't keep trying to poll for it. However, by the same count, if the NFSv4 server does support that attribute, then we should ensure that the inode metadata is appropriately labelled as being untrusted. For instance, if we don't know the correct value of the file's uid, we should certainly not be caching ACLs or ACCESS results. Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	NFSv4: Don't loop forever on state recovery failure...	Trond Myklebust	1	-6/+12
	If the server is broken, then retrying forever won't fix it. We should just give up after a while, and return an error to the user. We set the number of retries to 10 for now... Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	nfs: Keep index within mnt_errtbl[]	Roel Kluin	1	-2/+2
	Ensure that index i remains within array mnt_errtbl[] and mnt3_errtbl[]. Signed-off-by: Roel Kluin <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2009-08-09	perf tools: callchain: Fix bad rounding of minimum rate	Frederic Weisbecker	1	-2/+3
	Sometimes we get callchain branches that have a rate under the limit given by the user. Say you launched: perf record -f -g -a ./hackbench 10 perf report -g fractal,10.0 And you got: 2.33% hackbench [kernel] [k] _spin_lock_irqsave \| \|--78.57%-- remove_wait_queue \| poll_freewait \| do_sys_poll \| sys_poll \| sysenter_dispatch \| 0xf7ffa430 \| 0x1ffadea3c \| \|--7.14%-- __up_read \| up_read \| do_page_fault \| page_fault \| 0xf7ffa430 \| 0xa0df710000000a ... It is abnormal to get a 7.14% branch whereas we passed a 10% filter. The problem is that we round down the minimum threshold. This happens mostly when we have very low number of events. If the total amount of your branch is 4 and you have a subranch of 3 events, filtering to 90% will be computed like follows: limit = 4 * 0.9; The result is about 3.6, but the cast to integer will round down to 3. It means that our filter is actually of 75% We must then explicitly round up the minimum threshold. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <20090809024235.GA10146@nowhere> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter tools: Fix libbfd detection for systems with libz dependency	Mike Galbraith	1	-0/+4
	Due to a libz dependency in some distro's binutils package, C++ demangle support isn't compiled in despite the necessary libraries being available. Fix this by adding a -lz link test to the dependency detection rules. Signed-off-by: Mike Galbraith <[email protected]> Acked-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf: "Longum est iter per praecepta, breve et efficax per exempla"	Carlos R. Mafra	1	-0/+225
	A few examples of how 'perf' can be used, from an e-mail by Ingo Molnar http://lkml.org/lkml/2009/8/4/346. Signed-off-by: Carlos R. Mafra <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter: Fix a race on perf_counter_ctx	Peter Zijlstra	1	-15/+15
	While extending perfcounters with BTS hw-tracing, Markus Metzger managed to trigger this warning: [ 995.557128] WARNING: at kernel/perf_counter.c:1191 __perf_counter_task_sched_out+0x48/0x6b() triggers because commit 9f498cc5be7e013d8d6e4c616980ed0ffc8680d2 (perf_counter: Full task tracing) removed clearing of tsk->perf_counter_ctxp out from under ctx->lock which introduced a race (against perf_lock_task_context). Move it back and deal with the exit notification by explicitly passing along the former task context. Reported-by: Markus T Metzger <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <1249667341.17467.5.camel@twins> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter: Fix tracepoint sampling to be part of generic sampling	Frederic Weisbecker	2	-15/+15
	Based on Peter's comments, make tracepoint sampling generic just like all the other sampling bits are. This is a rename with no code changes: - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW - struct perf_tracepoint_record to perf_raw_record We want the system in place that transport tracepoints raw samples events into the perf ring buffer to be generalized and usable by any type of counter. Reported-by; Peter Zijlstra <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter: Work around gcc warning by initializing tracepoint record ↵	Frederic Weisbecker	1	-3/+4
	unconditionally Despite that the tracepoint record is always present when the PERF_SAMPLE_TP_RECORD flag is set, gcc raises a warning, thinking it might not be initialized: kernel/perf_counter.c: In function ‘perf_counter_output’: kernel/perf_counter.c:2650: warning: ‘tp’ may be used uninitialized in this function Then, initialize it to NULL and always check if it's not NULL before dereference it. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf tools: callchain: Fix sum of percentages to be 100% by displaying ↵	Frederic Weisbecker	1	-3/+44
	amount of ignored chains in fractal mode When we filter the callchains below a given percentage, we ignore them and the end result only shows entries that have an upper percentage than the filter threshold. It seems to users then that we have an imbalance in the percentage, as if the sum inside a profiled branch doesn't reach 100%. Since in the past there have been real perf report bugs that showed the same sypmtom, it would be nice to assure the user that the data is perfect and trustable and it all sums up to 100.00%. So fix this by displaying the remaining hits that have been filtered but without more detail than their amount in each branches. Example while filtering below 50%: 7.73% [k] delay_tsc \| \|--98.22%-- __const_udelay \| \| \| \|--86.37%-- ath5k_hw_register_timeout \| \| ath5k_hw_noise_floor_calibration \| \| ath5k_hw_reset \| \| ath5k_reset \| \| ath5k_config \| \| ieee80211_hw_config \| \| \| \| \| \|--88.53%-- ieee80211_scan_work \| \| \| worker_thread \| \| \| kthread \| \| \| child_rip \| \| --11.47%-- [...] \| --13.63%-- [...] --1.78%-- [...] Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mike Galbraith <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf tools: callchain: Fix 'perf report' display to be callchain by default	Frederic Weisbecker	2	-1/+16
	If we recorded with -g option to record the callchain, right now we require a -g option to perf report as well - and people reported this as unnecessary complication: the user already specified -g once, no need to require it a second time. So if the recording includes call-chains, display the callchain by default from perf report. ( The user can override this default using "-g none" option from perf report. ) Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mike Galbraith <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty ↵	Frederic Weisbecker	1	-0/+2
	callchains When the callchain tree comes to insert an empty backtrace, it raises a spurious warning about the fact we are inserting an empty. This is spurious because the radix tree assumes it did something wrong to reach this state. But it didn't, we just met an empty callchain that has to be ignored. This happens occasionally with certain types of call-chain recordings. If it happens it's a big nuisance as perf report output starts with thousands of warning lines. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mike Galbraith <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf record: Fix the -A UI for empty or non-existent perf.data	Pierre Habouzit	1	-4/+8
	1. Ignore the -A argument if there is no perf.data file 2. Treat an empty file like a non existent file. Else, perf will try to read the perf.data header, and fail with an error. Treating an empty file like a non-existent file makes sense, since an interupted (as in SIGKILLed) perf could leave such files around, and you don't want to annoy the user with errors for files with no data in it. Signed-off-by: Pierre Habouzit <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf util: Fix do_read() to fail on EOF instead of busy-looping	Pierre Habouzit	1	-0/+2
	While toying with perf, I've noticed that perf record can easily enter a busy loop when doing something as silly as: $ perf record -A ls Yeah, do_read here really wants to read a known size, not being able to should die(), not busy-loop ;) That was the cause for the bug. Signed-off-by: Pierre Habouzit <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf list: Fix the output to not include tracepoints without an id	Peter Zijlstra	1	-1/+17
	Stop perf list from displaying tracepoints without an id file, those are special tracepoints that are not interfaced to perfcounters so listing them is erroneous and passing them as events will produce no output. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Jason Baron <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Chris Mason <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support	Paul Mackerras	1	-0/+8
	If we have the powerpc perf_counter backend compiled in, but the cpu we are running on is one where we don't support the PMU, we currently oops in hw_perf_group_sched_in if we try to use any counters, because ppmu is NULL in that case, and we unconditionally dereference ppmu. This fixes the problem by adding a check if ppmu is NULL at the beginning of hw_perf_group_sched_in, and also at the beginning of the other functions that get called from the perf_counter core, i.e. hw_perf_disable, hw_perf_enable, and hw_perf_counter_setup. Signed-off-by: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf stat: Fix tool option consistency: rename -S/--scale to -c/--scale	Brice Goglin	2	-2/+2
	We want to use a coherent flag for -S/--stat across all tools, so free up -S in perf stat. Signed-off-by: Brice Goglin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf report: Add debug help for the finding of symbol bugs - show the symtab ↵	Arnaldo Carvalho de Melo	3	-12/+50
	origin (DSO, build-id, kernel, etc) Used with perf report --verbose: [acme@doppio linux-2.6-tip]$ perf report -v \| head -16 5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame, gfxIImageFrame, nsRect&) 2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal 1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable 1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet 1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock 1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret 1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer 1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet 1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light 1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject 0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc [acme@doppio linux-2.6-tip]$ The new field is just after the symbol address. To help in figuring out symbol resolution bugs. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf report: Fix per task mult-counter stat reporting	Peter Zijlstra	3	-5/+35
	Brice Goglin reported: > I can easily sort them by thread id, but I don't know how to match > my 4 events with each group of 4 lines. Also report the counter id and the time running/enabled stats (in case the counter got time-shared). Reported-by: Brice Goglin <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Tested-by: Brice Goglin <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf tools: Fix multi-counter stat bug caused by incorrect reading of ↵	Peter Zijlstra	1	-1/+2
	perf.data file header Brice Goglin reported that only the first result from a multi-counter perf record --stat run is accurate, the rest looks bogus. A silly mistake made us re-read the first attribute for every recorded attribute. Reported-by: Brice Goglin <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Tested-by: Brice Goglin <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf tools: Fix call-chain cumul hit based sub-total (fractal mode)	Frederic Weisbecker	3	-14/+24
	The callchain fractal mode builds each new total hits in a new branch of profiling by using the parent's hits of the current branch plus the hits of the children. This is wrong, the total hits of a branch should be made of the sum of every children hits, we must ignore the parent hits in this scope. This patch also fixes another mistake with the hit counting. Now the rates are correct. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Pekka Enberg <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf top: Update man page	Mike Galbraith	1	-13/+99
	perf_counter tools: update perf top manual page to reflect current implementation. Signed-off-by: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf top: Improve interactive key handling	Mike Galbraith	1	-35/+73
	Pressing any key which is not currently mapped to functionality, based on startup command line options, displays currently mapped keys, and prompts for input. Pressing any unmapped key at the prompt returns the user to display mode with variables unchanged. eg, pressing ? <SPACE> <ESC> etc displays currently available keys, the value of the variable associated with that key, and prompts. Pressing same again aborts input. Signed-off-by: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter: Fix software counters for fast moving event sources	Peter Zijlstra	1	-70/+94
	Reimplement the software counters to deal with fast moving event sources (such as tracepoints). This means being able to generate multiple overflows from a single 'event' as well as support throttling. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter tools: Allow perf top top users to switch between weighted and ↵	Mike Galbraith	1	-7/+17
	individual counter display Add [w]eighted hotkey. Pressing [w] toggles between displaying weighted total of all counters, and the counter selected via [E]vent select key. ------------------------------------------------------------------------------ PerfTop: 90395 irqs/sec kernel:16.1% [cache-misses/cache-references/instructions], (all, 4 CPUs) ------------------------------------------------------------------------------ weight samples pcnt RIP kernel function ______ _______ _____ ________________ _______________ 1275408.6 10881 - 5.3% - ffffffff81146f70 : copy_page_c 553683.4 43569 - 21.3% - ffffffff81146f20 : clear_page_c 74075.0 6768 - 3.3% - ffffffff81147190 : copy_user_generic_string 40602.9 7538 - 3.7% - ffffffff81284ba2 : _spin_lock 26882.1 965 - 0.5% - ffffffff8109d280 : file_ra_state_init [w] ------------------------------------------------------------------------------ PerfTop: 91221 irqs/sec kernel:14.5% [10000Hz cache-misses], (all, 4 CPUs) ------------------------------------------------------------------------------ weight samples pcnt RIP kernel function ______ _______ _____ ________________ _______________ 47320.00 - 22.3% - ffffffff81146f20 : clear_page_c 14261.00 - 6.7% - ffffffff810992f5 : __rmqueue 11046.00 - 5.2% - ffffffff81146f70 : copy_page_c 7842.00 - 3.7% - ffffffff81284ba2 : _spin_lock 7234.00 - 3.4% - ffffffff810aa1d6 : unmap_vmas Signed-off-by: Mike Galbraith <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter tools: Fix/resurrect perf top annotation in a simple ↵	Mike Galbraith	1	-40/+456
	interactive form perf top used to have annotation support, but it has bitrotted and removed. This patch restores that: it allows the user to select any symbol in kernel space for source level annotation on the fly, switch between event counters and alter display variables. When symbol details are being displayed, stopping annotation reverts to normal. known keys: [d] select display delay. [e] select display entries (lines). [E] select annotation event counter. [f] select normal display count filter. [F] select annotation display count filter (percentage). [qQ] quit. [s] select annotation symbol and start annotation. [S] stop annotation, revert to normal display. [z] toggle event count zeroing. Sample: ------------------------------------------------------------------------------ PerfTop: 16719 irqs/sec kernel:78.7% [cache-misses/cache-references/instructions/cycles], (all, 4 CPUs) ------------------------------------------------------------------------------ Showing cache-misses for e1000_clean_rx_irq Events Pcnt (>=3%) 0 0.0% /* adjust length to remove Ethernet CRC / 0 0.0% if (!(adapter->flags2 & FLAG2_CRC_STRIPPING)) 0 0.0% length -= 4; 436 5.0% f039: 41 f6 84 24 5c 29 00 testb $0x1,0x295c(%r12) 0 0.0% f089: 8b 4d 84 mov -0x7c(%rbp),%ecx 0 0.0% f08c: 48 83 ef 02 sub $0x2,%rdi 0 0.0% f090: 48 83 ee 02 sub $0x2,%rsi 811 9.3% f094: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi) 0 0.0% 0 0.0% while (rx_desc->status & E1000_RXD_STAT_DD) { 0 0.0% f114: 41 f6 47 0c 01 testb $0x1,0xc(%r15) 7226 82.6% f119: 0f 85 24 fe ff ff jne ef43 <e1000_clean_rx_irq+0x84> Available events: 0 cache-misses 1 cache-references 2 instructions 3 cycles Enter details event counter: 2 ------------------------------------------------------------------------------ PerfTop: 15035 irqs/sec kernel:79.0% [cache-misses/cache-references/instructions/cycles], (all, 4 CPUs) ------------------------------------------------------------------------------ Showing instructions for e1000_clean_rx_irq Events Pcnt (>=3%) 0 0.0% int work_done, int work_to_do) 0 0.0% { 175 0.9% eebf: 55 push %rbp 1898 9.8% eec0: 48 89 e5 mov %rsp,%rbp 0 0.0% 0 0.0% i = rx_ring->next_to_clean; 140 0.7% ef0a: 0f b7 41 1a movzwl 0x1a(%rcx),%eax 670 3.4% ef0e: 89 45 ac mov %eax,-0x54(%rbp) 0 0.0% { 0 0.0% memcpy(skb->data + offset, from, len); 91 0.5% f07b: 49 8b b6 e8 00 00 00 mov 0xe8(%r14),%rsi 1153 5.9% f082: 48 8b b8 e8 00 00 00 mov 0xe8(%rax),%rdi 42 0.2% f089: 8b 4d 84 mov -0x7c(%rbp),%ecx 14 0.1% f08c: 48 83 ef 02 sub $0x2,%rdi 0 0.0% f090: 48 83 ee 02 sub $0x2,%rsi 1618 8.3% f094: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi) 0 0.0% 0 0.0% /* return some buffers to hardware, one at a time is too slow */ 0 0.0% if (cleaned_count >= E1000_RX_BUFFER_WRITE) { 867 4.5% f0e7: 83 7d b0 0f cmpl $0xf,-0x50(%rbp) 0 0.0% 0 0.0% while (rx_desc->status & E1000_RXD_STAT_DD) { 37 0.2% f114: 41 f6 47 0c 01 testb $0x1,0xc(%r15) 4047 20.8% f119: 0f 85 24 fe ff ff jne ef43 <e1000_clean_rx_irq+0x84> Signed-off-by: Mike Galbraith <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-09	perf_counter: Fix/complete ftrace event records sampling	Frederic Weisbecker	7	-41/+126
	This patch implements the kernel side support for ftrace event record sampling. A new counter sampling attribute is added: PERF_SAMPLE_TP_RECORD which requests ftrace events record sampling. In this case if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint fires, we emit the tracepoint binary record to the perfcounter event buffer, as a sample. Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf record: perf record -f -F 1 -a -e workqueue:workqueue_execution perf report -D 0x21e18 [0x48]: event: 9 . . ... raw event: size 72 bytes . 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........ . 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!...... . 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve . 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1........... . 0040: e0 b1 31 81 ff ff ff ff ....... . 0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33 The raw ftrace binary record starts at offset 0020. Translation: struct trace_entry { type = 0x2b = 43; flags = 1; preempt_count = 2; pid = 0xa = 10; tgid = 0xa = 10; } thread_comm = "events/1" thread_pid = 0xa = 10; func = 0xffffffff8131b1e0 = flush_to_ldisc() What will come next? - Userspace support ('perf trace'), 'flight data recorder' mode for perf trace, etc. - The unconditional copy from the profiling callback brings some costs however if someone wants no such sampling to occur, and needs to be fixed in the future. For that we need to have an instant access to the perf counter attribute. This is a matter of a flag to add in the struct ftrace_event. - Take care of the events recursivity! Don't ever try to record a lock event for example, it seems some locking is used in the profiling fast path and lead to a tracing recursivity. That will be fixed using raw spinlock or recursivity protection. - [...] - Profit! :-) Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Li Zefan <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Gabriel Munteanu <[email protected]> Cc: Lai Jiangshan <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>