summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2011-05-18perf bench, x86: Add alternatives-asm.h wrapperIngo Molnar
perf bench needs this to build the kernel's memcpy routine: In file included from bench/mem-memcpy-x86-64-asm.S:2:0: bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-17perf: Fix multi-event parsing bugStephane Eranian
This patch fixes an issue with event parsing. The following commit appears to have broken the ability to specify a comma separated list of events: commit ceb53fbf6dbb1df26d38379a262c6981fe73dd36 Author: Ingo Molnar <mingo@elte.hu> Date: Wed Apr 27 04:06:33 2011 +0200 perf stat: Fail more clearly when an invalid modifier is specified This patch fixes this while preserving the desired effect: $ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null Performance counter stats for 'ls /dev/null': 365956 instructions:u # 0.00 insns per cycle 731806 instructions:k # 0.00 insns per cycle 0.001108862 seconds time elapsed $ perf stat -e task-clock-msecs true invalid event modifier: '-msecs' Run 'perf list' for a list of valid events and modifiers Signed-off-by: Stephane Eranian <eranian@google.com> Cc: acme@redhat.com Cc: peterz@infradead.org Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10perf probe: Fix the missed parameter initializationLin Ming
pubname_callback_param::found should be initialized to 0 in fastpath lookup, the structure is on the stack and uninitialized otherwise. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10Merge commit 'v2.6.39-rc7' into perf/coreIngo Molnar
Merge reason: pull in the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-07perf tools: Makefile: Use gcc to determine ARCHLin Ming
The original Makefile uses "uname -m" to determine ARCH. This causes problem on x86 when compile perf tool on 32 bit userspace with a 64 bit kernel. bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages: bench/../../../arch/x86/lib/memcpy_64.S:28: Error: bad register name `%rdi' This is because "uname -m" returns x86_64 and memcpy_64.S is included in 32 bit build. Reported-by: Riccardo Magliocchetti <riccardo.magliocchetti@gmail.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Link: http://lkml.kernel.org/r/1304743274.3132.17.camel@localhost Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-30perf stat: Tell user about unsupported events in the listDavid Ahern
Similar to perf-record, tell user about unsupported events that will not be counted if invoked in verbose mode. e.g., $ perf stat -e dTLB-prefetch-misses -v -- sleep 1 dTLB-prefetch-misses event is not supported by the kernel. dTLB-prefetch-misses: 0 0 0 Performance counter stats for 'sleep 1': <not counted> dTLB-prefetch-misses 1.001884783 seconds time elapsed Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1304114655-10600-1-git-send-email-dsahern@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf list: Fix max event string sizeIngo Molnar
Recent stalled-cycles event names were larger than the 40 chars printout used by perf list. Extend that, make it robust for future extensions and also adjust alignments in face of wider event names. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf stat: Fail softly on unsupported eventsIngo Molnar
David Ahern reported this perf stat failure: > # /tmp/build-perf/perf stat -- sleep 1 > Error: stalled-cycles-frontend event is not supported. > Fatal: Not all events could be opened. > > This is a Dell R410 with an E5620 processor. Fail in a softer fashion on unknown/unsupported events. Reported-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n006io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf stat: Leave more room for percentagesIngo Molnar
Triple digit percentages do not fit otherwise. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n005io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf stat: Adjust stall cycles warning percentagesIngo Molnar
Adjust to color thresholds to better match the percentages seen in real workloads. Both are now a bit more sensitive. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n004io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf stat: Analyze front-end and back-end stall countsIngo Molnar
Sample output: Performance counter stats for './loop_1b': 873.691065 task-clock # 1.000 CPUs utilized 1 context-switches # 0.000 M/sec 1 CPU-migrations # 0.000 M/sec 96 page-faults # 0.000 M/sec 2,012,637,222 cycles # 2.304 GHz (66.58%) 1,001,397,911 stalled-cycles-frontend # 49.76% frontend cycles idle (66.58%) 7,523,398 stalled-cycles-backend # 0.37% backend cycles idle (66.76%) 2,004,551,046 instructions # 1.00 insns per cycle # 0.50 stalled cycles per insn (66.80%) 1,001,304,992 branches # 1146.063 M/sec (66.76%) 39,453 branch-misses # 0.00% of all branches (66.64%) 0.874046121 seconds time elapsed Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n003io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29perf tools: Add front-end and back-end stalled cycles supportIngo Molnar
Update perf tooling to deal with front-end and back-end stalled cycles events. Add both the default 'perf stat' output. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n002io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-28perf stat: Fix compatibility behaviorIngo Molnar
Instead of failing on an unknown event, when new perf stat is run on older kernels: $ ./perf stat true Error: open_counter returned with 22 (Invalid argument). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. Just ignore EINVAL and ENOSYS, we'll print the results as not counted: Performance counter stats for 'true': 0.239483 task-clock # 0.493 CPUs utilized 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 86 page-faults # 0.359 M/sec 704,766 cycles # 2.943 GHz <not counted> stalled-cycles 381,961 instructions # 0.54 insns per cycle 69,626 branches # 290.735 M/sec 4,594 branch-misses # 6.60% of all branches 0.000485883 seconds time elapsed Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio5hjpn3dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-28perf stat: Add --sync/-S optionIngo Molnar
--sync will tell perf stat to run sync() before starting a command. This allows IO-heavy tests to be used with --repeat, without one iteration impacting the other. Elapsed time will stabilize for example: before: 3.971525714 seconds time elapsed ( +- 8.56% ) after: 3.211098537 seconds time elapsed ( +- 1.52% ) So measurements will be more accurate. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-27perf stat: Fix printout vertical alignmentIngo Molnar
Before: | | Performance counter stats for '/home/mingo/hackbench 20' (5 runs): | | 71,321,607 instructions:u # 0.42 insns per cycle ( +- 0.00% ) | 168,040,009 cycles:u # 0.000 GHz ( +- 0.81% ) | | 1.468002368 seconds time elapsed ( +- 1.33% ) | After: | | Performance counter stats for '/home/mingo/hackbench 20' (5 runs): | | 71,321,607 instructions:u # 0.42 insns per cycle ( +- 0.00% ) | 168,040,009 cycles:u # 0.000 GHz ( +- 0.81% ) | | 1.468002368 seconds time elapsed ( +- 1.33% ) | The last column (stddev noise) is properly aligned, vertically. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn0dsrm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Add -d/--detailed flag to run with a lot of eventsIngo Molnar
Add the new -d/--detailed flag, which generates a pretty detailed event list: Performance counter stats for './hackbench 10' (10 runs): 1514.287888 task-clock # 10.897 CPUs utilized ( +- 3.05% ) 39,698 context-switches # 0.026 M/sec ( +- 12.19% ) 8,147 CPU-migrations # 0.005 M/sec ( +- 16.55% ) 17,918 page-faults # 0.012 M/sec ( +- 0.37% ) 2,944,504,050 cycles # 1.944 GHz ( +- 3.89% ) (32.60%) 1,043,971,283 stalled-cycles # 35.45% of all cycles are idle ( +- 5.22% ) (44.48%) 1,655,906,768 instructions # 0.56 insns per cycle # 0.63 stalled cycles per insn ( +- 1.95% ) (55.09%) 338,832,373 branches # 223.757 M/sec ( +- 1.96% ) (64.47%) 3,892,416 branch-misses # 1.15% of all branches ( +- 5.49% ) (73.12%) 606,410,482 L1-dcache-loads # 400.459 M/sec ( +- 1.29% ) (71.21%) 31,204,395 L1-dcache-load-misses # 5.15% of all L1-dcache hits ( +- 3.04% ) (60.43%) 3,922,751 LLC-loads # 2.590 M/sec ( +- 6.80% ) (46.87%) 5,037,288 LLC-load-misses # 3.327 M/sec ( +- 3.56% ) (13.00%) 0.138966828 seconds time elapsed ( +- 4.11% ) This can be used "at a glance" for narrower analysis. -d can also be used in addition to other -e events, to further expand an event list. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-cxs98quixs3qyvdqx3goojc4@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Print out miss/hit ratio for L1 data-cache eventsIngo Molnar
Print out this kind of l1-dcache-misses percentage: Performance counter stats for './bw_tcp localhost': 29,956,262,201 cycles # 3.002 GHz (scaled from 85.14%) 8,255,209,558 stalled-cycles # 27.56% of all cycles are idle (scaled from 86.56%) 1,206,130,308 l1-dcache-misses # 40.49% of all L1-dcache hits (scaled from 86.30%) 2,978,756,779 l1-dcache-refs # 298.512 M/sec (scaled from 70.02%) 8,861,956,159 instructions # 0.30 insns per cycle # 0.93 stalled cycles per insn (scaled from 84.27%) 1,644,306,068 branches # 164.782 M/sec (scaled from 86.43%) 74,778,443 branch-misses # 4.55% of all branches (scaled from 70.69%) 9978.695711 task-clock # 0.693 CPUs utilized 14.404347983 seconds time elapsed And color the result depending on the severity of cache-trashing. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-54gmz0zymaid84zcs7joq02p@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Print branch misses warning colorsIngo Molnar
Print the missed-branches percentage with different warning level ASCII colors, as the percentage passes the 5%/10%/20% thresholds. These thresholds are set to relatively low levels, because on most CPUs even a moderate percentage of branch-misses already shows up as a slowdown. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-ybqukg7p86leiup7gl03ecgk@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Print stalled cycles warning colorsIngo Molnar
Print the stalled-cycles percentage with different warning level ASCII colors, as the percentage passes the 25%/50%/75% thresholds. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-e25zz44rcms7mu9az4fu5zp0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Fix -nan% output in perf stat noise printoutsIngo Molnar
Before: 0 CPU-migrations # 0.000 M/sec ( +- -nan% ) After: 0 CPU-migrations # 0.000 M/sec ( +- 0.00% ) Also factor out the noise printing function. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-z89h2v1bk1mikcbsf7e6v34q@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Add stalled cycles to the default outputIngo Molnar
The new default output looks like this: Performance counter stats for './loop_1b_instructions': 236.010686 task-clock # 0.996 CPUs utilized 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 99 page-faults # 0.000 M/sec 756,487,646 cycles # 3.205 GHz 354,938,996 stalled-cycles # 46.92% of all cycles are idle 1,001,403,797 instructions # 1.32 insns per cycle # 0.35 stalled cycles per insn 100,279,773 branches # 424.895 M/sec 12,646 branch-misses # 0.013 % of all branches 0.236902540 seconds time elapsed We dropped cache-refs and cache-misses and added stalled-cycles - this is a more generic "how well utilized is the CPU" metric. If the stalled-cycles ratio is too high then more specific measurements can be taken to figure out the source of the inefficiency. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Add stalled cycles accounting, prettify the resulting outputIngo Molnar
Add stalled cycles accounting and use it to print the "cycles stalled per instruction" value. Also change the unit of the cycles output from M/sec to GHz - this is more intuitive. Prettify the output to: Performance counter stats for './loop_1b_instructions': 239.775036 task-clock # 0.997 CPUs utilized 761,903,912 cycles # 3.178 GHz 356,620,620 stalled-cycles # 46.81% of all cycles are idle 1,001,578,351 instructions # 1.31 insns per cycle # 0.36 stalled cycles per insn 14,782 cache-references # 0.062 M/sec 5,694 cache-misses # 38.520 % of all cache refs 0.240493656 seconds time elapsed Also adjust the --repeat output to make the percentages align vertically: Performance counter stats for './loop_1b_instructions' (10 runs): 236.096793 task-clock # 0.997 CPUs utilized ( +- 0.011% ) 756,553,086 cycles # 3.204 GHz ( +- 0.002% ) 354,942,692 stalled-cycles # 46.92% of all cycles are idle ( +- 0.008% ) 1,001,389,700 instructions # 1.32 insns per cycle # 0.35 stalled cycles per insn ( +- 0.000% ) 10,166 cache-references # 0.043 M/sec ( +- 0.742% ) 468 cache-misses # 4.608 % of all cache refs ( +- 13.385% ) 0.236874136 seconds time elapsed ( +- 0.01% ) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-uapziqny39601apdmmhoz7hk@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Factor our shadow statsIngo Molnar
Create update_shadow_stats() which is then used in both read_counter_aggr() and read_counter(). This not only simplifies the code but also fixes a bug: HW_CACHE_REFERENCES was not updated in read_counter(). Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-9uc55z3g88r47exde7zxjm6p@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Make all displayed event names parseable as wellIngo Molnar
Right now we display this by default: 0.202204 task-clock-msecs # 0.282 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 85 page-faults # 0.420 M/sec The task-clock-msecs event cannot actually be passed back as an event name, the event name we recognize is 'task-clock'. So change the output of the cpu-clock and task-clock events to be idempotent. ( Units should be printed out in the right-side column, if needed. ) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Fail more clearly when an invalid modifier is specifiedIngo Molnar
Currently we fail without printing any error message on "perf stat -e task-clock-msecs". The reason is that the task-clock event is matched and the "-msecs" postfix is assumed to be an event modifier - but is not recognized. This patch changes the code to be more informative: $ perf stat -e task-clock-msecs true invalid event modifier: '-msecs' Run 'perf list' for a list of valid events and modifiers And restructures the return value of parse_event_modifier() to allow the printing of all variants of invalid event modifiers. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf tools: Accept case-insensitive symbolic event variantsIngo Molnar
We currently fail on something like '-e CPU-migrations', with: invalid or unsupported event: 'CPU-migrations' While 'CPU-migrations' is how we actually print out the event in the default perf stat output: Performance counter stats for 'true': 0.202204 task-clock-msecs # 0.282 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec So change the matching to be case-insensitive. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Print cache misses as percentageIngo Molnar
Before: 113,393,041 cache-references # 83.636 M/sec 7,052,454 cache-misses # 5.202 M/sec After: 112,589,441 cache-references # 87.925 M/sec 6,556,354 cache-misses # 5.823 % misses/hits percentages are more expressive than absolute numbers or rates. (Also prettify the CPUs printout line to not have a trailing whitespace.) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-axm28f43x439bl41zkvfzd63@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26perf stat: Print stalled cycles percentageIngo Molnar
Print: 611,527 cycles 400,553 instructions # ( 0.71 instructions per cycle ) 77,809 stalled-cycles # ( 12.71% of all cycles ) 0.000610987 seconds time elapsed Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/n/tip-fd6x8r1cpyb6zhlrc4ix8m45@git.kernel.org
2011-04-26perf events: Add stalled cycles generic event - PERF_COUNT_HW_STALLED_CYCLESIngo Molnar
The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate cycles the CPU does nothing useful, because it is stalled on a cache-miss or some other condition. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-fue11vymwqsoo5to72jxxjyl@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-20perf script: improve validation of sample attributes for output fieldsDavid Ahern
Check for required sample attributes using evsel rather than sample_type in the session header. If the attribute for a default field is not present for the event type (e.g., new command operating on file from older kernel) the field is removed from the output list. Expected event types must exist. For example, if a user specifies -f trace:time,trace -f sw:time,cpu,sym the perf.data file must contain both tracepoints and software events (ie., it is an error if either does not exist in the file). Attribute checking is done once at the beginning of perf-script rather than for each sample. v1 -> v2: - addressed comments from acme Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.com Signed-off-by: David Ahern <daahern@cisco.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19perf script: Add support for PERF_TYPE_RAWArun Sharma
Useful for getting stack traces for hardware events not handled by PERF_TYPE_HARDWARE. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arun Sharma <asharma@fb.com> Link: http://lkml.kernel.org/n/tip-qimdcdpekjqxuyqovy4kjusx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19perf tools: git mv tools/perf/{features-tests.mak,config/}Michael Witten
Signed-off-by: Michael Witten <mfwitten@gmail.com> Link: http://lkml.kernel.org/n/tip-a6zhefjayuounko1tk5sjji2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19perf tools: Move `try-cc'Michael Witten
The `try-cc' user-defined function was in tools/perf/feature-tests.mak; this commit moves it to tools/perf/config/utilities.mak. Signed-off-by: Michael Witten <mfwitten@gmail.com> Link: http://lkml.kernel.org/n/tip-bqhwcuxsrve0iodn6q4ejaoi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19perf tools: Makefile: PYTHON{,_CONFIG} to bandage Python 3 incompatibilityMichael Witten
Currently, Python 3 is not supported by perf's code; this can cause the build to fail for systems that have Python 3 installed as the default python: python{,-config} The Correct Solution is to write compatibility code so that Python 3 works out-of-the-box. However, users often have an ancillary Python 2 installed: python2{,-config} Therefore, a quick fix is to allow the user to specify those ancillary paths as the python binaries that Makefile should use, thereby avoiding Python 3 altogether; as an added benefit, the Python binaries may be installed in non-standard locations without the need for updating any PATH variable. This commit adds the ability to set PYTHON and/or PYTHON_CONFIG either as environment variables or as make variables on the command line; the paths may be relative, and usually only PYTHON is necessary in order for PYTHON_CONFIG to be defined implicitly. Some rudimentary error checking is performed when the user explicitly specifies a value for any of these variables. In addition, this commit introduces significantly robust makefile infrastructure for working with paths and communicating with the shell; it's currently only used for handling Python, but I hope it will prove useful in refactoring the makefiles. Thanks to: Raghavendra D Prabhu <rprabhu@wnohang.net> for motivating this patch. Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net> Link: http://lkml.kernel.org/r/e987828e-87ec-4973-95e7-47f10f5d9bab-mfwitten@gmail.com Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19perf tools: Makefile: Clean up `python/perf.so' ruleMichael Witten
There is no need for a subshell or an explicit `export'; as per the POSIX Shell Command Language specification: http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_01 http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_10_02 It is only necessary to include the environment variable assignment just before the command to be run. Also, it is better to use single-quotes, because GNU make might expand `$(BASIC_CFLAGS)' into something that the shell could interpret within double-quotes. Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net> Link: http://lkml.kernel.org/n/tip-58n38o02ocuzrm9qh096hsf5@git.kernel.org Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19perf symbols: Give more useful names to 'self' parametersArnaldo Carvalho de Melo
One more installment on an area that is mostly dormant. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: we'll be queueing up dependent changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-15perf evsel: Fix use of inheritArnaldo Carvalho de Melo
perf stat doesn't mmap and its perfectly fine for it to use task-bound counters with inheritance. So set the attr.inherit on the caller and leave the syscall itself to validate it. When the mmap fails perf_evlist__mmap will just emit a warning if this is the failure reason. Reported-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-15perf hists browser: Fix seg fault when annotate null symbolLin Ming
In hists browser, press hotkey 'a' to annotate current symbol. Now it causes segment fault if 'a' is pressed on a null symbol. Here are 2 small bugs: - In perf_evsel__hists_browse, the condition check after 'a' is pressed is not correct, we should check ->sym instead of ->map. - In symbol__tui_annotate we must check whether sym is NULL or not before getting annotation structure. This patch fixes above 2 small bugs. Link: http://lkml.kernel.org/r/1302244286.4106.36.camel@minggr.sh.intel.com Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-08perf: Fix a build error with some GCC versionsEric Dumazet
Fix this: util/cgroup.c: In function ‘open_cgroup’: util/cgroup.c:16:16: error: ‘saved_ptr’ may be used uninitialized in this function util/cgroup.c:16:16: note: ‘saved_ptr’ was declared here Apparently newer GCC (4.6) can figure out that this variable is properly initialized - but some versions of GCC (such as 4.5.2) need help. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-07Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', ↵Linus Torvalds
'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-32, fpu: Fix FPU exception handling on non-SSE systems x86, hibernate: Initialize mmu_cr4_features during boot x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change x86: visws: Fixup irq overhaul fallout * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Clean up rebalance_domains() load-balance interval calculation * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/mrst/vrtc: Fix boot crash in mrst_rtc_init() rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm() * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Fix cpumask leak in __setup_irq() * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf probe: Fix listing incorrect line number with inline function perf probe: Fix to find recursively inlined function perf probe: Fix multiple --vars options behavior perf probe: Fix to remove redundant close perf probe: Fix to ensure function declared file
2011-04-07Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings
2011-04-05perf probe: Fix listing incorrect line number with inline functionMasami Hiramatsu
Fix a bug showing incorrect line number when a probe is put on the head of an inline function. This patch updates find_perf_probe_point() and introduces new rules to get correct line number. - If debuginfo doesn't have a correct file name, we shouldn't return line number too, because, without file name, line number is meaningless. - If the address is in a function, it stores the function name and the offset from the function entry. - If the address is on a line, it tries to get the relative line number from the function entry line, except for the address is same as the entry address of the function (in this case, the relative line number should be 0). - If the address is in an inline function entry (call-site), it uses the inline function call line number as the line on which the address is. - If the address is in an inline function body, it stores the inline function name and offset from the inline function call site instead of the (non-inlined) function. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110330092605.2132.11629.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05perf probe: Fix to find recursively inlined functionMasami Hiramatsu
Fix die_find_inlinefunc() to return correct innermost inlined function at given address. Without this fix, it returns the outermost inlined function. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110330092559.2132.78634.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05perf probe: Fix multiple --vars options behaviorMasami Hiramatsu
Fix a bug that perf-probe fails to initialize libdwfl and shows incorrect error when user gives multiple --vars options. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110330092553.2132.42691.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05perf probe: Fix to remove redundant closeMasami Hiramatsu
Since dwfl_end() closes given fd with dwfl, caller doesn't need to close its fd when finishing process. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110330092547.2132.93728.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05perf probe: Fix to ensure function declared fileMasami Hiramatsu
Fix to ensure function declared file matches given file name. This fixes a potential bug. As I've commented on Lin Ming's fastpath enhancement, decl_file should be checked on each probe point if user gives a probe point as func@file. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110330092541.2132.3584.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31perf: mmap 512 kiB by defaultFrederic Weisbecker
The default setting of perf record is to mmap 128 pages if the user did not override with -m. However the page size may vary accross different architecture settings, giving different default size between each. Moreover the kernel side still has a default max number of mlocked pages of 512 kiB + 1 page for unprivileged users. 128 + 1 pages with page size > 4096 overlaps this threshold. Thus, better adapt to this limitation and set the default number of pages to fit those 512 kiB + 1 page. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-30perf script: Add more documentation about the -f/--fields parametersArnaldo Carvalho de Melo
Using the commit log for 2c9e45f. Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>