- perf-record(1)
- ==============
- NAME
- ----
- perf-record - Run a command and record its profile into perf.data
- SYNOPSIS
- --------
- [verse]
- 'perf record' [-e <EVENT> | --event=EVENT] [-l] [-a] <command>
- 'perf record' [-e <EVENT> | --event=EVENT] [-l] [-a] -- <command> [<options>]
- DESCRIPTION
- -----------
- This command runs a command and gathers a performance counter profile
- from it, into perf.data - without displaying anything.
- This file can then be inspected later on, using 'perf report'.
- OPTIONS
- -------
- <command>...::
- Any command you can specify in a shell.
- -e::
- --event=::
- Select the PMU event. Selection can be:
- - a symbolic event name (use 'perf list' to list all events)
- - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
- - a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where
- 'param1', 'param2', etc are defined as formats for the PMU in
- /sys/bus/event_source/devices/<pmu>/format/*.
- - a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
- where M, N, K are numbers (in decimal, hex, or octal format). Acceptable
- values for each of 'config', 'config1' and 'config2' are defined by
- corresponding entries in /sys/bus/event_source/devices/<pmu>/format/*
- param1 and param2 are defined as formats for the PMU in:
- /sys/bus/event_source/devices/<pmu>/format/*
- There are also some params which are not defined in .../<pmu>/format/*.
- These params can be used to overload default config values per event.
- Here is a list of the params.
- - 'period': Set event sampling period
- - 'freq': Set event sampling frequency
- - 'time': Disable/enable time stamping. Acceptable values are 1 to
- enable time stamping and 0 to disable it.
- The default is 1.
- - 'call-graph': Disable/enable call graphs. Acceptable strings are "fp" for
- FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
- "no" to disable call graphs.
- - 'stack-size': user stack size for dwarf mode
- Note: If the user explicitly sets options which conflict with the params,
- the values set by the params will be overridden.
- Also not defined in .../<pmu>/format/* are PMU driver specific
- configuration parameters. Any configuration parameter preceded by
- the letter '@' is not interpreted in user space and sent down directly
- to the PMU driver. For example:
- perf record -e some_event/@cfg1,@cfg2=config/ ...
- will see 'cfg1' and 'cfg2=config' pushed to the PMU driver associated
- with the event for further processing. There is no restriction on
- what the configuration parameters are, as long as their semantics are
- understood and supported by the PMU driver.
- - a hardware breakpoint event in the form of '\mem:addr[/len][:access]'
- where addr is the address in memory you want to break in.
- Access is the memory access type (read, write, execute); it can
- be passed as follows: '\mem:addr[:[r][w][x]]'. len is the range,
- the number of bytes from the specified addr, which the breakpoint will cover.
- If you want to profile read-write accesses at 0x1000, just set
- 'mem:0x1000:rw'.
- If you want to profile write accesses in [0x1000, 0x1008), just set
- 'mem:0x1000/8:w'.
- - a group of events surrounded by a pair of braces ("{event1,event2,...}").
- Each event is separated by commas and the group should be quoted to
- prevent shell interpretation. You also need to use --group on
- "perf report" to view group events together.
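As a rough illustration of the selector forms listed above, the following shell sketch only prints candidate command lines; it does not run perf. The event names, the raw code `r1a8` and the workload `./myprog` are placeholders chosen for this example, and the helper `bp_range` is hypothetical: it only shows which byte range a 'mem:addr/len' breakpoint covers.

```shell
# Hypothetical helper: print the half-open byte range covered by a
# 'mem:addr/len' breakpoint, given addr and len.
bp_range() {
    addr=$1
    len=$2
    printf '[0x%x, 0x%x)\n' "$(($addr))" "$(($addr + $len))"
}

workload='./myprog'   # placeholder workload, not part of perf

# One selector per form: symbolic name, raw rNNN code, two breakpoints, a group.
for ev in 'cycles' 'r1a8' 'mem:0x1000:rw' 'mem:0x1000/8:w' '{cycles,instructions}'; do
    # Quoting the selector keeps the shell from interpreting the braces.
    echo "perf record -e '$ev' -- $workload"
done

# Range covered by 'mem:0x1000/8:w'
bp_range 0x1000 8
```

Removing the `echo` would run the commands for real, which assumes that perf is installed and that the events exist on the target machine.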
- --filter=<filter>::
- Event filter. This option should follow an event selector (-e) which
- selects either tracepoint event(s) or a hardware trace PMU
- (e.g. Intel PT or CoreSight).
- - tracepoint filters
- In the case of tracepoints, multiple '--filter' options are combined
- using '&&'.
- - address filters
- A hardware trace PMU advertises its ability to accept a number of
- address filters by specifying a non-zero value in
- /sys/bus/event_source/devices/<pmu>/nr_addr_filters.
- Address filters have the format:
- filter|start|stop|tracestop <start> [/ <size>] [@<file name>]
- Where:
- - 'filter': defines a region that will be traced.
- - 'start': defines an address at which tracing will begin.
- - 'stop': defines an address at which tracing will stop.
- - 'tracestop': defines a region in which tracing will stop.
- <file name> is the name of the object file, <start> is the offset to the
- code to trace in that file, and <size> is the size of the region to
- trace. 'start' and 'stop' filters need not specify a <size>.
- If no object file is specified then the kernel is assumed, in which case
- the start address must be a current kernel memory address.
- <start> can also be specified by providing the name of a symbol. If the
- symbol name is not unique, it can be disambiguated by inserting #n where
- 'n' selects the n'th symbol in address order. Alternately #0, #g or #G
- select only a global symbol. <size> can also be specified by providing
- the name of a symbol, in which case the size is calculated to the end
- of that symbol. For 'filter' and 'tracestop' filters, if <size> is
- omitted and <start> is a symbol, then the size is calculated to the end
- of that symbol.
- If <size> is omitted and <start> is '*', then the start and size will
- be calculated from the first and last symbols, i.e. to trace the whole
- file.
- If symbol names (or '*') are provided, they must be surrounded by white
- space.
- The filter passed to the kernel is not necessarily the same as entered.
- To see the filter that is passed, use the -v option.
- The kernel may not be able to configure a trace region if it is not
- within a single mapping. MMAP events (or /proc/<pid>/maps) can be
- examined to determine if that is a possibility.
- Multiple filters can be separated with space or comma.
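The address filter format above can be assembled mechanically. The sketch below prints filter strings that follow the stated layout literally; it does not run perf. The object file `/usr/bin/myprog`, the offsets and the helper name `addr_filter` are hypothetical, and `intel_pt//u` is used only as an example of a hardware trace PMU selector.

```shell
# Hypothetical helper: build "<action> <start> [/ <size>] [@<file name>]".
# Pass an empty string for size or file to leave that part out.
addr_filter() {
    action=$1
    start=$2
    size=$3
    file=$4
    out="$action $start"
    if [ -n "$size" ]; then
        out="$out / $size"
    fi
    if [ -n "$file" ]; then
        out="$out @$file"
    fi
    echo "$out"
}

# A traced region of 0x200 bytes starting at offset 0x1000 in the object file.
region=$(addr_filter filter 0x1000 0x200 /usr/bin/myprog)
# A stop address; 'start' and 'stop' filters need no size.
stop=$(addr_filter stop 0x1200 '' /usr/bin/myprog)

echo "$region"
echo "$stop"
echo "perf record -e intel_pt//u --filter '$region' -- /usr/bin/myprog"
```

Adding -v to the real command shows the filter that is actually passed to the kernel, which may differ from the string entered.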
- --exclude-perf::
- Don't record events issued by perf itself. This option should follow
- an event selector (-e) which selects tracepoint event(s). It adds a
- filter expression 'common_pid != $PERFPID' to filters. If other
- '--filter' exists, the new filter expression will be combined with
- them by '&&'.
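To make the combination rule concrete, the sketch below joins filter expressions with '&&' as the text describes. It illustrates the resulting expression and is not perf's own code: the helper `combine_filters` and the `prev_pid != 0` expression are assumptions for this example, the parentheses are added only for readability, and `$$` (the shell's PID) stands in for perf's PID.

```shell
# Hypothetical helper: AND together any number of filter expressions.
combine_filters() {
    out=''
    for expr in "$@"; do
        if [ -z "$out" ]; then
            out="($expr)"
        else
            out="$out && ($expr)"
        fi
    done
    echo "$out"
}

user_filter='prev_pid != 0'       # assumed expression given with --filter
perf_filter="common_pid != $$"    # what --exclude-perf adds; $$ stands in for $PERFPID

combine_filters "$user_filter" "$perf_filter"
```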
- -a::
- --all-cpus::
- System-wide collection from all CPUs.
- -p::
- --pid=::
- Record events on existing process ID (comma separated list).
- -t::
- --tid=::
- Record events on existing thread ID (comma separated list).
- This option also disables inheritance by default. Enable it by adding
- --inherit.
- -u::
- --uid=::
- Record events in threads owned by uid. Name or number.
- -r::
- --realtime=::
- Collect data with this RT SCHED_FIFO priority.
- --no-buffering::
- Collect data without buffering.
- -c::
- --count=::
- Event period to sample.
- -o::
- --output=::
- Output file name.
- -i::
- --no-inherit::
- Child tasks do not inherit counters.
- -F::
- --freq=::
- Profile at this frequency.
- -m::
- --mmap-pages=::
- Number of mmap data pages (must be a power of two) or size
- specification with appended unit character - B/K/M/G. The
- size is rounded up to the nearest power-of-two number of pages.
- Also, by adding a comma, the number of mmap pages for AUX
- area tracing can be specified.
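The rounding rule for size specifications can be worked through with a little arithmetic. The following is a sketch of that calculation only, not perf's implementation; the helper name `pages_for_size` and the 4096-byte default page size are assumptions (the real page size can be checked with `getconf PAGESIZE`).

```shell
# Hypothetical helper: convert a size in bytes to a power-of-two page count.
pages_for_size() {
    bytes=$1
    page_size=${2:-4096}     # assumed page size
    # Round up to a whole number of pages.
    pages=$(( ($bytes + $page_size - 1) / $page_size ))
    # Round the page count up to the next power of two.
    pow2=1
    while [ "$pow2" -lt "$pages" ]; do
        pow2=$(($pow2 * 2))
    done
    echo "$pow2"
}

pages_for_size $((1024 * 1024))   # 1M is exactly 256 pages of 4096 bytes
pages_for_size $((3 * 4096))      # 3 pages are rounded up to 4
```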
- --group::
- Put all events in a single event group. This precedes the --event
- option and remains only for backward compatibility. See --event.
- -g::
- Enables call-graph (stack chain/backtrace) recording.
- --call-graph::
- Setup and enable call-graph (stack chain/backtrace) recording,
- implies -g. Default is "fp".
- Allows specifying "fp" (frame pointer) or "dwarf"
- (DWARF's CFI - Call Frame Information) or "lbr"
- (Hardware Last Branch Record facility) as the method to collect
- the information used to show the call graphs.
- In some systems, where binaries are built with gcc
- -fomit-frame-pointer, using the "fp" method will produce bogus
- call graphs; "dwarf", if available (perf tools linked to
- the libunwind or libdw library), should be used instead.
- Using the "lbr" method doesn't require any compiler options. It
- will produce call graphs from the hardware LBR registers. The
- main limitation is that it is only available on new Intel
- platforms, such as Haswell. It can only get the user call chain. It
- doesn't work with branch stack sampling at the same time.
- When "dwarf" recording is used, perf also records a (user) stack dump
- when sampled. The default size of the stack dump is 8192 (bytes).
- The user can change the size by passing it after a comma, like
- "--call-graph dwarf,4096".
- -q::
- --quiet::
- Don't print any message, useful for scripting.
- -v::
- --verbose::
- Be more verbose (show counter open errors, etc).
- -s::
- --stat::
- Record per-thread event counts. Use it with 'perf report -T' to see
- the values.
- -d::
- --data::
- Record the sample addresses.
- -T::
- --timestamp::
- Record the sample timestamps. Use it with 'perf report -D' to see the
- timestamps, for instance.
- -P::
- --period::
- Record the sample period.
- --sample-cpu::
- Record the sample cpu.
- -n::
- --no-samples::
- Don't sample.
- -R::
- --raw-samples::
- Collect raw sample records from all opened counters (default for tracepoint counters).
- -C::
- --cpu::
- Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a
- comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
- In per-thread mode with inheritance mode on (default), samples are captured only when
- the thread executes on the designated CPUs. Default is to monitor all CPUs.
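The CPU list syntax (commas for individual CPUs, '-' for ranges) can be expanded as follows. This is a sketch of the notation only, not the parser perf uses; the helper name `expand_cpu_list` is hypothetical.

```shell
# Hypothetical helper: expand a list such as "0,2-4" into "0 2 3 4".
# The body runs in a subshell so the IFS change does not leak out.
expand_cpu_list() (
    result=''
    IFS=','
    for item in $1; do
        case $item in
            *-*) first=${item%-*}; last=${item#*-} ;;   # a range like 0-2
            *)   first=$item; last=$item ;;             # a single CPU
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            result="$result${result:+ }$cpu"
            cpu=$(($cpu + 1))
        done
    done
    echo "$result"
)

expand_cpu_list 0,1     # the two CPUs 0 and 1
expand_cpu_list 0-2     # the range 0, 1, 2
```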
- -B::
- --no-buildid::
- Do not save the build ids of binaries in the perf.data files. This skips
- post processing after recording, which sometimes makes the final step in
- the recording process take a long time, as it needs to process all
- events looking for mmap records. The downside is that it can misresolve
- symbols if the workload binaries used when recording get locally rebuilt
- or upgraded, because the only key available in this case is the
- pathname. You can also set the "record.build-id" config variable to
- 'skip' to have this behaviour permanently.
- -N::
- --no-buildid-cache::
- Do not update the buildid cache. This saves some overhead in situations
- where the information in the perf.data file (which includes buildids)
- is sufficient. You can also set the "record.build-id" config variable to
- 'no-cache' to have the same effect.
- -G name,...::
- --cgroup name,...::
- Monitor only in the container (cgroup) called "name". This option is available only
- in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
- container "name" are monitored when they run on the monitored CPUs. Multiple cgroups
- can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup
- to first event, second cgroup to second event and so on. It is possible to provide
- an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
- corresponding events, i.e., they always refer to events defined earlier on the command
- line.
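The positional pairing of cgroups with events, including the empty entry in '-G foo,,bar', can be spelled out with a short sketch. The helper `pair_cgroups` is hypothetical, the event names are examples, and the `<all>` label is this example's own way of marking an event that is monitored all the time.

```shell
# Hypothetical helper: pair the Nth event with the Nth cgroup.
# The body runs in a subshell so IFS and 'set -f' stay local.
pair_cgroups() (
    events=$1
    cgroups=$2
    set -f              # no filename expansion while splitting
    IFS=','
    set -- $cgroups     # positional parameters now hold the cgroup names
    for ev in $events; do
        echo "$ev -> ${1:-<all>}"
        if [ "$#" -gt 0 ]; then
            shift
        fi
    done
)

pair_cgroups 'cycles,instructions,cache-misses' 'foo,,bar'
echo "perf record -a -e cycles,instructions,cache-misses -G foo,,bar -- sleep 10"
```

The second line only prints a matching command line; -a is included because the option is available only in per-cpu mode.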
- -b::
- --branch-any::
- Enable taken branch stack sampling. Any type of taken branch may be sampled.
- This is a shortcut for --branch-filter any. See --branch-filter for more information.
- -j::
- --branch-filter::
- Enable taken branch stack sampling. Each sample captures a series of consecutive
- taken branches. The number of branches captured with each sample depends on the
- underlying hardware, the type of branches of interest, and the executed code.
- It is possible to select the types of branches captured by enabling filters. The
- following filters are defined:
- - any: any type of branches
- - any_call: any function call or system call
- - any_ret: any function return or system call return
- - ind_call: any indirect branch
- - call: direct calls, including far (to/from kernel) calls
- - u: only when the branch target is at the user level
- - k: only when the branch target is in the kernel
- - hv: only when the target is at the hypervisor level
- - in_tx: only when the target is in a hardware transaction
- - no_tx: only when the target is not in a hardware transaction
- - abort_tx: only when the target is a hardware transaction abort
- - cond: conditional branches
- +
- The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
- The privilege levels may be omitted, in which case, the privilege levels of the associated
- event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
- levels are subject to permissions. When sampling on multiple events, branch stack sampling
- is enabled for all the sampling events. The sampled branch type is the same for all events.
- The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
- Note that this feature may not be available on all processors.
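The requirement that a filter list names at least one branch type can be checked before invoking perf. The sketch below follows the list of required types given in the text rather than perf's own parser; the helper name `has_branch_type` is hypothetical.

```shell
# Hypothetical helper: succeed if the comma-separated filter list contains
# at least one of: any, any_call, any_ret, ind_call, cond.
# The body runs in a subshell, so 'exit' only sets the helper's status.
has_branch_type() (
    set -f
    IFS=','
    for f in $1; do
        case $f in
            any|any_call|any_ret|ind_call|cond) exit 0 ;;
        esac
    done
    exit 1
)

for spec in 'any_ret,u,k' 'u,k'; do
    if has_branch_type "$spec"; then
        echo "usable:  --branch-filter $spec"
    else
        echo "no branch type in: $spec"
    fi
done
```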
- --weight::
- Enable weighted sampling. An additional weight is recorded per sample and can be
- displayed with the weight and local_weight sort keys. This currently works for TSX
- abort events and some memory events in precise mode on modern Intel CPUs.
- --transaction::
- Record transaction flags for transaction related events.
- --per-thread::
- Use per-thread mmaps. By default per-cpu mmaps are created. This option
- overrides that and uses per-thread mmaps. A side-effect of that is that
- inheritance is automatically disabled. --per-thread is ignored with a warning
- if combined with -a or -C options.
- -D::
- --delay=::
- After starting the program, wait msecs before measuring. This is useful to
- filter out the startup phase of the program, which is often very different.
- -I::
- --intr-regs::
- Capture machine state (registers) at interrupt, i.e., on counter overflows for
- each sample. List of captured registers depends on the architecture. This option
- is off by default. It is possible to select the registers to sample using their
- symbolic names, e.g. on x86, ax, si. To list the available registers use
- --intr-regs=\?. To name registers, pass a comma separated list such as
- --intr-regs=ax,bx. The list of registers is architecture dependent.
- --running-time::
- Record running and enabled time for read events (:S)
- -k::
- --clockid::
- Sets the clock id to use for the various time fields in the perf_event_type
- records. See clock_gettime(). In particular CLOCK_MONOTONIC and
- CLOCK_MONOTONIC_RAW are supported, some events might also allow
- CLOCK_BOOTTIME, CLOCK_REALTIME and CLOCK_TAI.
- -S::
- --snapshot::
- Select AUX area tracing Snapshot Mode. This option is valid only with an
- AUX area tracing event. Optionally the number of bytes to capture per
- snapshot can be specified. In Snapshot Mode, trace data is captured only when
- signal SIGUSR2 is received.
- --proc-map-timeout::
- When processing /proc/XXX/maps for pre-existing threads, it may take a long time,
- because the file may be huge. A timeout is needed in such cases.
- This option sets the timeout limit. The default value is 500 ms.
- --switch-events::
- Record context switch events i.e. events of type PERF_RECORD_SWITCH or
- PERF_RECORD_SWITCH_CPU_WIDE.
- --clang-path=PATH::
- Path to clang binary to use for compiling BPF scriptlets.
- (enabled when BPF support is on)
- --clang-opt=OPTIONS::
- Options passed to clang when compiling BPF scriptlets.
- (enabled when BPF support is on)
- --vmlinux=PATH::
- Specify vmlinux path which has debuginfo.
- (enabled when BPF prologue is on)
- --buildid-all::
- Record the build-id of all DSOs regardless of whether they are actually hit or not.
- --all-kernel::
- Configure all used events to run in kernel space.
- --all-user::
- Configure all used events to run in user space.
- --timestamp-filename::
- Append timestamp to output file name.
- --switch-output::
- Generate multiple perf.data files, timestamp prefixed, switching to a new one
- when receiving a SIGUSR2.
- A possible use case is to slice the perf.data file when an external event
- occurs. Each slice can then be processed, possibly via a perf script, to
- decide whether that particular perf.data snapshot should be kept or not.
- Implies --timestamp-filename, --no-buildid and --no-buildid-cache.
- --dry-run::
- Parse options then exit. --dry-run can be used to detect errors in cmdline
- options.
- 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
- in config file is set to true.
- --tail-synthesize::
- Instead of collecting non-sample events (for example, fork, comm, mmap) at
- the beginning of record, collect them during finalizing an output file.
- The collected non-sample events reflect the status of the system when
- record is finished.
- --overwrite::
- Makes all events use an overwritable ring buffer. An overwritable ring
- buffer works like a flight recorder: when it gets full, the kernel will
- overwrite the oldest records, that thus will never make it to the
- perf.data file.
- When '--overwrite' and '--switch-output' are used, perf records and drops
- events until it receives a signal, meaning that something unusual was
- detected that warrants taking a snapshot of the most current events,
- those fitting in the ring buffer at that moment.
- 'overwrite' attribute can also be set or canceled for an event using
- config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
- Implies --tail-synthesize.
- SEE ALSO
- --------
- linkperf:perf-stat[1], linkperf:perf-list[1]