twx-linux/kernel/trace
Steven Rostedt a485ea9e3e tracing: Fix irqsoff and wakeup latency tracers when using function graph
The function graph infrastructure has become generic so that kretprobes and
BPF can use it along with the function graph tracer itself. Some of that
infrastructure was specific to function graph tracing, such as recording
the calltime and return time of the functions. Calling the clock code on a
high-volume function does add overhead. The calculation of the calltime
was therefore removed from the generic code and placed into the function
graph tracer itself, so that the other users, which did not need that
timestamp, would not incur this overhead.

The calltime field was still kept in the generic return entry structure,
and the function graph return entry callback filled it in, as that
structure was passed to other code.

But this broke both the irqsoff and wakeup latency tracers, as they still
depended on the trace structure containing the calltime when the option
display-graph is set, since they used some of the same functions that the
function graph tracer used. Now the calltime was never set and remained
zero. This caused the calculated function time to be the absolute value of
the return timestamp and not the duration of the function.

 # cd /sys/kernel/tracing
 # echo 1 > options/display-graph
 # echo irqsoff > current_tracer

The tracers went from:

 #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
 #      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   4)    <idle>-0    |  d..1. |   0.000 us    |  irqentry_enter();
        3 us |   4)    <idle>-0    |  d..2. |               |  irq_enter_rcu() {
        4 us |   4)    <idle>-0    |  d..2. |   0.431 us    |    preempt_count_add();
        5 us |   4)    <idle>-0    |  d.h2. |               |    tick_irq_enter() {
        5 us |   4)    <idle>-0    |  d.h2. |   0.433 us    |      tick_check_oneshot_broadcast_this_cpu();
        6 us |   4)    <idle>-0    |  d.h2. |   2.426 us    |      ktime_get();
        9 us |   4)    <idle>-0    |  d.h2. |               |      tick_nohz_stop_idle() {
       10 us |   4)    <idle>-0    |  d.h2. |   0.398 us    |        nr_iowait_cpu();
       11 us |   4)    <idle>-0    |  d.h1. |   1.903 us    |      }
       11 us |   4)    <idle>-0    |  d.h2. |               |      tick_do_update_jiffies64() {
       12 us |   4)    <idle>-0    |  d.h2. |               |        _raw_spin_lock() {
       12 us |   4)    <idle>-0    |  d.h2. |   0.360 us    |          preempt_count_add();
       13 us |   4)    <idle>-0    |  d.h3. |   0.354 us    |          do_raw_spin_lock();
       14 us |   4)    <idle>-0    |  d.h2. |   2.207 us    |        }
       15 us |   4)    <idle>-0    |  d.h3. |   0.428 us    |        calc_global_load();
       16 us |   4)    <idle>-0    |  d.h3. |               |        _raw_spin_unlock() {
       16 us |   4)    <idle>-0    |  d.h3. |   0.380 us    |          do_raw_spin_unlock();
       17 us |   4)    <idle>-0    |  d.h3. |   0.334 us    |          preempt_count_sub();
       18 us |   4)    <idle>-0    |  d.h1. |   1.768 us    |        }
       18 us |   4)    <idle>-0    |  d.h2. |               |        update_wall_time() {
      [..]

To:

 #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
 #      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   5)    <idle>-0    |  d.s2. |   0.000 us    |  _raw_spin_lock_irqsave();
        0 us |   5)    <idle>-0    |  d.s3. |   312159583 us |      preempt_count_add();
        2 us |   5)    <idle>-0    |  d.s4. |   312159585 us |      do_raw_spin_lock();
        3 us |   5)    <idle>-0    |  d.s4. |               |      _raw_spin_unlock() {
        3 us |   5)    <idle>-0    |  d.s4. |   312159586 us |        do_raw_spin_unlock();
        4 us |   5)    <idle>-0    |  d.s4. |   312159587 us |        preempt_count_sub();
        4 us |   5)    <idle>-0    |  d.s2. |   312159587 us |      }
        5 us |   5)    <idle>-0    |  d.s3. |               |      _raw_spin_lock() {
        5 us |   5)    <idle>-0    |  d.s3. |   312159588 us |        preempt_count_add();
        6 us |   5)    <idle>-0    |  d.s4. |   312159589 us |        do_raw_spin_lock();
        7 us |   5)    <idle>-0    |  d.s3. |   312159590 us |      }
        8 us |   5)    <idle>-0    |  d.s4. |   312159591 us |      calc_wheel_index();
        9 us |   5)    <idle>-0    |  d.s4. |               |      enqueue_timer() {
        9 us |   5)    <idle>-0    |  d.s4. |               |        wake_up_nohz_cpu() {
       11 us |   5)    <idle>-0    |  d.s4. |               |          native_smp_send_reschedule() {
       11 us |   5)    <idle>-0    |  d.s4. |   312171987 us |            default_send_IPI_single_phys();
    12408 us |   5)    <idle>-0    |  d.s3. |   312171990 us |          }
    12408 us |   5)    <idle>-0    |  d.s3. |   312171991 us |        }
    12409 us |   5)    <idle>-0    |  d.s3. |   312171991 us |      }

Where the time computed for each function was the return timestamp minus
zero, and not the actual duration of the function.

Have these tracers also save the calltime in the fgraph data section and
retrieve it on return, so that the correct timings are reported again.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/20250113183124.61767419@gandalf.local.home
Fixes: f1f36e22bee9 ("ftrace: Have calltime be saved in the fgraph storage")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-01-14 09:38:09 -05:00
rv rv: Fix a typo 2024-10-04 19:09:23 -04:00
blktrace.c
bpf_trace.c bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog 2024-12-10 10:16:28 -08:00
bpf_trace.h
error_report-traces.c
fgraph.c fgraph: Add READ_ONCE() when accessing fgraph_array[] 2025-01-02 17:21:18 -05:00
fprobe.c
ftrace_internal.h
ftrace.c ftrace: Fix function profiler's filtering functionality 2025-01-02 17:21:33 -05:00
Kconfig function_graph: Support recording and printing the function return address 2024-10-05 10:14:04 -04:00
kprobe_event_gen_test.c
Makefile
pid_list.c trace/pid_list: Change gfp flags in pid_list_fill_irq() 2024-07-15 15:07:14 -04:00
pid_list.h
power-traces.c
preemptirq_delay_test.c minmax: make generic MIN() and MAX() macros available everywhere 2024-07-28 15:49:18 -07:00
rethook.c
ring_buffer_benchmark.c ring-buffer: Use str_low_high() helper in ring_buffer_producer() 2024-10-19 11:12:25 -04:00
ring_buffer.c ring-buffer: Fix overflow in __rb_map_vma 2024-12-18 14:15:10 -05:00
rpm-traces.c
synth_event_gen_test.c
trace_benchmark.c
trace_benchmark.h
trace_boot.c
trace_branch.c tracing: Remove TRACE_EVENT_FL_FILTERED logic 2024-10-08 15:24:49 -04:00
trace_btf.c
trace_btf.h
trace_clock.c tracing: Use atomic64_inc_return() in trace_clock_counter() 2024-10-09 19:59:49 -04:00
trace_dynevent.c
trace_dynevent.h
trace_entries.h function_graph: Support recording and printing the function return address 2024-10-05 10:14:04 -04:00
trace_eprobe.c tracing/eprobe: Fix to release eprobe when failed to add dyn_event 2024-12-08 23:25:09 +09:00
trace_event_perf.c trace/trace_event_perf: remove duplicate samples on the first tracepoint event 2024-10-09 19:44:54 -04:00
trace_events_filter_test.h
trace_events_filter.c tracing: Replace multiple deprecated strncpy with memcpy 2024-10-30 19:41:08 -04:00
trace_events_hist.c tracing: Remove redundant check on field->field in histograms 2024-11-12 11:36:57 -05:00
trace_events_inject.c tracing: Have format file honor EVENT_FILE_FL_FREED 2024-08-07 18:12:46 -04:00
trace_events_synth.c
trace_events_trigger.c tracing: Have format file honor EVENT_FILE_FL_FREED 2024-08-07 18:12:46 -04:00
trace_events_user.c tracepoints: Use new static branch API 2024-10-08 21:17:39 -04:00
trace_events.c tracing: Have process_string() also allow arrays 2024-12-31 00:10:32 -05:00
trace_export.c
trace_fprobe.c tracing/probes: Fix MAX_TRACE_ARGS limit handling 2024-10-23 17:24:44 +09:00
trace_functions_graph.c tracing updates for v6.13: 2024-11-22 13:27:01 -08:00
trace_functions.c ftrace: Do not find "true_parent" if HAVE_DYNAMIC_FTRACE_WITH_ARGS is not set 2024-12-16 17:22:26 -05:00
trace_hwlat.c tracing: Remove TRACE_EVENT_FL_FILTERED logic 2024-10-08 15:24:49 -04:00
trace_irqsoff.c tracing: Fix irqsoff and wakeup latency tracers when using function graph 2025-01-14 09:38:09 -05:00
trace_kdb.c trace: kdb: Replace simple_strtoul with kstrtoul in kdb_ftdump 2024-11-02 08:33:13 +00:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_kprobe.c tracing/kprobes: Fix to free objects when failed to copy a symbol 2025-01-10 08:57:18 +09:00
trace_mmiotrace.c tracing: Remove TRACE_EVENT_FL_FILTERED logic 2024-10-08 15:24:49 -04:00
trace_nop.c
trace_osnoise.c tracing: Remove TRACE_EVENT_FL_FILTERED logic 2024-10-08 15:24:49 -04:00
trace_output.c tracing: Check "%s" dereference via the field and not the TP_printk format 2024-12-17 11:40:11 -05:00
trace_output.h
trace_preemptirq.c tracing: Fix archs that still call tracepoints without RCU watching 2024-12-05 09:28:58 -05:00
trace_printk.c
trace_probe_kernel.h
trace_probe_tmpl.h
trace_probe.c tracing: Consider the NULL character when validating the event length 2024-10-23 17:24:47 +09:00
trace_probe.h
trace_recursion_record.c
trace_sched_switch.c tracing: Replace strncpy() with strscpy() when copying comm 2024-11-01 14:37:31 -04:00
trace_sched_wakeup.c tracing: Fix irqsoff and wakeup latency tracers when using function graph 2025-01-14 09:38:09 -05:00
trace_selftest_dynamic.c
trace_selftest.c ftrace updates for v6.13: 2024-11-20 11:34:10 -08:00
trace_seq.c
trace_stack.c sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
trace_stat.c
trace_stat.h
trace_synth.h
trace_syscalls.c tracing/perf: Add might_fault check to syscall probes 2024-10-09 17:09:46 -04:00
trace_uprobe.c bpf: Fix theoretical prog_array UAF in __uprobe_perf_func() 2024-12-10 13:06:51 -08:00
trace.c tracing: Prevent bad count for tracing_cpumask_write 2024-12-23 21:59:15 -05:00
trace.h tracing: Check "%s" dereference via the field and not the TP_printk format 2024-12-17 11:40:11 -05:00
tracing_map.c tracing: Fix cmp_entries_dup() to respect sort() comparison rules 2024-12-04 10:38:24 -05:00
tracing_map.h