twx-linux/kernel
Frederic Weisbecker 56799bc035 perf: Fix hang while freeing sigtrap event
Perf can hang while freeing a sigtrap event if a related deferred
signal has not been sent by the time the file is closed:

perf_event_overflow()
   task_work_add(perf_pending_task)

fput()
   task_work_add(____fput())

task_work_run()
    ____fput()
        perf_release()
            perf_event_release_kernel()
                _free_event()
                    perf_pending_task_sync()
                        task_work_cancel() -> FAILED
                        rcuwait_wait_event()

Once task_work_run() is running, the list of pending callbacks is
removed from the task_struct. From that point on, task_work_cancel()
can no longer remove work items that are pending but not yet started,
hence the task_work_cancel() failure and the hang on
rcuwait_wait_event().
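
The failure mode can be reproduced in a small userspace model. This is
only a sketch: struct work, struct task and the work_*() helpers are
hypothetical stand-ins for the kernel's callback_head, task_struct and
task_work_*() functions, and the model records the failed cancel instead
of waiting (which would hang).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct callback_head and task->task_works. */
struct work {
	struct work *next;
	void (*func)(struct work *w);
};

struct task {
	struct work *pending;
};

static struct task self;
static struct work signal_work, fput_work;
static bool cancel_ok;

/* Push a work item onto the task's pending list. */
static void work_add(struct task *t, struct work *w)
{
	assert(w->func != NULL);
	w->next = t->pending;
	t->pending = w;
}

/* Unlink w, but only if it is still on the list attached to the task. */
static bool work_cancel(struct task *t, struct work *w)
{
	for (struct work **pp = &t->pending; *pp; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;
			return true;
		}
	}
	return false;
}

/* Detach the whole list first, then run each callback. */
static void work_run(struct task *t)
{
	struct work *list = t->pending;

	t->pending = NULL;
	while (list) {
		struct work *next = list->next;

		list->func(list);
		list = next;
	}
}

/* Models perf_pending_task(): nothing to do in this sketch. */
static void signal_func(struct work *w)
{
	(void)w;
}

/* Models ____fput(): tries to cancel the signal work while running. */
static void fput_func(struct work *w)
{
	(void)w;
	cancel_ok = work_cancel(&self, &signal_work);
}

static void setup(void)
{
	self.pending = NULL;
	signal_work.func = signal_func;
	fput_work.func = fput_func;
	work_add(&self, &signal_work);
	work_add(&self, &fput_work);
}

/* Cancel before the list is detached: the item is found and unlinked. */
bool cancel_before_run(void)
{
	setup();
	return work_cancel(&self, &signal_work);
}

/* Cancel from inside a running callback: the item is out of reach. */
bool cancel_from_running_work(void)
{
	setup();
	cancel_ok = true;
	work_run(&self);
	return cancel_ok;
}
```

In this model cancel_before_run() succeeds, while
cancel_from_running_work() fails because the signal work sits on the
detached list that work_cancel() never looks at.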

Task work could be changed to remove one work item at a time, so that a
work item running on the current task can always cancel a pending one.
However, the wait / wake design is still subject to inverted
dependencies when remote targets are involved, as pictured by Oleg:

T1                                                      T2

fd = perf_event_open(pid => T2->pid);                  fd = perf_event_open(pid => T1->pid);
close(fd)                                              close(fd)
    <IRQ>                                                  <IRQ>
    perf_event_overflow()                                  perf_event_overflow()
       task_work_add(perf_pending_task)                        task_work_add(perf_pending_task)
    </IRQ>                                                 </IRQ>
    fput()                                                 fput()
        task_work_add(____fput())                              task_work_add(____fput())

    task_work_run()                                        task_work_run()
        ____fput()                                             ____fput()
            perf_release()                                         perf_release()
                perf_event_release_kernel()                            perf_event_release_kernel()
                    _free_event()                                          _free_event()
                        perf_pending_task_sync()                               perf_pending_task_sync()
                            rcuwait_wait_event()                                   rcuwait_wait_event()

Therefore the only option left is to acquire the event reference count
upon queueing the perf task work and release it from the task work, just
like it was done before 3a5465418f5f ("perf: Fix event leak upon exec and file release")
but without the leaks it fixed.
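
The reference counting scheme can be sketched as follows. This is a
single-threaded illustration rather than the actual patch: struct event
and every helper below are hypothetical, a plain int replaces the atomic
reference count, and a "freed" flag stands in for _free_event().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for struct perf_event. */
struct event {
	int refcount;
	bool pending_work;
	bool freed;
};

/* A new event starts with the reference held by its file. */
void event_init(struct event *e)
{
	e->refcount = 1;
	e->pending_work = false;
	e->freed = false;
}

/* Drop one reference; the last put frees the event. */
void put_event(struct event *e)
{
	assert(e->refcount > 0);
	if (--e->refcount == 0)
		e->freed = true;	/* stands in for _free_event() */
}

/* Overflow path: pin the event for as long as the work is queued. */
void queue_pending_work(struct event *e)
{
	if (e->pending_work)
		return;
	e->pending_work = true;
	e->refcount++;
}

/* Task work callback: the signal would be sent here, then unpin. */
void pending_task(struct event *e)
{
	e->pending_work = false;
	put_event(e);
}

/* File release: drop the file's reference without waiting for anything. */
void release(struct event *e)
{
	put_event(e);
}
```

Whichever of release() and pending_task() runs last frees the event, so
neither side ever has to cancel or wait for the other.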

Some adjustments are necessary to make it work:

* A child event might dereference its parent upon freeing. Care must be
  taken to release the parent last.

* Some places assuming the event doesn't have any reference held and
  therefore can be freed right away must instead put the reference and
  let the reference counting do its job.
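
The ordering constraint on parent and child events from the first
adjustment can be illustrated with a minimal sketch. struct ev,
ev_free() and ev_put() are hypothetical names, and the children_freed
counter is an assumed example of parent state that a child touches while
it is being freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event with an optional parent that it holds a reference on. */
struct ev {
	struct ev *parent;
	int refcount;
	int children_freed;	/* updated by a child while it is being freed */
	bool freed;
};

/* Freeing a child may still dereference its parent. */
static void ev_free(struct ev *e)
{
	if (e->parent) {
		assert(!e->parent->freed);
		e->parent->children_freed++;
	}
	e->freed = true;
}

/* Drop one reference; on the last one, free the event, then its parent ref. */
void ev_put(struct ev *e)
{
	struct ev *parent;

	assert(e->refcount > 0);
	if (--e->refcount > 0)
		return;

	parent = e->parent;	/* save before the child goes away */
	ev_free(e);
	if (parent)
		ev_put(parent);	/* drop the child's reference last */
}
```

Because the parent reference is dropped only after ev_free() returns,
the parent is guaranteed to outlive every access made while the child is
torn down.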

Reported-by: "Yi Lai" <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/all/Zx9Losv4YcJowaP%2F@ly-workstation/
Reported-by: syzbot+3c4321e10eea460eb606@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/673adf75.050a0220.87769.0024.GAE@google.com/
Fixes: 3a5465418f5f ("perf: Fix event leak upon exec and file release")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250304135446.18905-1-frederic@kernel.org
2025-04-08 20:55:43 +02:00
bpf bpf_try_alloc_pages 2025-03-30 13:45:28 -07:00
cgroup treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
configs - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
debug TTY/Serial driver updates for 6.15-rc1 2025-04-02 18:17:33 -07:00
dma dma-mapping: fix missing clear bdr in check_ram_in_range_map() 2025-03-12 13:41:44 +01:00
entry Objtool changes for v6.15: 2025-03-24 21:18:05 -07:00
events perf: Fix hang while freeing sigtrap event 2025-04-08 20:55:43 +02:00
futex futex: Use a hashmask instead of hashsize 2025-02-26 16:07:59 +01:00
gcov gcov: clang: use correct function param names 2025-01-24 22:47:27 -08:00
irq genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move() 2025-04-04 17:08:36 +02:00
kcsan treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
livepatch Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
locking - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
module ring-buffer updates for v6.15 2025-03-31 13:37:22 -07:00
power This update includes the following changes: 2025-03-29 10:01:55 -07:00
printk printk changes for 6.15 2025-03-27 19:22:24 -07:00
rcu treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
sched Miscellaneous scheduler fixes/updates: 2025-04-06 10:44:58 -07:00
time A set of final cleanups for the timer subsystem: 2025-04-06 08:35:37 -07:00
trace Persistent buffer cleanups and simplifications for v6.15: 2025-04-03 16:09:29 -07:00
.gitignore
acct.c acct: block access to kernel internal filesystems 2025-02-12 12:24:16 +01:00
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry 2025-02-19 14:08:41 +01:00
audit.c audit: Initialize lsmctx to avoid memory allocation error 2025-01-29 20:02:04 -05:00
audit.h audit: change context data from secid to lsm_prop 2024-10-11 14:34:16 -04:00
auditfilter.c audit: fix suffixed '/' filename matching 2024-12-05 19:22:38 -05:00
auditsc.c fs: dedup handling of struct filename init and refcounts bumps 2025-03-18 15:34:27 +01:00
backtracetest.c backtracetest: add MODULE_DESCRIPTION() 2024-06-24 22:24:55 -07:00
bounds.c
capability.c capability: Remove unused has_capability 2025-03-07 22:03:09 -06:00
cfi.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
compat.c
configs.c
context_tracking.c context_tracking: Make RCU watch ct_kernel_exit_state() warning 2025-03-04 18:44:29 -08:00
cpu_pm.c
cpu.c hyperv-next for 6.15 2025-03-25 14:47:04 -07:00
crash_core.c crash: Use note name macros 2025-02-10 16:56:58 -08:00
crash_reserve.c crash: remove an unused argument from reserve_crashkernel_generic() 2025-03-16 22:30:47 -07:00
cred.c cred: remove old {override,revert}_creds() helpers 2024-12-02 11:25:09 +01:00
delayacct.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
dma.c
elfcorehdr.c
exec_domain.c
exit.c exit: fix the usage of delay_group_leader->exit_code in do_notify_parent() and pidfs_exit() 2025-03-25 15:56:22 +01:00
exit.h
extable.c
fail_function.c
fork.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
freezer.c sched/fair: Fix external p->on_rq users 2024-10-14 09:14:35 +02:00
gen_kheaders.sh Revert "kheaders: Ignore silly-rename files" 2025-03-15 21:22:52 +09:00
groups.c
hung_task.c hung_task: show the blocker task if the task is hung on mutex 2025-03-21 22:10:04 -07:00
iomem.c mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb() 2025-02-21 15:05:38 +01:00
irq_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
jump_label.c jump_label: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
kallsyms_internal.h kallsyms: get rid of code for absolute kallsyms 2024-07-20 16:33:21 +09:00
kallsyms_selftest.c kallsyms: Use kthread_run_on_cpu() 2025-01-02 22:12:12 +01:00
kallsyms_selftest.h
kallsyms.c kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU 2025-02-18 10:16:04 +01:00
kcmp.c kcmp: improve performance adding an unlikely hint to task comparisons 2025-02-21 10:25:33 +01:00
Kconfig.freezer
Kconfig.hz kernel: Fix "select" wording on HZ_250 description 2025-02-21 09:20:30 +01:00
Kconfig.kexec crash, powerpc: default to CRASH_DUMP=n on PPC_BOOK3S_32 2024-11-14 22:43:48 -08:00
Kconfig.locks
Kconfig.preempt sched: No PREEMPT_RT=y for all{yes,mod}config 2024-11-07 15:25:05 +01:00
kcov.c kcov: mark in_softirq_really() as __always_inline 2024-12-30 17:59:08 -08:00
kexec_core.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-03-16 22:30:47 -07:00
kexec_file.c crash: let arch decide usable memory range in reserved area 2025-03-16 22:30:47 -07:00
kexec_internal.h kexec: use atomic_try_cmpxchg_acquire() in kexec_trylock() 2024-09-01 20:43:23 -07:00
kexec.c
kheaders.c kheaders: Simplify attribute through __BIN_ATTR_SIMPLE_RO() 2024-12-24 09:46:49 +01:00
kprobes.c kprobes: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
ksyms_common.c
ksysfs.c kernel/ksysfs.c: simplify bin_attribute definition 2025-01-07 16:59:15 +01:00
kthread.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
latencytop.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
Makefile tracing: Disable branch profiling in noinstr code 2025-03-22 09:49:26 +01:00
module_signature.c
notifier.c reboot: move reboot_notifier_list to kernel/reboot.c 2024-11-05 17:12:31 -08:00
nsproxy.c fdget(), trivial conversions 2024-11-03 01:28:06 -05:00
padata.c padata: switch padata_find_next() to using cpumask_next_wrap() 2025-02-24 16:37:23 -05:00
panic.c These are objtool fixes and updates by Josh Poimboeuf, centered 2025-04-02 10:30:10 -07:00
params.c params: Annotate struct module_param_attrs with __counted_by() 2025-03-10 11:54:46 +01:00
pid_namespace.c pid: Do not set pid_max in new pid namespaces 2025-03-06 10:18:36 +01:00
pid_sysctl.h treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
pid.c kernel-6.15-rc1.tasklist_lock 2025-03-24 13:39:27 -07:00
profile.c profiling: remove profile=sleep support 2024-08-04 13:36:28 -07:00
ptrace.c
range.c
reboot.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
regset.c
relay.c relay: use kasprintf() instead of fixed buffer formatting 2025-03-21 22:10:05 -07:00
resource_kunit.c resource, kunit: fix user-after-free in resource_test_region_intersects() 2024-10-09 12:47:19 -07:00
resource.c resource: replace open coded variant of DEFINE_RES() 2025-03-21 22:10:05 -07:00
rseq.c rseq: Fix segfault on registration when rseq_cs is non-zero 2025-03-06 22:26:49 +01:00
scftorture.c scftorture: Handle NULL argument passed to scf_add_to_free_list(). 2024-11-14 16:09:51 -08:00
scs.c
seccomp.c seccomp: avoid the lock trip seccomp_filter_release in common case 2025-02-24 11:17:10 -08:00
signal.c vfs-6.15-rc1.fixes 2025-04-02 16:05:21 -07:00
smp.c CSD-lock pull request for v6.14 2025-01-28 11:34:03 -08:00
smpboot.c
smpboot.h
softirq.c lockdep: Fix wait context check on softirq for PREEMPT_RT 2025-03-25 10:46:44 +01:00
stackleak.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
stacktrace.c
static_call_inline.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
static_call.c
stop_machine.c stop-machine: Add comment for rcu_momentary_eqs() 2025-03-11 10:15:52 -07:00
sys_ni.c Probes updates for v6.11: 2024-07-18 12:19:20 -07:00
sys.c Updates for the core time/timer subsystem: 2025-03-25 10:33:23 -07:00
sysctl-test.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
sysctl.c s390 updates for 6.15 merge window 2025-03-29 11:59:43 -07:00
task_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
taskstats.c fdget(), more trivial conversions 2024-11-03 01:28:06 -05:00
torture.c torture: Add get_torture_init_jiffies() for test-start time 2025-02-05 07:14:24 -08:00
tracepoint.c tracepoint: Print the function symbol when tracepoint_debug is set 2025-03-21 15:30:10 -04:00
tsacct.c tsacct: replace strncpy() with strscpy() 2024-07-12 16:39:53 -07:00
ucount.c ucount: use rcuref_t for reference counting 2025-03-16 22:30:50 -07:00
uid16.c
uid16.h
umh.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
up.c
user_namespace.c uidgid: add map_id_range_up() 2025-02-12 12:12:27 +01:00
user-return-notifier.c
user.c uidgid: make sure we fit into one cacheline 2024-09-12 12:16:09 +02:00
usermode_driver.c
utsname_sysctl.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
utsname.c
vhost_task.c vhost: return task creation error instead of NULL 2025-03-01 02:52:52 -05:00
vmcore_info.c mm: support only one page_type per page 2024-09-03 21:15:43 -07:00
watch_queue.c vfs-6.15-rc1.pipe 2025-03-24 09:52:37 -07:00
watchdog_buddy.c
watchdog_perf.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
watchdog.c A treewide hrtimer timer cleanup 2025-03-25 10:54:15 -07:00
workqueue_internal.h
workqueue.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00