TWx Linux Repository
Shakeel Butt a00e607102 cgroup: fix race between fork and cgroup.kill
commit b69bb476dee99d564d65d418e9a20acca6f32c3f upstream.

Tejun reported the following race between fork() and cgroup.kill at [1].

Tejun:
  I was looking at cgroup.kill implementation and wondering whether there
  could be a race window. So, __cgroup_kill() does the following:

   k1. Set CGRP_KILL.
   k2. Iterate tasks and deliver SIGKILL.
   k3. Clear CGRP_KILL.

  The copy_process() does the following:

   c1. Copy a bunch of stuff.
   c2. Grab siglock.
   c3. Check fatal_signal_pending().
   c4. Commit to forking.
   c5. Release siglock.
   c6. Call cgroup_post_fork() which puts the task on the css_set and tests
       CGRP_KILL.

  The intention seems to be that either a forking task gets SIGKILL and
  terminates on c3 or it sees CGRP_KILL on c6 and kills the child. However, I
  don't see what guarantees that k3 can't happen before c6. ie. After a
  forking task passes c5, k2 can take place and then before the forking task
  reaches c6, k3 can happen. Then, nobody would send SIGKILL to the child.
  What am I missing?

This is indeed a race. One way to fix this race is by taking
cgroup_threadgroup_rwsem in write mode in __cgroup_kill() as the fork()
side takes cgroup_threadgroup_rwsem in read mode from cgroup_can_fork()
to cgroup_post_fork(). However, that would be heavy-handed, as it adds
one more potential stall scenario for cgroup.kill, which is usually
invoked in extreme situations such as memory pressure.

To fix this race, let's maintain a sequence number per cgroup which gets
incremented on __cgroup_kill() call. On the fork() side, the
cgroup_can_fork() will cache the sequence number locally and recheck it
against the cgroup's sequence number at cgroup_post_fork() site. If the
sequence numbers mismatch, it means __cgroup_kill() has been called and
we should send SIGKILL to the newly created task.
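
The sequence-number scheme described above can be sketched in userspace C
roughly as follows. This is a minimal illustration, not the kernel patch:
every name here (demo_cgroup, demo_fork_ctx, demo_cgroup_kill,
demo_can_fork, demo_post_fork_must_kill) is hypothetical, and a C11 atomic
counter stands in for whatever locking the real code uses to protect the
per-cgroup counter. It also assumes the counter is bumped before existing
tasks are signalled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-cgroup state (not the kernel's struct cgroup). */
struct demo_cgroup {
    atomic_uint_fast64_t kill_seq;  /* incremented once per kill request */
};

/* Hypothetical per-fork state carried from the early hook to the late hook. */
struct demo_fork_ctx {
    uint64_t cached_kill_seq;
};

/* Kill side: bump the sequence number; existing tasks would be signalled next. */
static void demo_cgroup_kill(struct demo_cgroup *cgrp)
{
    assert(cgrp);
    atomic_fetch_add(&cgrp->kill_seq, 1);
    /* ... iterate tasks and deliver SIGKILL (omitted in this sketch) ... */
}

/* Fork side, early hook: snapshot the cgroup's current sequence number. */
static void demo_can_fork(struct demo_cgroup *cgrp, struct demo_fork_ctx *ctx)
{
    assert(cgrp && ctx);
    ctx->cached_kill_seq = atomic_load(&cgrp->kill_seq);
}

/*
 * Fork side, late hook: a mismatch means a kill request arrived somewhere
 * between the snapshot and now, so the new child must be killed as well.
 */
static bool demo_post_fork_must_kill(struct demo_cgroup *cgrp,
                                     const struct demo_fork_ctx *ctx)
{
    assert(cgrp && ctx);
    return atomic_load(&cgrp->kill_seq) != ctx->cached_kill_seq;
}
```

Unlike a flag that is set and later cleared, the counter only moves
forward, so a kill that starts and finishes entirely inside the fork
window still leaves a visible difference for the late hook to detect.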

Reported-by: Tejun Heo <tj@kernel.org>
Closes: https://lore.kernel.org/all/Z5QHE2Qn-QZ6M-KW@slm.duckdns.org/ [1]
Fixes: 661ee6280931 ("cgroup: introduce cgroup.kill")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 13:57:17 +01:00
arch alpha: make stack 16-byte aligned (most cases) 2025-02-21 13:57:17 +01:00
block block: don't revert iter for -EIOCBQUEUED 2025-02-17 09:40:25 +01:00
certs
crypto crypto: ecc - Prevent ecc_digits_from_bytes from reading too many bytes 2025-01-09 13:31:52 +01:00
Documentation dt-bindings: mfd: bd71815: Fix rsense and typos 2025-02-08 09:51:52 +01:00
drivers efi: Avoid cold plugged memory for placing the kernel 2025-02-21 13:57:17 +01:00
fs orangefs: fix a oob in orangefs_debug_write 2025-02-21 13:57:12 +01:00
include cgroup: fix race between fork and cgroup.kill 2025-02-21 13:57:17 +01:00
init Compiler Attributes: disable __counted_by for clang < 19.1.3 2024-12-09 10:32:46 +01:00
io_uring io_uring/rw: commit provided buffer state on async 2025-02-17 09:40:37 +01:00
ipc ipc: fix memleak if msg_init_ns failed in create_ipc_ns 2024-12-09 10:32:54 +01:00
kernel cgroup: fix race between fork and cgroup.kill 2025-02-21 13:57:17 +01:00
lib maple_tree: simplify split calculation 2025-02-17 09:40:39 +01:00
LICENSES
mm mm: kmemleak: fix upper boundary check for physical address objects 2025-02-17 09:40:35 +01:00
net can: j1939: j1939_sk_send_loop(): fix unable to send messages with data length zero 2025-02-21 13:57:16 +01:00
rust rust: init: use explicit ABI to clean warning in future compilers 2025-02-17 09:40:27 +01:00
samples samples/landlock: Fix possible NULL dereference in parse_path() 2025-02-08 09:51:57 +01:00
scripts scripts/gdb: fix aarch64 userspace detection in get_current_task 2025-02-17 09:40:39 +01:00
security tomoyo: don't emit warning in tomoyo_write_control() 2025-02-17 09:40:07 +01:00
sound ASoC: Intel: bytcr_rt5640: Add DMI quirk for Vexia Edu Atla 10 tablet 5V 2025-02-21 13:57:12 +01:00
tools selftests: gpio: gpio-sim: Fix missing chip disablements 2025-02-21 13:57:12 +01:00
usr
virt
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap
.rustfmt.toml
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS
Makefile kbuild: userprogs: fix bitsize and target detection on clang 2025-02-21 13:57:17 +01:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.