twx-linux/kernel
Waiman Long 2125c0034c cgroup/cpuset: Make cpuset hotplug processing synchronous
Since commit 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside
get_online_cpus()"), cpuset hotplug was done asynchronously via a work
function. This is to avoid recursive locking of cgroup_mutex.

Since then, the cgroup locking scheme has changed quite a bit. A
cpuset_mutex was introduced to protect cpuset specific operations.
The cpuset_mutex is then replaced by a cpuset_rwsem. With commit
d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock
order"), cpu_hotplug_lock is acquired before cpuset_rwsem. Later on,
cpuset_rwsem is reverted back to cpuset_mutex. All these locking changes
allow the hotplug code to call into cpuset core directly.

The following commits were also merged due to the asynchronous nature
of cpuset hotplug processing.

  - commit b22afcdf04c9 ("cpu/hotplug: Cure the cpusets trainwreck")
  - commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume
    bugs")
  - commit 28b89b9e6f7b ("cpuset: handle race between CPU hotplug and
    cpuset_hotplug_work")

Clean up all these bandages by making cpuset hotplug
processing synchronous again with the exception that the call to
cgroup_transfer_tasks() to transfer tasks out of an empty cgroup v1
cpuset, if necessary, will still be done via a work function due to the
existing cgroup_mutex -> cpu_hotplug_lock dependency. It is possible
to reverse that dependency, but that will require updating a number of
different cgroup controllers. This special hotplug code path should be
rarely taken anyway.

As all the cpuset states will be updated by the end of the hotplug
operation, we can revert most the above commits except commit
50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs")
which is partially reverted.  Also removing some cpus_read_lock trylock
attempts in the cpuset partition code as they are no longer necessary
since the cpu_hotplug_lock is now held for the whole duration of the
cpuset hotplug code path.

Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-04-08 07:39:16 -10:00
..
bpf bpf: report RCU QS in cpumap kthread 2024-03-20 21:05:43 -07:00
cgroup cgroup/cpuset: Make cpuset hotplug processing synchronous 2024-04-08 07:39:16 -10:00
configs Networking changes for 6.9. 2024-03-12 17:44:08 -07:00
debug kdb: Fix a potential buffer overflow in kdb_local() 2024-01-17 17:19:06 +00:00
dma dma-mapping fixes for Linux 6.9 2024-03-24 10:45:31 -07:00
entry entry: Respect changes to system call number by trace_sys_enter() 2024-03-12 13:23:32 +01:00
events - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames 2024-03-14 17:43:30 -07:00
futex futex: Prevent the reuse of stale pi_state 2024-01-19 12:58:17 +01:00
gcov gcov: annotate struct gcov_iterator with __counted_by 2023-10-18 14:43:22 -07:00
irq Updates for the MSI interrupt subsystem and RISC-V initial MSI support: 2024-03-11 14:03:03 -07:00
kcsan mm: delete checks for xor_unlock_is_negative_byte() 2023-10-18 14:34:17 -07:00
livepatch livepatch: Fix missing newline character in klp_resolve_symbols() 2023-09-20 11:24:18 +02:00
locking locking/rtmutex: Use try_cmpxchg_relaxed() in mark_rt_mutex_waiters() 2024-03-01 13:02:05 +01:00
module Kbuild updates for v6.9 2024-03-21 14:41:00 -07:00
power cgroup/cpuset: Make cpuset hotplug processing synchronous 2024-04-08 07:39:16 -10:00
printk TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
rcu Merge branches 'rcu-doc.2024.02.14a', 'rcu-nocb.2024.02.14a', 'rcu-exp.2024.02.14a', 'rcu-tasks.2024.02.26a' and 'rcu-misc.2024.02.14a' into rcu.2024.02.26a 2024-02-26 17:37:25 -08:00
sched RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
time Two regression fixes for the timer and timer migration code: 2024-03-23 14:49:25 -07:00
trace tracing: Use div64_u64() instead of do_div() 2024-03-18 10:33:06 -04:00
.gitignore
acct.c fs: rename __mnt_{want,drop}_write*() helpers 2023-09-11 15:05:50 +02:00
async.c async: Use a dedicated unbound workqueue with raised min_active 2024-02-09 11:13:59 -10:00
audit_fsnotify.c
audit_tree.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
audit_watch.c audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare() 2023-11-14 17:34:27 -05:00
audit.c audit: use KMEM_CACHE() instead of kmem_cache_create() 2024-01-25 10:12:22 -05:00
audit.h audit: correct audit_filter_inodes() definition 2023-07-21 12:17:25 -04:00
auditfilter.c audit: remove unnecessary assignment in audit_dupe_lsm_field() 2024-01-25 09:59:27 -05:00
auditsc.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-13 18:34:46 +02:00
backtracetest.c backtracetest: Convert from tasklet to BH workqueue 2024-02-05 13:22:34 -10:00
bounds.c bounds: support non-power-of-two CONFIG_NR_CPUS 2024-02-22 15:38:51 -08:00
capability.c lsm: constify the 'target' parameter in security_capget() 2023-08-08 16:48:47 -04:00
cfi.c
compat.c
configs.c
context_tracking.c context_tracking: Fix kerneldoc headers for __ct_user_{enter,exit}() 2024-02-14 07:53:50 -08:00
cpu_pm.c
cpu.c cgroup/cpuset: Make cpuset hotplug processing synchronous 2024-04-08 07:39:16 -10:00
crash_core.c crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
crash_reserve.c kexec: split crashkernel reservation code out from crash_core.c 2024-02-23 17:48:21 -08:00
cred.c cred: Use KMEM_CACHE() instead of kmem_cache_create() 2024-02-23 17:33:31 -05:00
delayacct.c
dma.c
elfcorehdr.c crash: remove dependency of FA_DUMP on CRASH_DUMP 2024-02-23 17:48:22 -08:00
exec_domain.c
exit.c vfs-6.9.pidfd 2024-03-11 10:21:06 -07:00
exit.h exit: add internal include file with helpers 2023-09-21 12:03:50 -06:00
extable.c
fail_function.c
fork.c RCU pull request for v6.9 2024-03-11 12:02:50 -07:00
freezer.c Linux 6.7-rc6 2023-12-23 15:52:13 +01:00
gen_kheaders.sh
groups.c groups: Convert group_info.usage to refcount_t 2023-09-29 11:28:39 -07:00
hung_task.c kernel/hung_task.c: export sysctl_hung_task_timeout_secs 2024-03-13 21:22:04 -04:00
iomem.c kernel/iomem.c: remove __weak ioremap_cache helper 2023-08-21 13:37:28 -07:00
irq_work.c
jump_label.c
kallsyms_internal.h
kallsyms_selftest.c mm/vmalloc: remove vmap_area_list 2024-02-23 17:48:19 -08:00
kallsyms_selftest.h
kallsyms.c kallsyms: Change func signature for cleanup_symbol_name() 2023-08-25 15:00:36 -07:00
kcmp.c file: convert to SLAB_TYPESAFE_BY_RCU 2023-10-19 11:02:48 +02:00
Kconfig.freezer
Kconfig.hz
Kconfig.kexec crash: clean up kdump related config items 2024-02-23 17:48:22 -08:00
Kconfig.locks
Kconfig.preempt
kcov.c
kexec_core.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
kexec_elf.c
kexec_file.c arm64, crash: wrap crash dumping code into crash related ifdefs 2024-02-23 17:48:23 -08:00
kexec_internal.h crash: remove dependency of FA_DUMP on CRASH_DUMP 2024-02-23 17:48:22 -08:00
kexec.c crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
kheaders.c
kprobes.c kprobes: Remove unnecessary initial values of variables 2024-02-08 23:29:29 +09:00
ksyms_common.c
ksysfs.c Driver core changes for 6.9-rc1 2024-03-21 13:34:15 -07:00
kthread.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
latencytop.c
Makefile crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
module_signature.c
notifier.c
nsproxy.c pidfd: add pidfs 2024-03-01 12:23:37 +01:00
numa.c kernel/numa.c: Move logging out of numa.h 2023-12-20 19:26:30 -05:00
padata.c Author: Gang Li padata: dispatch works on 2024-03-06 13:04:17 -08:00
panic.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
params.c params: Fix multi-line comment style 2023-12-01 09:51:44 -08:00
pid_namespace.c wait: Remove uapi header file from main header file 2023-12-20 19:26:31 -05:00
pid_sysctl.h memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy 2023-08-21 13:37:59 -07:00
pid.c pidfs: remove config option 2024-03-13 12:53:53 -07:00
profile.c
ptrace.c ptrace_attach: shift send(SIGSTOP) into ptrace_set_stopped() 2024-02-22 15:38:52 -08:00
range.c
reboot.c Thermal control updates for 6.8-rc1 2024-01-09 16:20:17 -08:00
regset.c
relay.c kernel: relay: remove relay_file_splice_read dead code, doesn't work 2023-12-29 12:22:27 -08:00
resource_kunit.c
resource.c Quite a lot of kexec work this time around. Many singleton patches in 2024-01-09 11:46:20 -08:00
rseq.c
scftorture.c scftorture: Pause testing after memory-allocation failure 2023-07-14 15:02:57 -07:00
scs.c
seccomp.c file: remove __receive_fd() 2023-12-12 14:24:14 +01:00
signal.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
smp.c CSD lock commits for v6.7 2023-10-30 17:56:53 -10:00
smpboot.c kthread: add kthread_stop_put 2023-10-04 10:41:57 -07:00
smpboot.h
softirq.c workqueue: Drain BH work items on hot-unplugged CPUs 2024-02-29 11:51:24 -10:00
stackleak.c
stacktrace.c stacktrace: fix kernel-doc typo 2023-12-29 12:22:29 -08:00
static_call_inline.c
static_call.c
stop_machine.c
sys_ni.c lsm/stable-6.8 PR 20240105 2024-01-09 12:57:46 -08:00
sys.c getrusage: use sig->stats_lock rather than lock_task_sighand() 2024-02-07 21:20:32 -08:00
sysctl-test.c
sysctl.c tracing: Support to dump instance traces by ftrace_dump_on_oops 2024-03-18 10:33:06 -04:00
task_work.c task_work: add kerneldoc annotation for 'data' argument 2023-09-19 13:21:32 -07:00
taskstats.c taskstats: fill_stats_for_tgid: use for_each_thread() 2023-10-04 10:41:57 -07:00
torture.c torture: Print out torture module parameters 2023-09-24 17:24:01 +02:00
tracepoint.c
tsacct.c
ucount.c sysctl: Add size to register_sysctl 2023-08-15 15:26:17 -07:00
uid16.c
uid16.h
umh.c sysctl: fix unused proc_cap_handler() function warning 2023-06-29 15:19:43 -07:00
up.c smp: Change function signatures to use call_single_data_t 2023-09-13 14:59:24 +02:00
user_namespace.c user_namespace: remove unnecessary NULL values from kbuf 2024-02-22 15:38:52 -08:00
user-return-notifier.c
user.c binfmt_misc: enable sandboxed mounts 2023-10-11 08:46:01 -07:00
usermode_driver.c
utsname_sysctl.c
utsname.c
vhost_task.c
vmcore_info.c crash_core: export vmemmap when CONFIG_SPARSEMEM_VMEMMAP is enabled 2024-03-04 17:01:27 -08:00
watch_queue.c watch_queue: fix kcalloc() arguments order 2023-12-21 13:17:54 +01:00
watchdog_buddy.c
watchdog_perf.c
watchdog.c watchdog/core: remove sysctl handlers from public header 2024-03-12 13:09:23 -07:00
workqueue_internal.h workqueue: Drop the special locking rule for worker->flags and worker_pool->flags 2023-08-07 15:57:22 -10:00
workqueue.c Driver core changes for 6.9-rc1 2024-03-21 13:34:15 -07:00