Commit Graph

2717 Commits

Author SHA1 Message Date
Ankur Arora 83b28cfe79 rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
states for read-side critical sections via rcu_all_qs().
One reason why this was needed: lacking preempt-count, the tick
handler has no way of knowing whether it is executing in a
read-side critical section or not.

With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), we get (PREEMPT_COUNT=y,
PREEMPT_RCU=n). In this configuration cond_resched() is a stub and
does not provide quiescent states via rcu_all_qs().
(PREEMPT_RCU=y provides this information via rcu_read_unlock() and
its nesting counter.)

So, use the availability of preempt_count() to report quiescent states
in rcu_flavor_sched_clock_irq().

Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05 07:01:55 -08:00
Ankur Arora fcf0e25ad4 rcu: handle unstable rdp in rcu_read_unlock_strict()
rcu_read_unlock_strict() can be called with preemption enabled
which can make for an unstable rdp and a racy norm value.

Fix this by dropping the preempt-count in __rcu_read_unlock()
after the call to rcu_read_unlock_strict(), adjusting the
preempt-count check appropriately.

Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05 07:01:55 -08:00
Ankur Arora 4dca1af414 rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
Replace mentions of PREEMPT_AUTO with PREEMPT_LAZY.

Also, since PREMPT_LAZY implies PREEMPTION, we can reduce the
TASKS_RCU selection criteria from this:

  NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)

to this:

  NEED_TASKS_RCU && PREEMPTION

CC: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05 07:01:55 -08:00
Vlastimil Babka 49d5377b38 rcu, slab: use a regular callback function for kvfree_rcu
RCU has been special-casing callback function pointers that are integers
lower than 4096 as offsets of rcu_head for kvfree() instead. The tree
RCU implementation no longer does that as the batched kvfree_rcu() is
not a simple call_rcu(). The tiny RCU still does, and the plan is also
to make tree RCU use call_rcu() for SLUB_TINY configurations.

Instead of teaching tree RCU again to special case the offsets, let's
remove the special casing completely. Since there's no SLOB anymore, it
is possible to create a callback function that can take a pointer to a
middle of slab object with unknown offset and determine the object's
pointer before freeing it, so implement that as kvfree_rcu_cb().

Large kmalloc and vmalloc allocations are handled simply by aligning
down to page size. For that we retain the requirement that the offset is
smaller than 4096. But we can remove __is_kvfree_rcu_offset() completely
and instead just opencode the condition in the BUILD_BUG_ON() check.

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-02-05 10:45:29 +01:00
Vlastimil Babka 7f4b19ef31 rcu: remove trace_rcu_kvfree_callback
Tree RCU does not handle kvfree_rcu() by queueing individual objects by
call_rcu() anymore, thus the tracepoint and associated
__is_kvfree_rcu_offset() check is dead code now. Remove it.

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-02-05 10:45:20 +01:00
Vlastimil Babka b14ff274e8 slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
Following the move of TREE_RCU implementation, let's move also the
TINY_RCU one for consistency and subsequent refactoring.

For simplicity, remove the separate inline __kvfree_call_rcu() as
TINY_RCU is not meant for high-performance hardware anyway.

Declare kvfree_call_rcu() in rcupdate.h to avoid header dependency
issues.

Also move the kvfree_rcu_barrier() declaration to slab.h

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-02-05 10:45:12 +01:00
Zilin Guan 764f6a8110 rcu: Remove READ_ONCE() for rdp->gpwrap access in __note_gp_changes()
There is one access to the per-CPU rdp->gpwrap field in the
__note_gp_changes() function that does not use READ_ONCE(), but all other
accesses do use READ_ONCE().  When using the 8*TREE03 and CONFIG_NR_CPUS=8
configuration, KCSAN found no data races at that point.  This is because
all calls to __note_gp_changes() hold rnp->lock, which excludes writes
to the rdp->gpwrap fields for all CPUs associated with that same leaf
rcu_node structure.

This commit therefore removes READ_ONCE() from rdp->gpwrap accesses
within the __note_gp_changes() function.

Signed-off-by: Zilin Guan <zilinguan811@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-04 21:52:38 -08:00
Paul E. McKenney a3e8162105 rcu: Split rcu_report_exp_cpu_mult() mask parameter and use for tracing
This commit renames the rcu_report_exp_cpu_mult() function from "mask"
to "mask_in" and introduced a "mask" local variable to better support
upcoming event-tracing additions.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-04 21:52:38 -08:00
Paul E. McKenney 81a208c56e rcu: Clarify RCU_LAZY and RCU_LAZY_DEFAULT_OFF help text
This commit wordsmiths the RCU_LAZY and RCU_LAZY_DEFAULT_OFF Kconfig
options' help text.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-04 21:50:06 -08:00
Paul E. McKenney 053ca72554 rcu: Add CONFIG_RCU_LAZY delays to call_rcu() kernel-doc header
This commit adds a description of the energy-efficiency delays that
call_rcu() can impose, along with a pointer to call_rcu_hurry() for
latency-sensitive kernel code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-04 21:50:06 -08:00
Paul E. McKenney 366ba3f7f9 srcu: Point call_srcu() to call_rcu() for detailed memory ordering
This commit causes the call_srcu() kernel-doc header to reference that
of call_rcu() for detailed memory-ordering guarantees.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-04 21:50:06 -08:00
Paul E. McKenney 21ef249862 rcu: Document self-propagating callbacks
This commit documents the fact that a given RCU callback function can
repost itself.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-04 21:50:06 -08:00
Linus Torvalds 9c5968db9e Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
 "The various patchsets are summarized below. Plus of course many
  indivudual patches which are described in their changelogs.

   - "Allocate and free frozen pages" from Matthew Wilcox reorganizes
     the page allocator so we end up with the ability to allocate and
     free zero-refcount pages. So that callers (ie, slab) can avoid a
     refcount inc & dec

   - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
     use large folios other than PMD-sized ones

   - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
     and fixes for this small built-in kernel selftest

   - "mas_anode_descend() related cleanup" from Wei Yang tidies up part
     of the mapletree code

   - "mm: fix format issues and param types" from Keren Sun implements a
     few minor code cleanups

   - "simplify split calculation" from Wei Yang provides a few fixes and
     a test for the mapletree code

   - "mm/vma: make more mmap logic userland testable" from Lorenzo
     Stoakes continues the work of moving vma-related code into the
     (relatively) new mm/vma.c

   - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
     Hildenbrand cleans up and rationalizes handling of gfp flags in the
     page allocator

   - "readahead: Reintroduce fix for improper RA window sizing" from Jan
     Kara is a second attempt at fixing a readahead window sizing issue.
     It should reduce the amount of unnecessary reading

   - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
     addresses an issue where "huge" amounts of pte pagetables are
     accumulated:

       https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/

     Qi's series addresses this windup by synchronously freeing PTE
     memory within the context of madvise(MADV_DONTNEED)

   - "selftest/mm: Remove warnings found by adding compiler flags" from
     Muhammad Usama Anjum fixes some build warnings in the selftests
     code when optional compiler warnings are enabled

   - "mm: don't use __GFP_HARDWALL when migrating remote pages" from
     David Hildenbrand tightens the allocator's observance of
     __GFP_HARDWALL

   - "pkeys kselftests improvements" from Kevin Brodsky implements
     various fixes and cleanups in the MM selftests code, mainly
     pertaining to the pkeys tests

   - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
     estimate application working set size

   - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
     provides some cleanups to memcg's hugetlb charging logic

   - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
     removes the global swap cgroup lock. A speedup of 10% for a
     tmpfs-based kernel build was demonstrated

   - "zram: split page type read/write handling" from Sergey Senozhatsky
     has several fixes and cleaups for zram in the area of
     zram_write_page(). A watchdog softlockup warning was eliminated

   - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
     Brodsky cleans up the pagetable destructor implementations. A rare
     use-after-free race is fixed

   - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
     simplifies and cleans up the debugging code in the VMA merging
     logic

   - "Account page tables at all levels" from Kevin Brodsky cleans up
     and regularizes the pagetable ctor/dtor handling. This results in
     improvements in accounting accuracy

   - "mm/damon: replace most damon_callback usages in sysfs with new
     core functions" from SeongJae Park cleans up and generalizes
     DAMON's sysfs file interface logic

   - "mm/damon: enable page level properties based monitoring" from
     SeongJae Park increases the amount of information which is
     presented in response to DAMOS actions

   - "mm/damon: remove DAMON debugfs interface" from SeongJae Park
     removes DAMON's long-deprecated debugfs interfaces. Thus the
     migration to sysfs is completed

   - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
     Peter Xu cleans up and generalizes the hugetlb reservation
     accounting

   - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
     removes a never-used feature of the alloc_pages_bulk() interface

   - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
     extends DAMOS filters to support not only exclusion (rejecting),
     but also inclusion (allowing) behavior

   - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
     introduces a new memory descriptor for zswap.zpool that currently
     overlaps with struct page for now. This is part of the effort to
     reduce the size of struct page and to enable dynamic allocation of
     memory descriptors

   - "mm, swap: rework of swap allocator locks" from Kairui Song redoes
     and simplifies the swap allocator locking. A speedup of 400% was
     demonstrated for one workload. As was a 35% reduction for kernel
     build time with swap-on-zram

   - "mm: update mips to use do_mmap(), make mmap_region() internal"
     from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
     mmap_region() can be made MM-internal

   - "mm/mglru: performance optimizations" from Yu Zhao fixes a few
     MGLRU regressions and otherwise improves MGLRU performance

   - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
     Park updates DAMON documentation

   - "Cleanup for memfd_create()" from Isaac Manjarres does that thing

   - "mm: hugetlb+THP folio and migration cleanups" from David
     Hildenbrand provides various cleanups in the areas of hugetlb
     folios, THP folios and migration

   - "Uncached buffered IO" from Jens Axboe implements the new
     RWF_DONTCACHE flag which provides synchronous dropbehind for
     pagecache reading and writing. To permite userspace to address
     issues with massive buildup of useless pagecache when
     reading/writing fast devices

   - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
     Weißschuh fixes and optimizes some of the MM selftests"

* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm/compaction: fix UBSAN shift-out-of-bounds warning
  s390/mm: add missing ctor/dtor on page table upgrade
  kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
  tools: add VM_WARN_ON_VMG definition
  mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
  seqlock: add missing parameter documentation for raw_seqcount_try_begin()
  mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
  mm/page_alloc: remove the incorrect and misleading comment
  zram: remove zcomp_stream_put() from write_incompressible_page()
  mm: separate move/undo parts from migrate_pages_batch()
  mm/kfence: use str_write_read() helper in get_access_type()
  selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
  kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
  selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
  selftests/mm: vm_util: split up /proc/self/smaps parsing
  selftests/mm: virtual_address_range: unmap chunks after validation
  selftests/mm: virtual_address_range: mmap() without PROT_WRITE
  selftests/memfd/memfd_test: fix possible NULL pointer dereference
  mm: add FGP_DONTCACHE folio creation flag
  mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
  ...
2025-01-26 18:36:23 -08:00
Linus Torvalds 1d6d399223 Merge tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks
Pull kthread updates from Frederic Weisbecker:
 "Kthreads affinity follow either of 4 existing different patterns:

   1) Per-CPU kthreads must stay affine to a single CPU and never
      execute relevant code on any other CPU. This is currently handled
      by smpboot code which takes care of CPU-hotplug operations.
      Affinity here is a correctness constraint.

   2) Some kthreads _have_ to be affine to a specific set of CPUs and
      can't run anywhere else. The affinity is set through
      kthread_bind_mask() and the subsystem takes care by itself to
      handle CPU-hotplug operations. Affinity here is assumed to be a
      correctness constraint.

   3) Per-node kthreads _prefer_ to be affine to a specific NUMA node.
      This is not a correctness constraint but merely a preference in
      terms of memory locality. kswapd and kcompactd both fall into this
      category. The affinity is set manually like for any other task and
      CPU-hotplug is supposed to be handled by the relevant subsystem so
      that the task is properly reaffined whenever a given CPU from the
      node comes up. Also care should be taken so that the node affinity
      doesn't cross isolated (nohz_full) cpumask boundaries.

   4) Similar to the previous point except kthreads have a _preferred_
      affinity different than a node. Both RCU boost kthreads and RCU
      exp kworkers fall into this category as they refer to "RCU nodes"
      from a distinctly distributed tree.

  Currently the preferred affinity patterns (3 and 4) have at least 4
  identified users, with more or less success when it comes to handle
  CPU-hotplug operations and CPU isolation. Each of which do it in its
  own ad-hoc way.

  This is an infrastructure proposal to handle this with the following
  API changes:

   - kthread_create_on_node() automatically affines the created kthread
     to its target node unless it has been set as per-cpu or bound with
     kthread_bind[_mask]() before the first wake-up.

   - kthread_affine_preferred() is a new function that can be called
     right after kthread_create_on_node() to specify a preferred
     affinity different than the specified node.

  When the preferred affinity can't be applied because the possible
  targets are offline or isolated (nohz_full), the kthread is affine to
  the housekeeping CPUs (which means to all online CPUs most of the time
  or only the non-nohz_full CPUs when nohz_full= is set).

  kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
  converted, along with a few old drivers.

  Summary of the changes:

   - Consolidate a bunch of ad-hoc implementations of
     kthread_run_on_cpu()

   - Introduce task_cpu_fallback_mask() that defines the default last
     resort affinity of a task to become nohz_full aware

   - Add some correctness check to ensure kthread_bind() is always
     called before the first kthread wake up.

   - Default affine kthread to its preferred node.

   - Convert kswapd / kcompactd and remove their halfway working ad-hoc
     affinity implementation

   - Implement kthreads preferred affinity

   - Unify kthread worker and kthread API's style

   - Convert RCU kthreads to the new API and remove the ad-hoc affinity
     implementation"

* tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks:
  kthread: modify kernel-doc function name to match code
  rcu: Use kthread preferred affinity for RCU exp kworkers
  treewide: Introduce kthread_run_worker[_on_cpu]()
  kthread: Unify kthread_create_on_cpu() and kthread_create_worker_on_cpu() automatic format
  rcu: Use kthread preferred affinity for RCU boost
  kthread: Implement preferred affinity
  mm: Create/affine kswapd to its preferred node
  mm: Create/affine kcompactd to its preferred node
  kthread: Default affine kthread to its preferred NUMA node
  kthread: Make sure kthread hasn't started while binding it
  sched,arm64: Handle CPU isolation on last resort fallback rq selection
  arm64: Exclude nohz_full CPUs from 32bits el0 support
  lib: test_objpool: Use kthread_run_on_cpu()
  kallsyms: Use kthread_run_on_cpu()
  soc/qman: test: Use kthread_run_on_cpu()
  arm/bL_switcher: Use kthread_run_on_cpu()
2025-01-21 17:10:05 -08:00
Linus Torvalds 9f3ee94e70 Merge tag 'rcu.release.v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Uladzislau Rezki:
 "Misc fixes:
   - check if IRQs are disabled in rcu_exp_need_qs()
   - instrument KCSAN exclusive-writer assertions
   - add extra WARN_ON_ONCE() check
   - set the cpu_no_qs.b.exp under lock
   - warn if callback enqueued on offline CPU

  Torture-test updates:
   - add rcutorture.preempt_duration kernel module parameter
   - make the TREE03 scenario do preemption
   - improve pooling timeouts for rcu_torture_writer()
   - improve output of "Failure/close-call rcutorture reader segments"
   - add some reader-state debugging checks
   - update doc of polled APIs
   - add extra diagnostics for per-reader-segment preemption
   - add an extra test for sched_clock()
   - improve testing on unresponsive systems

  SRCU updates:
   - improve doc for srcu_read_lock() in terms of return value
   - fix typo in comments
   - remove redundant GP sequence checks in the srcu_funnel_gp_start"

* tag 'rcu.release.v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (31 commits)
  srcu: Remove redundant GP sequence checks in srcu_funnel_gp_start
  srcu: Fix typo s/srcu_check_read_flavor()/__srcu_check_read_flavor()/
  srcu: Guarantee non-negative return value from srcu_read_lock()
  MAINTAINERS: Update RCU git tree
  rcu: Add lockdep_assert_irqs_disabled() to rcu_exp_need_qs()
  rcu: Add KCSAN exclusive-writer assertions for rdp->cpu_no_qs.b.exp
  rcu: Make preemptible rcu_exp_handler() check idempotency
  rcu: Replace open-coded rcu_exp_need_qs() from rcu_exp_handler() with call
  rcu: Move rcu_report_exp_rdp() setting of ->cpu_no_qs.b.exp under lock
  rcu: Make rcu_report_exp_cpu_mult() caller acquire lock
  rcu: Report callbacks enqueued on offline CPU blind spot
  rcutorture: Use symbols for SRCU reader flavors
  rcutorture: Add per-reader-segment preemption diagnostics
  rcutorture: Read CPU ID for decoration protected by both reader types
  rcutorture: Add preempt_count() to rcutorture_one_extend_check() diagnostics
  rcutorture: Add parameters to control polled/conditional wait interval
  rcutorture: Add documentation for recent conditional and polled APIs
  rcutorture: Ignore attempts to test preemption and forward progress
  rcutorture: Make rcutorture_one_extend() check reader state
  rcutorture: Pretty-print rcutorture reader segments
  ...
2025-01-21 14:39:21 -08:00
Peter Zijlstra d40797d672 kasan: make kasan_record_aux_stack_noalloc() the default behaviour
kasan_record_aux_stack_noalloc() was introduced to record a stack trace
without allocating memory in the process.  It has been added to callers
which were invoked while a raw_spinlock_t was held.  More and more callers
were identified and changed over time.  Is it a good thing to have this
while functions try their best to do a locklessly setup?  The only
downside of having kasan_record_aux_stack() not allocate any memory is
that we end up without a stacktrace if stackdepot runs out of memory and
at the same stacktrace was not recorded before To quote Marco Elver from
https://lore.kernel.org/all/CANpmjNPmQYJ7pv1N3cuU8cP18u7PP_uoZD8YxwZd4jtbof9nVQ@mail.gmail.com/

| I'd be in favor, it simplifies things. And stack depot should be
| able to replenish its pool sufficiently in the "non-aux" cases
| i.e. regular allocations. Worst case we fail to record some
| aux stacks, but I think that's only really bad if there's a bug
| around one of these allocations. In general the probabilities
| of this being a regression are extremely small [...]

Make the kasan_record_aux_stack_noalloc() behaviour default as
kasan_record_aux_stack().

[bigeasy@linutronix.de: dressed the diff as patch]
Link: https://lkml.kernel.org/r/20241122155451.Mb2pmeyJ@linutronix.de
Fixes: 7cb3007ce2 ("kasan: generic: introduce kasan_record_aux_stack_noalloc()")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: syzbot+39f85d612b7c20d8db48@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67275485.050a0220.3c8d68.0a37.GAE@google.com
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: <kasan-dev@googlegroups.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: syzkaller-bugs@googlegroups.com
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13 22:40:36 -08:00
Uladzislau Rezki (Sony) bbe658d658 mm/slab: Move kvfree_rcu() into SLAB
Move kvfree_rcu() functionality to the slab_common.c file.

The reason to have kvfree_rcu() functionality as part of SLAB is that
there is a clear trend and need of closer integration. One of the recent
example is creating a barrier function for SLAB caches.

Another reason is to prevent of having several implementations of RCU
machinery for reclaiming objects after a GP. As future steps, it can be
more integrated(easier) with SLAB internals.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Tested-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-01-11 20:39:43 +01:00
Uladzislau Rezki (Sony) c18bcd85ce rcu/kvfree: Adjust a shrinker name
Rename "rcu-kfree" to "slab-kvfree-rcu" since it goes to the
slab_common.c file soon.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Tested-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-01-11 20:39:36 +01:00
Uladzislau Rezki (Sony) ba5cac52d0 rcu/kvfree: Adjust names passed into trace functions
Currently trace functions are supplied with "rcu_state.name"
member which is located in the structure. The problem is that
the "rcu_state" structure variable is local and can not be
accessed from another place.

To address this, this preparation patch passes "slab" string
as a first argument.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Tested-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-01-11 20:39:29 +01:00
Uladzislau Rezki (Sony) d824ed707b rcu/kvfree: Move some functions under CONFIG_TINY_RCU
Currently when a tiny RCU is enabled, the tree.c file is not
compiled, thus duplicating function names do not conflict with
each other.

Because of moving of kvfree_rcu() functionality to the SLAB,
we have to reorder some functions and place them together under
CONFIG_TINY_RCU macro definition. Therefore, those functions name
will not conflict when a kernel is compiled for CONFIG_TINY_RCU
flavor.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Tested-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-01-11 20:39:19 +01:00
Uladzislau Rezki (Sony) 0f52b4db4f rcu/kvfree: Initialize kvfree_rcu() separately
Introduce a separate initialization of kvfree_rcu() functionality.
For such purpose a kfree_rcu_batch_init() is renamed to a kvfree_rcu_init()
and it is invoked from the main.c right after rcu_init() is done.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Tested-by: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-01-11 20:39:02 +01:00
Frederic Weisbecker 8044c58976 rcu: Use kthread preferred affinity for RCU exp kworkers
Now that kthreads have an infrastructure to handle preferred affinity
against CPU hotplug and housekeeping cpumask, convert RCU exp workers to
use it instead of handling all the constraints by itself.

Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-01-08 18:15:03 +01:00
Frederic Weisbecker b04e317b52 treewide: Introduce kthread_run_worker[_on_cpu]()
kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2025-01-08 18:15:03 +01:00
Frederic Weisbecker db7ee3cb62 rcu: Use kthread preferred affinity for RCU boost
Now that kthreads have an infrastructure to handle preferred affinity
against CPU hotplug and housekeeping cpumask, convert RCU boost to use
it instead of handling all the constraints by itself.

Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-01-08 18:15:03 +01:00
Uladzislau Rezki (Sony) 4b5c220552 Merge branches 'fixes.2024.12.14a', 'rcutorture.2024.12.14a', 'srcu.2024.12.14a' and 'torture-test.2024.12.14a' into rcu-merge.2024.12.14a
fixes.2024.12.14a: RCU fixes
rcutorture.2024.12.14a: Torture-test updates
srcu.2024.12.14a: SRCU updates
torture-test.2024.12.14a: Adding an extra test, fixes
2024-12-14 17:32:26 +01:00
Feng Lee 45c7c67643 srcu: Remove redundant GP sequence checks in srcu_funnel_gp_start
We will perform GP sequence checking at the beginning of srcu_gp_start,
thus making it safe to remove duplicate GP sequence checks prior to
calling srcu_gp_start.

Signed-off-by: Feng Lee <379943137@qq.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:13:24 +01:00
Paul E. McKenney d465492a22 srcu: Guarantee non-negative return value from srcu_read_lock()
For almost 20 years, the int return value from srcu_read_lock() has
been always either zero or one.  This commit therefore documents the
fact that it will be non-negative, and does the same for the underlying
__srcu_read_lock().

[ paulmck: Apply Andrii Nakryiko feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:13:09 +01:00
Paul E. McKenney 1bb03ad383 rcu: Add lockdep_assert_irqs_disabled() to rcu_exp_need_qs()
Callers to rcu_exp_need_qs() are supposed to disable interrupts, so this
commit enlists lockdep's aid in checking this.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:10:38 +01:00
Paul E. McKenney ecc5e6b0d3 rcu: Add KCSAN exclusive-writer assertions for rdp->cpu_no_qs.b.exp
The value of rdp->cpu_no_qs.b.exp may be changed only by the corresponding
CPU, and that CPU is not even allowed to race with itself, for example,
via interrupt handlers.  This commit therefore adds KCSAN exclusive-writer
assertions to check this constraint.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:10:29 +01:00
Paul E. McKenney 7a32337119 rcu: Make preemptible rcu_exp_handler() check idempotency
Although the non-preemptible implementation of rcu_exp_handler()
contains checks to enforce idempotency, the preemptible version does not.
The reason for this omission is that in preemptible kernels, there is
no reporting of quiescent states from CPU hotplug notifiers, and thus
no need for idempotency.

In theory, anyway.

In practice, accidents happen.  This commit therefore adds checks under
WARN_ON_ONCE() to catch any such accidents.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:10:20 +01:00
Paul E. McKenney 6ae4c30fee rcu: Replace open-coded rcu_exp_need_qs() from rcu_exp_handler() with call
Currently, the preemptible implementation of rcu_exp_handler()
almost open-codes rcu_exp_need_qs().  A call to that function would be
shorter and would improve expediting in cases where rcu_exp_handler()
interrupted a preemption-disabled or bh-disabled region of code.
This commit therefore moves rcu_exp_need_qs() out of the non-preemptible
leg of the enclosing #ifdef and replaces the open coding in preemptible
rcu_exp_handler() with a call to rcu_exp_need_qs().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:10:14 +01:00
Paul E. McKenney e2bd168295 rcu: Move rcu_report_exp_rdp() setting of ->cpu_no_qs.b.exp under lock
This commit reduces the state space of rcu_report_exp_rdp() by moving
the setting of ->cpu_no_qs.b.exp under the rcu_node structure's ->lock.
The lock isn't really all that important here, given that this per-CPU
field is supposed to be written only by its CPU, but the disabling of
interrupts excludes things like rcu_exp_handler(), which also can write
to this same field.  Avoiding this sort of interleaved access reduces
the state space.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:10:07 +01:00
Paul E. McKenney d16e32f75f rcu: Make rcu_report_exp_cpu_mult() caller acquire lock
There is a hard-to-trigger bug in the expedited grace-period computation
whose fix requires that the __sync_rcu_exp_select_node_cpus() function
to check that the grace-period sequence number has not changed before
invoking rcu_report_exp_cpu_mult().  However, this check must be done
while holding the leaf rcu_node structure's ->lock.

This commit therefore prepares for that fix by moving this lock's
acquisition from rcu_report_exp_cpu_mult() to its callers (all two
of them).

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:09:59 +01:00
Frederic Weisbecker 049dfe96ba rcu: Report callbacks enqueued on offline CPU blind spot
Callbacks enqueued after rcutree_report_cpu_dead() fall into RCU barrier
blind spot. Report any potential misuse.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:09:45 +01:00
Paul E. McKenney 0fef924e39 rcutorture: Use symbols for SRCU reader flavors
This commit converts rcutorture.c values for the reader_flavor module
parameter from hexadecimal to the SRCU_READ_FLAVOR_* C-preprocessor
macros.  The actual modprobe or kernel-boot-parameter values for
read_flavor must still be entered in hexadecimal.

Link: https://lore.kernel.org/all/c48c9dca-fe07-4833-acaa-28c827e5a79e@amd.com/

Suggested-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:06:08 +01:00
Paul E. McKenney 223f16b87d rcutorture: Add per-reader-segment preemption diagnostics
For preemptible RCU, this commit adds an indication for each
reader segments to whether the rcu_torture_reader() task was
on the ->blkd_tasks lists, though only in kernels built with
CONFIG_RCU_TORTURE_TEST_LOG_CPU=y.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:05:52 +01:00
Paul E. McKenney 885a6f4729 rcutorture: Read CPU ID for decoration protected by both reader types
Currently, rcutorture_one_extend() reads the CPU ID before making any
change to the type of RCU reader.  This can be confusing because the
properties of the code from which the CPU ID is read are not that of
the reader segment that this same CPU ID is listed with.

This commit therefore causes rcutorture_one_extend() to read the CPU
ID just after the new protections have been added, but before the old
protections have been removed.  With this change in place, all of the
protections of a given reader segment apply from the reading of one CPU ID
to the reading of the next.  This change therefore also allows a single
read of the CPU ID to work for both the old and the new reader segment.
And this dual use of a single read of the CPU ID avoids inflicting any
additional to heisenbugs.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:05:43 +01:00
Paul E. McKenney c31569eec4 rcutorture: Add preempt_count() to rcutorture_one_extend_check() diagnostics
This commit adds the value of preempt_count() to the diagnostics produced
by rcutorture_one_extend_check() to improve debugging.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:05:36 +01:00
Paul E. McKenney 282e06cc8f rcutorture: Add parameters to control polled/conditional wait interval
This commit adds rcutorture module parameters gp_cond_wi, gp_cond_wi_exp,
gp_poll_wi, and gp_poll_wi_exp to control the wait interval for
conditional, conditional expedited, polled, and polled expedited grace
periods, respectively.  When rcu_torture_writer() is testing these types
of grace periods, hrtimers are used to randomly wait up to the specified
number of microseconds, but with nanosecond granularity.

In the case of conditional grace periods (get_state_synchronize_rcu()
and cond_synchronize_rcu(), for example) there is just one
wait.  For polled grace periods (start_poll_synchronize_rcu() and
poll_state_synchronize_rcu(), for example), there is a repeated series
of waits until the grace period ends.

For normal grace periods, the default is 16 jiffies (for example, 16,000
microseconds on a HZ=1000 system) and for expedited grace periods the
default is 128 microseconds.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:05:27 +01:00
Paul E. McKenney a2ab1e4578 rcutorture: Ignore attempts to test preemption and forward progress
Use of the rcutorture preempt_duration and the default-on fwd_progress
kernel parameters can result in preemption of callback processing during
forward-progress testing, which is an excellent way to OOM your test
if your kernel offloads RCU callbacks.  This commit therefore treats
preempt_duration in the same way as stall_cpu in CONFIG_RCU_NOCB_CPU=y
kernels, prohibiting fwd_progress testing and splatting when rcutorture
is built in (as opposed to being a loadable module).

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:05:03 +01:00
Paul E. McKenney ec9d6356bf rcutorture: Make rcutorture_one_extend() check reader state
This commit adds reader-state debugging checks to a new function named
rcutorture_one_extend_check(), which is invoked before and after setting
new reader states by the existing rcutorture_one_extend() function.
These checks have proven to be rather heavyweight, reducing reproduction
rate of some failures by a factor of two.  They are therefore hidden
behind a new RCU_TORTURE_TEST_CHK_RDR_STATE Kconfig option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Tested-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:04:55 +01:00
Paul E. McKenney 16338e7cb7 rcutorture: Pretty-print rcutorture reader segments
The current "Failure/close-call rcutorture reader segments" output is
good and sufficient, but annoying when you have to interpret several
tens of them after an all-night rcutorture run.  This commit therefore
makes them a bit more human-readable.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:04:39 +01:00
Paul E. McKenney b27a34f908 rcutorture: Add full read-side contexts to "busted" torture type
The purpose of the "busted" torture type is to test rcutorture code paths
used only when a too-short grace period is detected.  Currently, "busted"
only uses normal rcu_read_lock()-style readers, which fails to exercise
much of the "Failure/close-call rcutorture reader segments" functionality.
This commit therefore sets the .extendables field of rcu_busted_ops to
RCUTORTURE_MAX_EXTEND in order to more fully exercise the reporting.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:04:23 +01:00
Paul E. McKenney 3b476823b9 rcutorture: Decorate failing reader segments with last CPU ID
In kernels built with CONFIG_RCU_TORTURE_TEST_LOG_CPU=y, the CPU is
logged at the beginning of each reader segment.  This commit further
logs it at the end of the full set of reader segments in order to show
any migration that might have occurred during the last reader segment.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:04:08 +01:00
Paul E. McKenney 0f38c06cab rcutorture: Check preemption for failing reader
This commit checks to see if the RCU reader has been preempted within
its read-side critical section for RCU flavors supporting this notion
(currently only preemptible RCU).  If such a preemption occurred, then
this is printed at the end of the "Failure/close-call rcutorture reader
segments" list at the end of the rcutorture run.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Tested-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:03:41 +01:00
Paul E. McKenney 4569cf60b6 rcutorture: Add ->cond_sync_exp_full function to rcu_ops structure
The rcu_ops structure currently lacks a ->cond_sync_exp_full function,
which prevents testign of conditional full-state polled grace periods.
This commit therefore adds them, enabling testing this option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:03:07 +01:00
Paul E. McKenney 7b6c1648bb rcutorture: Use finer-grained timeouts for rcu_torture_writer() polling
The rcu_torture_writer() polling currently uses timeouts ranging from
zero to 16 milliseconds to wait for the polled grace period to end.
This works, but it would be better to have a higher probability of
exercising races with the code that cleans up after a grace period.
This commit therefore switches from these millisecond-scale timeouts
to timeouts ranging from zero to 128 microseconds, and with a full
microsecond's worth of timeout fuzz.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:02:28 +01:00
Paul E. McKenney 579a05da40 rcutorture: Decorate failing reader segments with CPU ID
This commit adds CPU number to the "Failure/close-call rcutorture reader
segments" list printed at the end of an rcutorture run that had too-short
grace periods.  This information can help debugging interactions with
migration and CPU hotplug.

However, experience indicates that sampling the CPU number in rcutorture's
read-side code can reduce the probability of too-short bugs by a small
integer factor.  And small integer factors are crucial to RCU bug hunting,
so this commit also introduces a default-off RCU_TORTURE_TEST_LOG_CPU
Kconfig option to enable this CPU-number-logging functionality at
build time.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:02:11 +01:00
Paul E. McKenney 584975ccb7 rcutorture: Add random real-time preemption
This commit adds the rcutorture.preempt_duration kernel module parameter,
which gives the real-time preemption duration in milliseconds (zero to
disable, which is the default) and also the rcutorture.preempt_interval
module parameter, which gives the interval between successive preemptions,
also in milliseconds, defaulting to one second.  The CPU to preempt is
chosen at random from those online at that time.  Races between preempting
a given CPU and that CPU going offline are ignored, and preemption is
forgone when this occurs.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 17:01:05 +01:00
Paul E. McKenney 0203b485d2 torture: Add dowarn argument to torture_sched_setaffinity()
Current use cases of torture_sched_setaffinity() are well served by its
unconditional warning on error.  However, an upcoming use case for a
preemption kthread needs to avoid warnings that might otherwise arise
when that kthread attempted to bind itself to a CPU on its way offline.
This commit therefore adds a dowarn argument that, when false, suppresses
the warning.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14 16:38:23 +01:00