Commit Graph

1331544 Commits

Author SHA1 Message Date
Andrea Righi
0aaaf89df8 sched_ext: idle: Introduce SCX_OPS_BUILTIN_IDLE_PER_NODE
Add the new scheduler flag SCX_OPS_BUILTIN_IDLE_PER_NODE, which allows
BPF schedulers to select between using a global flat idle cpumask or
multiple per-node cpumasks.

This only introduces the flag and the mechanism to enable/disable this
feature without affecting any scheduling behavior.

Cc: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16 06:52:20 -10:00
Andrea Righi
d73249f887 sched_ext: idle: Make idle static keys private
Make all the static keys used by the idle CPU selection policy private
to ext_idle.c. This avoids unnecessary exposure in headers and improves
code encapsulation.

Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16 06:52:20 -10:00
Andrea Righi
f09177ca5f sched/topology: Introduce for_each_node_numadist() iterator
Introduce the new helper for_each_node_numadist() to iterate over node
IDs in order of increasing NUMA distance from a given starting node.

This iterator is somehow similar to for_each_numa_hop_mask(), but
instead of providing a cpumask at each iteration, it provides a node ID.

Example usage:

  nodemask_t unvisited = NODE_MASK_ALL;
  int node, start = cpu_to_node(smp_processor_id());

  node = start;
  for_each_node_numadist(node, unvisited)
  	pr_info("node (%d, %d) -> %d\n",
  		 start, node, node_distance(start, node));

On a system with equidistant nodes:

 $ numactl -H
 ...
 node distances:
 node     0    1    2    3
    0:   10   20   20   20
    1:   20   10   20   20
    2:   20   20   10   20
    3:   20   20   20   10

Output of the example above (on node 0):

[    7.367022] node (0, 0) -> 10
[    7.367151] node (0, 1) -> 20
[    7.367186] node (0, 2) -> 20
[    7.367247] node (0, 3) -> 20

On a system with non-equidistant nodes (simulated using virtme-ng):

 $ numactl -H
 ...
 node distances:
 node     0    1    2    3
    0:   10   51   31   41
    1:   51   10   21   61
    2:   31   21   10   11
    3:   41   61   11   10

Output of the example above (on node 0):

 [    8.953644] node (0, 0) -> 10
 [    8.953712] node (0, 2) -> 31
 [    8.953764] node (0, 3) -> 41
 [    8.953817] node (0, 1) -> 51

Suggested-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16 06:52:20 -10:00
Andrea Righi
16d79f2a4f mm/numa: Introduce nearest_node_nodemask()
Introduce the new helper nearest_node_nodemask() to find the closest
node in a specified nodemask from a given starting node.

Returns MAX_NUMNODES if no node is found.

Suggested-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16 06:52:19 -10:00
Yury Norov
14a8262f50 nodemask: numa: reorganize inclusion path
Nodemasks now pull linux/numa.h for MAX_NUMNODES and NUMA_NO_NODE
macros. This series makes numa.h depending on nodemasks, so we hit
a circular dependency.

Nodemasks library is highly employed by NUMA code, and it would be
logical to resolve the circular dependency by making NUMA headers
dependent nodemask.h.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16 06:52:19 -10:00
Yury Norov
7665054ee0 nodemask: add nodes_copy()
Nodemasks API misses the plain nodes_copy() which is required in this
series.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16 06:52:19 -10:00
Tejun Heo
f2c880fc81 tools/sched_ext: Sync with scx repo
Synchronize with https://github.com/sched-ext/scx at d384453984a0 ("kernel:
Sync at ad3b301aa05a ("sched_ext: Provides a sysfs 'events' to expose core
event counters")").

Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-14 08:46:20 -10:00
Changwoo Min
ad3b301aa0 sched_ext: Provides a sysfs 'events' to expose core event counters
Add a sysfs entry at /sys/kernel/sched_ext/root/events to expose core
event counters through the files system interface. Each line of the file
shows the event name and its counter value.

In addition, the format of scx_dump_event() is adjusted as the event name
gets longer.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-14 07:15:41 -10:00
Tejun Heo
3539c6411a sched_ext: Implement SCX_OPS_ALLOW_QUEUED_WAKEUP
A task wakeup can be either processed on the waker's CPU or bounced to the
wakee's previous CPU using an IPI (ttwu_queue). Bouncing to the wakee's CPU
avoids the waker's CPU locking and accessing the wakee's rq which can be
expensive across cache and node boundaries.

When ttwu_queue path is taken, select_task_rq() and thus ops.select_cpu()
may be skipped in some cases (racing against the wakee switching out). As
this confused some BPF schedulers, there wasn't a good way for a BPF
scheduler to tell whether idle CPU selection has been skipped, ops.enqueue()
couldn't insert tasks into foreign local DSQs, and the performance
difference on machines with simple toplogies were minimal, sched_ext
disabled ttwu_queue.

However, this optimization makes noticeable difference on more complex
topologies and a BPF scheduler now has an easy way tell whether
ops.select_cpu() was skipped since 9b671793c7d9 ("sched_ext, scx_qmap: Add
and use SCX_ENQ_CPU_SELECTED") and can insert tasks into foreign local DSQs
since 5b26f7b920f7 ("sched_ext: Allow SCX_DSQ_LOCAL_ON for direct
dispatches").

Implement SCX_OPS_ALLOW_QUEUED_WAKEUP which allows BPF schedulers to choose
to enable ttwu_queue optimization.

v2: Update the patch description and comment re. ops.select_cpu() being
    skipped in some cases as opposed to always as per Neel.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Neel Natu <neelnatu@google.com>
Reported-by: Barret Rhoden <brho@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
2025-02-13 07:30:46 -10:00
Tejun Heo
78e4690de4 Merge branch 'for-6.14-fixes' into for-6.15
Pull to receive f3f08c3acfb8 ("sched_ext: Fix incorrect assumption about
migration disabled tasks in task_can_run_on_remote_rq()") which conflicts
with 26176116d931 ("sched_ext: Count SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE in
the right spot") in for-6.15.
2025-02-10 10:45:43 -10:00
Tejun Heo
f3f08c3acf sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq()
While fixing migration disabled task handling, 32966821574c ("sched_ext: Fix
migration disabled handling in targeted dispatches") assumed that a
migration disabled task's ->cpus_ptr would only have the pinned CPU. While
this is eventually true for migration disabled tasks that are switched out,
->cpus_ptr update is performed by migrate_disable_switch() which is called
right before context_switch() in __scheduler(). However, the task is
enqueued earlier during pick_next_task() via put_prev_task_scx(), so there
is a race window where another CPU can see the task on a DSQ.

If the CPU tries to dispatch the migration disabled task while in that
window, task_allowed_on_cpu() will succeed and task_can_run_on_remote_rq()
will subsequently trigger SCHED_WARN(is_migration_disabled()).

  WARNING: CPU: 8 PID: 1837 at kernel/sched/ext.c:2466 task_can_run_on_remote_rq+0x12e/0x140
  Sched_ext: layered (enabled+all), task: runnable_at=-10ms
  RIP: 0010:task_can_run_on_remote_rq+0x12e/0x140
  ...
   <TASK>
   consume_dispatch_q+0xab/0x220
   scx_bpf_dsq_move_to_local+0x58/0xd0
   bpf_prog_84dd17b0654b6cf0_layered_dispatch+0x290/0x1cfa
   bpf__sched_ext_ops_dispatch+0x4b/0xab
   balance_one+0x1fe/0x3b0
   balance_scx+0x61/0x1d0
   prev_balance+0x46/0xc0
   __pick_next_task+0x73/0x1c0
   __schedule+0x206/0x1730
   schedule+0x3a/0x160
   __do_sys_sched_yield+0xe/0x20
   do_syscall_64+0xbb/0x1e0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fix it by converting the SCHED_WARN() back to a regular failure path. Also,
perform the migration disabled test before task_allowed_on_cpu() test so
that BPF schedulers which fail to handle migration disabled tasks can be
noticed easily.

While at it, adjust scx_ops_error() message for !task_allowed_on_cpu() case
for brevity and consistency.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches")
Acked-by: Andrea Righi <arighi@nvidia.com>
Reported-by: Jake Hillion <jakehillion@meta.com>
2025-02-10 10:40:47 -10:00
Changwoo Min
2e7df12bdd tools/sched_ext: Update enum_defs.autogen.h
Add where the script is located to the comment lines of the header file.
This helps anyone re-generate the header file if required.

Note that this is a sync from the PR [1] in the scx repo.

  [1] https://github.com/sched-ext/scx/pull/1322

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-10 07:22:42 -10:00
Li RongQing
1a4e0d8682 sched_ext: Take NUMA node into account when allocating per-CPU cpumasks
per-CPU cpumasks are dominantly accessed from their own local CPUs,
so allocate them node-local to improve performance.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-10 07:20:13 -10:00
Changwoo Min
372033ad9e tools/sched_ext: Compatible testing of SCX_ENQ_CPU_SELECTED
This provides compatible testing of SCX_ENQ_CPU_SELECTED.
More specifically, it handles two cases:

  1. a BPF scheduler is compiled against vmlinux.h where
  SCX_ENQ_CPU_SELECTED is defined, but it runs on a kernel that does not
  have SCX_ENQ_CPU_SELECTED. In this case, the test result of
  'enq_flags & SCX_ENQ_CPU_SELECTED' will always be false. That test result
  is semantically incorrect because the kernel before SCX_ENQ_CPU_SELECTED
  has never skipped select_task_rq_scx(), so the result should be true.

  2. a BPF scheduler is compiling against vmlinux.h where
  SCX_ENQ_CPU_SELECTED is not defined. In this case, directly using
  SCX_ENQ_CPU_SELECTED causes compilation errors.

To hide such complexity, introduce __COMPAT_is_enq_cpu_selected(),
which checks if SCX_ENQ_CPU_SELECTED exists in runtime using BPF CO-RE.
This consists of three parts:

  1. Add enum_defs.autogen.h, which has macros (HAVE_{enum name}) denoting
  whether SCX enums are defined in the vmlinux.h or not.

  2. Implement __COMPAT_is_enq_cpu_selected(), which provide the test of
  SCX_ENQ_CPU_SELECTED in a compatible way.

  3. Use  __COMPAT_is_enq_cpu_selected() in scx_qmap.

Note that this is a sync of the relevant PR [1] in the scx repo.

  [1] https://github.com/sched-ext/scx/pull/1314

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-08 20:42:50 -10:00
Tejun Heo
eace54dff0 sched_ext: Add SCX_EV_ENQ_SKIP_MIGRATION_DISABLED
Count the number of times a migration disabled task is automatically
dispatched to its local DSQ.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Changwoo Min <changwoo@igalia.com>
2025-02-08 20:39:11 -10:00
Tejun Heo
26176116d9 sched_ext: Count SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE in the right spot
SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE wasn't quite right in two aspects:

- It counted both migration disabled and offline events.

- It didn't count events from scx_bpf_dsq_move() path.

Fix it by moving the counting into task_can_run_on_remote_rq() which is
shared by both paths and can distinguish the different rejection conditions.
The argument @trigger_error is renamed to @enforce as it now does more than
just triggering error.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Changwoo Min <changwoo@igalia.com>
2025-02-08 20:39:11 -10:00
Tejun Heo
46a0e16158 tool/sched_ext: Event counter dumping updates
- There's no need to dump event counters from both scx_qmap and scx_central.
  Drop counter dumping from scx_central.

- bpf_printk() implies a trailing new line and the explicit new line leads
  to double new lines. Drop the explicit new lines.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Changwoo Min <changwoo@igalia.com>
2025-02-08 20:39:11 -10:00
Tejun Heo
29ef4a2fcf Merge branch 'for-6.14-fixes' into for-6.15
Pull to receive:

- 2fa0fbeb69ed ("sched_ext: Implement auto local dispatching of migration disabled tasks")
- 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches")

as planned for-6.15 changes depend on them (e.g. adding event counter for
implicit migration disabled task handling).
2025-02-08 20:34:43 -10:00
Tejun Heo
3296682157 sched_ext: Fix migration disabled handling in targeted dispatches
A dispatch operation that can target a specific local DSQ -
scx_bpf_dsq_move_to_local() or scx_bpf_dsq_move() - checks whether the task
can be migrated to the target CPU using task_can_run_on_remote_rq(). If the
task can't be migrated to the targeted CPU, it is bounced through a global
DSQ.

task_can_run_on_remote_rq() assumes that the task is on a CPU that's
different from the targeted CPU but the callers doesn't uphold the
assumption and may call the function when the task is already on the target
CPU. When such task has migration disabled, task_can_run_on_remote_rq() ends
up returning %false incorrectly unnecessarily bouncing the task to a global
DSQ.

Fix it by updating the callers to only call task_can_run_on_remote_rq() when
the task is on a different CPU than the target CPU. As this is a bit subtle,
for clarity and documentation:

- Make task_can_run_on_remote_rq() trigger SCHED_WARN_ON() if the task is on
  the same CPU as the target CPU.

- is_migration_disabled() test in task_can_run_on_remote_rq() cannot trigger
  if the task is on a different CPU than the target CPU as the preceding
  task_allowed_on_cpu() test should fail beforehand. Convert the test into
  SCHED_WARN_ON().

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Fixes: 0366017e0973 ("sched_ext: Use task_can_run_on_remote_rq() test in dispatch_to_local_dsq()")
Cc: stable@vger.kernel.org # v6.12+
2025-02-08 20:33:25 -10:00
Tejun Heo
2fa0fbeb69 sched_ext: Implement auto local dispatching of migration disabled tasks
Migration disabled tasks are special and pinned to their previous CPUs. They
tripped up some unsuspecting BPF schedulers as their ->nr_cpus_allowed may
not agree with the bits set in ->cpus_ptr. Make it easier for BPF schedulers
by automatically dispatching them to the pinned local DSQs by default. If a
BPF scheduler wants to handle migration disabled tasks explicitly, it can
set SCX_OPS_ENQ_MIGRATION_DISABLED.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
2025-02-08 20:32:54 -10:00
Changwoo Min
38d65cd692 sched_ext: Print an event, SCX_EV_ENQ_SLICE_DFL, in scx_qmap/central
Modify the scx_qmap and scx_celtral schedulers
to print the SCX_EV_ENQ_SLICE_DFL event every second.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-07 11:24:59 -10:00
Changwoo Min
6d3f8fb4b2 sched_ext: Add an event, SCX_EV_ENQ_SLICE_DFL
Add a core event, SCX_EV_ENQ_SLICE_DFL, which represents how many
tasks have been enqueued (or pick_task-ed or select_cpu-ed) with
a default time slice (SCX_SLICE_DFL).

Scheduling a task with SCX_SLICE_DFL unintentionally would be a source
of latency spikes because SCX_SLICE_DFL is relatively long (20 msec).
Thus, soaring the SCX_EV_ENQ_SLICE_DFL value would be a sign of BPF
scheduler bugs, causing latency spikes, especially when ops.select_cpu()
is provided.

__scx_add_event() is used since the caller holds an rq lock or p->pi_lock,
so the preemption has already been disabled.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-07 11:24:59 -10:00
Changwoo Min
2494e555fb sched_ext: Print core event count in scx_qmap scheduler
Modify the scx_qmap scheduler to print the core event counter
every second.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
6df93804b7 sched_ext: Print core event count in scx_central scheduler
Modify the scx_central scheduler to print the core event counter
every second.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
9865f31d85 sched_ext: Add scx_bpf_events() and scx_read_event() for BPF schedulers
scx_bpf_events() is added to the header files so the BPF scheduler
can use it. Also, scx_read_event() is added to read an event type in a
compatible way.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
d46457c31c sched_ext: Add an event, SCX_EV_BYPASS_DURATION
Add a core event, SCX_EV_BYPASS_DURATION, which represents the
total duration of bypass modes in nanoseconds.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
5c605cd33c sched_ext: Add an event, SCX_EV_BYPASS_DISPATCH
Add a core event, SCX_EV_BYPASS_DISPATCH, which represents how many
tasks have been dispatched in the bypass mode.

__scx_add_event() is used since the caller holds an rq lock or
p->pi_lock, so the preemption has already been disabled.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
4f7a38c7c9 sched_ext: Add an event, SCX_EV_BYPASS_ACTIVATE
Add a core event, SCX_EV_BYPASS_ACTIVATE, which represents how many
times the bypass mode has been triggered.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
824d4f2dce sched_ext: Add an event, SCX_EV_ENQ_SKIP_EXITING
Add a core event, SCX_EV_ENQ_SKIP_EXITING, which represents how many
times a task is enqueued to a local DSQ when exiting if
SCX_OPS_ENQ_EXITING is not set.

__scx_add_event() is used since the caller holds an rq lock,
so the preemption has already been disabled.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-04 10:36:47 -10:00
Changwoo Min
029b6ce733 sched_ext: Fix incorrect time delta calculation in time_delta()
When (s64)(after - before) > 0, the code returns the result of
(s64)(after - before) > 0 while the intended result should be
(s64)(after - before). That happens because the middle operand of
the ternary operator was omitted incorrectly, returning the result of
(s64)(after - before) > 0. Thus, add the middle operand
-- (s64)(after - before) -- to return the correct time calculation.

Fixes: d07be814fc71 ("sched_ext: Add time helpers for BPF schedulers")
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-02 07:40:35 -10:00
Changwoo Min
7125660bc1 sched_ext: Add an event, SCX_EV_DISPATCH_KEEP_LAST
Add a core event, SCX_EV_DISPATCH_KEEP_LAST, which represents how many
times a task is continued to run without ops.enqueue() when
SCX_OPS_ENQ_LAST is not set.

__scx_add_event() is used since the caller holds an rq lock,
so the preemption has already been disabled.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-02 07:23:18 -10:00
Changwoo Min
9be0a1b0c8 sched_ext: Add an event, SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE
Add a core event, SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE, which represents how
many times a BPF scheduler tries to dispatch to an offlined local DSQ.

__scx_add_event() is used since the caller holds an rq lock,
so the preemption has already been disabled.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-02 07:23:18 -10:00
Changwoo Min
f7f6142107 sched_ext: Add an event, SCX_EV_SELECT_CPU_FALLBACK
Add a core event, SCX_EV_SELECT_CPU_FALLBACK, which represents how many times
ops.select_cpu() returns a CPU that the task can't use.

__scx_add_event() is used since the caller holds an rq lock,
so the preemption has already been disabled.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-02 07:23:18 -10:00
Changwoo Min
17103b8504 sched_ext: Implement event counter infrastructure
Collect the statistics of specific types of behavior in the sched_ext core,
which are not easily visible but still interesting to an scx scheduler.

An event type is defined in 'struct scx_event_stats.' When an event occurs,
its counter is accumulated using 'scx_add_event()' and '__scx_add_event()'
to per-CPU 'struct scx_event_stats' for efficiency. 'scx_bpf_events()'
aggregates all the per-CPU counters and exposes a system-wide counters.

For convenience and readability of the code, 'scx_agg_event()' and
'scx_dump_event()' are provided.

The collected events can be observed after a BPF scheduler is unloaded
beforea new BPF scheduler is loaded so the per-CPU 'struct scx_event_stats'
are reset.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-02 07:23:18 -10:00
Andrea Righi
337d1b354a sched_ext: Move built-in idle CPU selection policy to a separate file
As ext.c is becoming quite large, move the idle CPU selection policy to
separate files (ext_idle.c / ext_idle.h) for better code readability.

Moreover, group together all the idle CPU selection kfunc's to the same
btf_kfunc_id_set block.

No functional changes, this is purely code reorganization.

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 12:43:43 -10:00
Andrea Righi
1626e5ef0b sched_ext: Fix lock imbalance in dispatch_to_local_dsq()
While performing the rq locking dance in dispatch_to_local_dsq(), we may
trigger the following lock imbalance condition, in particular when
multiple tasks are rapidly changing CPU affinity (i.e., running a
`stress-ng --race-sched 0`):

[   13.413579] =====================================
[   13.413660] WARNING: bad unlock balance detected!
[   13.413729] 6.13.0-virtme #15 Not tainted
[   13.413792] -------------------------------------
[   13.413859] kworker/1:1/80 is trying to release lock (&rq->__lock) at:
[   13.413954] [<ffffffff873c6c48>] dispatch_to_local_dsq+0x108/0x1a0
[   13.414111] but there are no more locks to release!
[   13.414176]
[   13.414176] other info that might help us debug this:
[   13.414258] 1 lock held by kworker/1:1/80:
[   13.414318]  #0: ffff8b66feb41698 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x20/0x90
[   13.414612]
[   13.414612] stack backtrace:
[   13.415255] CPU: 1 UID: 0 PID: 80 Comm: kworker/1:1 Not tainted 6.13.0-virtme #15
[   13.415505] Workqueue:  0x0 (events)
[   13.415567] Sched_ext: dsp_local_on (enabled+all), task: runnable_at=-2ms
[   13.415570] Call Trace:
[   13.415700]  <TASK>
[   13.415744]  dump_stack_lvl+0x78/0xe0
[   13.415806]  ? dispatch_to_local_dsq+0x108/0x1a0
[   13.415884]  print_unlock_imbalance_bug+0x11b/0x130
[   13.415965]  ? dispatch_to_local_dsq+0x108/0x1a0
[   13.416226]  lock_release+0x231/0x2c0
[   13.416326]  _raw_spin_unlock+0x1b/0x40
[   13.416422]  dispatch_to_local_dsq+0x108/0x1a0
[   13.416554]  flush_dispatch_buf+0x199/0x1d0
[   13.416652]  balance_one+0x194/0x370
[   13.416751]  balance_scx+0x61/0x1e0
[   13.416848]  prev_balance+0x43/0xb0
[   13.416947]  __pick_next_task+0x6b/0x1b0
[   13.417052]  __schedule+0x20d/0x1740

This happens because dispatch_to_local_dsq() is racing with
dispatch_dequeue() and, when the latter wins, we incorrectly assume that
the task has been moved to dst_rq.

Fix by properly tracking the currently locked rq.

Fixes: 4d3ca89bdd31 ("sched_ext: Refactor consume_remote_task()")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 12:41:12 -10:00
Andrea Righi
3c7d51b0d2 sched_ext: selftests/dsp_local_on: Fix selftest on UP systems
In UP systems p->migration_disabled is not available. Fix this by using
the portable helper is_migration_disabled(p).

Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 09:00:09 -10:00
Andrea Righi
5f52bbf2f6 tools/sched_ext: Add helper to check task migration state
Introduce a new helper for BPF schedulers to determine whether a task
can migrate or not (supporting both SMP and UP systems).

Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 08:59:55 -10:00
Tejun Heo
d6f3e7d564 sched_ext: Fix incorrect autogroup migration detection
scx_move_task() is called from sched_move_task() and tells the BPF scheduler
that cgroup migration is being committed. sched_move_task() is used by both
cgroup and autogroup migrations and scx_move_task() tried to filter out
autogroup migrations by testing the destination cgroup and PF_EXITING but
this is not enough. In fact, without explicitly tagging the thread which is
doing the cgroup migration, there is no good way to tell apart
scx_move_task() invocations for racing migration to the root cgroup and an
autogroup migration.

This led to scx_move_task() incorrectly ignoring a migration from non-root
cgroup to an autogroup of the root cgroup triggering the following warning:

  WARNING: CPU: 7 PID: 1 at kernel/sched/ext.c:3725 scx_cgroup_can_attach+0x196/0x340
  ...
  Call Trace:
  <TASK>
    cgroup_migrate_execute+0x5b1/0x700
    cgroup_attach_task+0x296/0x400
    __cgroup_procs_write+0x128/0x140
    cgroup_procs_write+0x17/0x30
    kernfs_fop_write_iter+0x141/0x1f0
    vfs_write+0x31d/0x4a0
    __x64_sys_write+0x72/0xf0
    do_syscall_64+0x82/0x160
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fix it by adding an argument to sched_move_task() that indicates whether the
moving is for a cgroup or autogroup migration. After the change,
scx_move_task() is called only for cgroup migrations and renamed to
scx_cgroup_move_task().

Link: https://github.com/sched-ext/scx/issues/370
Fixes: 819513666966 ("sched_ext: Add cgroup support")
Cc: stable@vger.kernel.org # v6.12+
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 08:31:50 -10:00
Tejun Heo
e9fe182772 sched_ext: selftests/dsp_local_on: Fix sporadic failures
dsp_local_on has several incorrect assumptions, one of which is that
p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task
is scheduled out while migration is disabled - p->cpus_ptr is temporarily
overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.

This led to sporadic test faliures when dsp_local_on_dispatch() tries to put
a migration disabled task to a different CPU. Fix it by keeping the previous
CPU when migration is disabled.

There are SCX schedulers that make use of p->nr_cpus_allowed. They should
also implement explicit handling for p->migration_disabled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Changwoo Min <changwoo@igalia.com>
2025-01-24 10:48:25 -10:00
Andrea Righi
74ca334338 selftests/sched_ext: Fix enum resolution
All scx enums are now automatically generated from vmlinux.h and they
must be initialized using the SCX_ENUM_INIT() macro.

Fix the scx selftests to use this macro to properly initialize these
values.

Fixes: 8da7bf2cee27 ("tools/sched_ext: Receive updates from SCX repo")
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Closes: https://lore.kernel.org/all/Z2tNK2oFDX1OPp8C@slm.duckdns.org/
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 08:09:52 -10:00
Andrea Righi
2279563e3a sched_ext: Include task weight in the error state dump
Report the task weight when dumping the task state during an error exit.
Moreover, adjust the output format to display dsq_vtime, slice, and
weight on the same line.

This can help identify whether certain tasks were excessively
prioritized or de-prioritized due to large niceness gaps.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 08:07:42 -10:00
Atul Kumar Pant
be8ee18152 sched_ext: Fixes typos in comments
Fixes some spelling errors in the comments.

Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 08:06:05 -10:00
Linus Torvalds
ab18b8fff1 auxdisplay for v6.14-1
* A couple of cleanups to img-ascii-lcd driver
 
 The following is an automated git shortlog grouped by driver:
 
 img-ascii-lcd:
  -  Constify struct img_ascii_lcd_config
  -  Remove an unused field in struct img_ascii_lcd_ctx
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAmeSUpMACgkQb7wzTHR8
 rChpvw/9FgTm6s8V5WmyNDKMm1E08mnsj7oRiiG9fprt+9iYoW3ejTK8taIF9IWX
 mvZANPnd9NCW11dv/DVkHtyJkcXj7EAo++VaDE5JagZIbC4LEp712lISaM+UiHOJ
 uQXEci48/cyR5BW5bbdVR6RAG0pI119y4U3kUJgHr/B9jIv9EMeI5y7NPQ62qWsT
 hhDBe9buMX0QD4jLUJiyjRto54jCMEZ7f/kto1db8OlWoAQGbK23R55mwLeo3bTa
 3JrGPlNzMT1KJo/pburPuAB7j8XkSmfu0F60h780TgvyDYRsZJZ+Y/UsDYkCs0xk
 qsxMVwBAVyIEZjVtbRp6lcdwhashqUsBpq4t2Fp1tjWULC8+5gsfE3mCRHvRPDMT
 cFURSGQQwSIaQIODwROZj+RhhvwvZ6Iswuh9OvDyquBHvSe88fKcPTByrSofXNCz
 0pYiEmPCMPaOZUqQCg8MJRUP5JQTHNLjjt/eVHBv7VZtxYI8wVtUviZrtUNrMq7w
 UaSqlyXs9Ri7cgTxTNM1eOSMSYRbQp/J6HcwkBoQdxJBL+kmvwUPAAxuPTpCEiGi
 uUYj90fENHTO7+Z67p/+jMK3r2W1eIjtbyu5IE878ukgU/nlM2LVvmlbNsVFNedS
 o+BwqTjv979tC1W5m67cXZG2jxCpaYNzFjNNGjKHG5YSkz8/iBc=
 =CZTc
 -----END PGP SIGNATURE-----

Merge tag 'auxdisplay-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay

Pull auxdisplay updates from Andy Shevchenko:

 - A couple of cleanups to img-ascii-lcd driver

* tag 'auxdisplay-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
  auxdisplay: img-ascii-lcd: Constify struct img_ascii_lcd_config
  auxdisplay: img-ascii-lcd: Remove an unused field in struct img_ascii_lcd_ctx
2025-01-24 08:03:52 -08:00
Linus Torvalds
2c8d2a510c sound updates for 6.14-rc1
This was a relatively calm cycle, and most of changes are rather small
 device-specific fixes.  Here are highlights:
 
 * Core:
 - Further enhancements of ALSA rawmidi and sequencer APIs for MIDI 2.0
 - compress-offload API extensions for ASRC support
 
 * ASoC:
 - Allow clocking on each DAI in an audio graph card to be configured
   separately
 - Improved power management for Renesas RZ-SSI
 - KUnit testing for the Cirrus DSP framework
 - Memory to meory operation support for Freescale/NXP platforms
 - Support for pause operations in SOF
 - Support for Allwinner suinv F1C100s, Awinc AW88083, Realtek
   ALC5682I-VE
 
 * HD- and USB-audio:
 - Add support for Focusrite Scarlett 4th Gen 16i16, 18i16, and 18i20
   interfaces via new FCP driver
 - TAS2781 SPI HD-audio sub-codec support
 - Various device-specific quirks as usual
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmeSRrwOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE+mWw//QbiwfdfjNjXVTOBpJOI7auFj408sB2IgoOcM
 wj3h94ONG+45DZSbzpAB5t8sNN2EBrx7KGtdLcSqlPoehLq3KduIaZhDjnjtYKVL
 2GujsSqhsBRr/dr4Qr+48idtlcv1HJsJZGkf/nSHdyHgqVkk9jd42okHOOsOHF6g
 hn3zocPSVBoAH6UTPnxNmX9jD4sKFu+jBayxTmWXMY+qS+iD9dCRa8kyeLyERqAO
 uB5IFQdaNM7y4DHmDSgH3sBohGZ/38x/hNpS6sspoMrkVNoLLa7W2g7DL2r2qMtg
 JMMHlyx1VGqFpQC6TPgFWSb2WJpxo18cVGD5NuXGnT5SjTzKMrHOtp9dmVt+p9Am
 mqveyEMijV/5lmKo4I515xJM5AllQkYYFHhlEOySFx0iUchNCogEBHwMG8ufTAk6
 WwdxR8YPzzE2lxJk5UrVvliA1s/B4pwHtbZEOR29QnfOzQuKttaES+sWisZXHaEG
 Nqyy+xta4oiixdneeTEp8Dr4bpHBlUaO4gA/f/iJ0apJK+3KeNAuC57Xi5yBJS5p
 LtbUd+hRYNLQ5mUtRYIiOLLeiKnD62bYRSYgoRC0MOLZiT49G3Kr1BuyUqj2x0OW
 vLY/Tl4D2c+4aNwPwhsq5tSiTmQqZis9njT7FJL+OFpIHlKXANubZaMZnU+COM1V
 aa9Mye0=
 =IBfz
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "This was a relatively calm cycle, and most of changes are rather small
  device-specific fixes. Here are highlights:

  Core:
   - Further enhancements of ALSA rawmidi and sequencer APIs for MIDI
     2.0
   - compress-offload API extensions for ASRC support

  ASoC:
   - Allow clocking on each DAI in an audio graph card to be configured
     separately
   - Improved power management for Renesas RZ-SSI
   - KUnit testing for the Cirrus DSP framework
   - Memory to meory operation support for Freescale/NXP platforms
   - Support for pause operations in SOF
   - Support for Allwinner suinv F1C100s, Awinc AW88083, Realtek
     ALC5682I-VE

  HD- and USB-audio:
   - Add support for Focusrite Scarlett 4th Gen 16i16, 18i16, and 18i20
     interfaces via new FCP driver
   - TAS2781 SPI HD-audio sub-codec support
   - Various device-specific quirks as usual"

* tag 'sound-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (235 commits)
  ALSA: hda: tas2781-spi: Fix bogus error handling in tas2781_hda_spi_probe()
  ALSA: hda: tas2781-spi: Fix error code in tas2781_read_acpi()
  ALSA: hda: tas2781-spi: Delete some dead code
  ALSA: usb: fcp: Fix return code from poll ops
  ALSA: usb: fcp: Fix incorrect resp->opcode retrieval
  ALSA: usb: fcp: Fix meter_levels type to __le32
  ALSA: hda/realtek: Enable Mute LED on HP Laptop 14s-fq1xxx
  ALSA: hda: tas2781-spi: Fix -Wsometimes-uninitialized in tasdevice_spi_switch_book()
  ALSA: ctxfi: Simplify dao_clear_{left,right}_input() functions
  ALSA: hda: tas2781-spi: select CRC32 instead of CRC32_SARWATE
  ALSA: usb: fcp: Fix hwdep read ops types
  ALSA: scarlett2: Add device_setup option to use FCP driver
  ALSA: FCP: Add Focusrite Control Protocol driver
  ALSA: hda/tas2781: Add tas2781 hda SPI driver
  ALSA: hda/realtek - Fixed headphone distorted sound on Acer Aspire A115-31 laptop
  ASoC: xilinx: xlnx_spdif: Simpify using devm_clk_get_enabled()
  ALSA: hda: Support for Ideapad hotkey mute LEDs
  ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 83JX, 83MC and 83NM
  ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 83LC
  ASoC: dapm: add support for preparing streams
  ...
2025-01-24 07:54:34 -08:00
Linus Torvalds
454cb97726 This update includes the following changes:
API:
 
 - Remove physical address skcipher walking.
 - Fix boot-up self-test race.
 
 Algorithms:
 
 - Optimisations for x86/aes-gcm.
 - Optimisations for x86/aes-xts.
 - Remove VMAC.
 - Remove keywrap.
 
 Drivers:
 
 - Remove n2.
 
 Others:
 
 - Fixes for padata UAF.
 - Fix potential rhashtable deadlock by moving schedule_work outside lock.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmeSIvwACgkQxycdCkmx
 i6dkYw//bJ6OxIXdtsDWVtJF4GnfxLYSU33GGGMWrbwxS/EihL12rkB3JPw2avJb
 oFBP8rWl5Qv9tDF2gjn6TyBaydVnKMA9nUbsqKN6m/DZ/RcCpHigQ21HVzny3bhw
 rHsZcWoy14TXMuni1DhLnYPftbF+7qZ/pdT5WYr4MEchQhzQc6XWaS2T5by16bjn
 HHsPHNZj+kFDf4kKYab3jmnly8Qo0wpTMvuX1tsiUqt7YABcg3dobIisMPatxg8A
 CIgdBZJRivC55Cqm4JT7P+y63PsJVGCyoLXOAGoZN5CLwdTSGND12DJ1awEcOswc
 7fMlCk0gDrhniUTUzP8VsP8EUCezIIpaIfne9v/0OERo6DbiuX+NeEwxWJNdIHeS
 vZocY5a6hS84iBdsuPrUaPqZI6oUSYFIwKPJUwbyaY4j1cfowHz8zbgmmPO5TUV7
 NAI7/QpoMA3GNWn3p+64eeXekT2DcU5o3i14dbJ31FQhlFbzVWA7/2Z5ydu18Fex
 ntTEplPCzYrsqwuxmFDb/3dsk3Z98RquZZJzIKAXKSXTNBOYJaFOCTyugdkn18Nq
 p6dJNXEvl6lnjylgILa0ltv6TI8h7IRpuqi+FAqExOXR3H3gelVXUjMXnC0fmjrd
 +ARAzq223xPWwsKEd00Rb3FEoq0XyChvxh4n3BqM4XhSenWggOc=
 =/75o
 -----END PGP SIGNATURE-----

Merge tag 'v6.14-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Remove physical address skcipher walking
   - Fix boot-up self-test race

  Algorithms:
   - Optimisations for x86/aes-gcm
   - Optimisations for x86/aes-xts
   - Remove VMAC
   - Remove keywrap

  Drivers:
   - Remove n2

  Others:
   - Fixes for padata UAF
   - Fix potential rhashtable deadlock by moving schedule_work outside
     lock"

* tag 'v6.14-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (75 commits)
  rhashtable: Fix rhashtable_try_insert test
  dt-bindings: crypto: qcom,inline-crypto-engine: Document the SM8750 ICE
  dt-bindings: crypto: qcom,prng: Document SM8750 RNG
  dt-bindings: crypto: qcom-qce: Document the SM8750 crypto engine
  crypto: asymmetric_keys - Remove unused key_being_used_for[]
  padata: avoid UAF for reorder_work
  padata: fix UAF in padata_reorder
  padata: add pd get/put refcnt helper
  crypto: skcipher - call cond_resched() directly
  crypto: skcipher - optimize initializing skcipher_walk fields
  crypto: skcipher - clean up initialization of skcipher_walk::flags
  crypto: skcipher - fold skcipher_walk_skcipher() into skcipher_walk_virt()
  crypto: skcipher - remove redundant check for SKCIPHER_WALK_SLOW
  crypto: skcipher - remove redundant clamping to page size
  crypto: skcipher - remove unnecessary page alignment of bounce buffer
  crypto: skcipher - document skcipher_walk_done() and rename some vars
  crypto: omap - switch from scatter_walk to plain offset
  crypto: powerpc/p10-aes-gcm - simplify handling of linear associated data
  crypto: bcm - Drop unused setting of local 'ptr' variable
  crypto: hisilicon/qm - support new function communication
  ...
2025-01-24 07:48:10 -08:00
Linus Torvalds
ae2d4fc540 Hi,
Updates for the TPM driver (yep, only one patch).
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYKAB0WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZ5IEIgAKCRAaerohdGur
 0hXZAP0f6M8O1t9ZffklflNXJMdWk/i0p7NkU6tvlpOD6FnGzgD4+3nOLL5rtzMs
 bU5vhJvJX5g6H45a0z6YIXsO1Yi4Dw==
 =kpgF
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull TPM update from Jarkko Sakkinen.

* tag 'tpmdd-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Change to kvalloc() in eventlog/acpi.c
2025-01-24 07:46:33 -08:00
Linus Torvalds
68732c0bf9 pmdomain core:
- Add support for naming idlestates through DT
 
 pmdomain providers:
  - arm: Explicitly request the current state at init for the SCMI PM domain
  - mediatek: Add Airoha CPU PM Domain support for CPU frequency scaling
  - ti: Add per-device latency constraint management to the ti_sci PM domain
 
 cpuidle-psci:
  - Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmeSTf0XHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCmeThAAh7odiUJo0+nGiy4mdbMh2mik
 aF3hN419zZhVVNlmNBSVQhRIW2GFxMGX7FdaT+8yXmvhvE0cMbBiDtEyCySLmUOd
 43hXkboJ4T+7pFMsSJ7BSn3fzZNJb0135JZVQZY6UhrIFUsDOH4oSI350qllln61
 V294ABMNFHw6q4yD1m94Cqw7xraCMvZ3OxpK5p215uhDutHwA0BdOKUq/hcCojSi
 5hy1y3SIghhJS0Ug7NxoP+6I/tQg5T5LZV23YKRz5XZPSnvXoldjKEiKzNqBqmZu
 4E56M9MNDVhWz5CZt3aed/XvLW+X/uTgz63g81kDKnhV/HQXAp5iZrVsmnF2oaXl
 1Ls48fo25Vk6bVWZW3Z7tkgGfknjSYZhlSmFrbRxazsZfpvmwmXHJznnwQITyRia
 Ju8YnFA0kcbA18nNE9Jk8GI5pNt/GZyqZ5Z0t46Re6dYxmDxGLAlRL5/XoVbof+Y
 HIJzONTvJKap7qilYkmdqTWJF7CVMAWdsUwVb7fzHjoXegPNzZIPv8jEqbZNOWqh
 K4fYrx1OSDGLDwuQlTdK8FOgUoHwwiDOJb0G1UA2c6sUIMBCAOKyG2HpWIBB9CMK
 B5GNMwjV/9XQ7+xmA1hyysR2r6nipXaQPhREo/rLZa1GYL/jYy50jGJENo4M7gsM
 yBK5CwqVvWs1TEhYFog=
 =gGf+
 -----END PGP SIGNATURE-----

Merge tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add support for naming idlestates through DT

  pmdomain providers:
   - arm: Explicitly request the current state at init for the SCMI PM
     domain
   - mediatek: Add Airoha CPU PM Domain support for CPU frequency
     scaling
   - ti: Add per-device latency constraint management to the ti_sci PM
     domain

  cpuidle-psci:
   - Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP"

* tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode
  pmdomain: arm: scmi_pm_domain: Send an explicit request to set the current state
  pmdomain: airoha: Add Airoha CPU PM Domain support
  pmdomain: ti_sci: handle wake IRQs for IO daisy chain wakeups
  pmdomain: ti_sci: add wakeup constraint management
  pmdomain: ti_sci: add per-device latency constraint management
  pmdomain: imx-gpcv2: Suppress bind attrs
  pmdomain: imx8m[p]-blk-ctrl: Suppress bind attrs
  pmdomain: core: Support naming idle states
  dt-bindings: power: domain-idle-state: Allow idle-state-name
  cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
2025-01-24 07:43:35 -08:00
Linus Torvalds
b746043cb3 Pin control changes for the v6.14 kernel cycle:
No core changes this time
 
 New drivers:
 
 - New subdriver for the Qualcomm MSM8917 SoC TLMM
 
 - New subdriver for the Mediatek MT7988 SoC
 
 - New subdriver for the Rockchip RK3562 SoC
 
 - New subdriver for the Renesas RZ/G3E SoC
 
 Improvements:
 
 - Fix some missing pins in the Qualcomm IPQ5424 TLMM
 
 - Fix some missing LVDS pins in the Sunxi A100/A133
 
 - Support Sunxi V853 (simple compatible string)
 
 - Cleanups in the Samsung driver
 
 - Fix some AMD suspend behaviour
 
 - Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmeR+5IACgkQQRCzN7AZ
 XXNkTQ/7Blf3alkeaUTjTj3LYFfaky2UjV8i2QLtRiznaLsNitNzRAOq8VYWFKHl
 ysDB8Bheal0vmUEig5I+CboTq2cTt5M5abcQHccOzq29Ln0O9nhlEgds2VjqRgxc
 6rD5TtbeXISkaLBUQdOQ0CkfmoAq0ihH8krFT8gOP01iiiGdLMq3g+v0F6g+wgOX
 0YV6v10wDTsxyh0xlBlshcEt+wTe4EC85WsQf9EerCnSbAqMk32lGLRGeIkazZyX
 l0mTm/JuB7m1UKS2eaZiyOpikK7aq7NirtFTdb5EGa3KYwlwYueiI0AjqCkx//NF
 0txtrk83EU+ApQuP3cXt5mCm19BBpSbIMMuv5qmz1EiX3kBEK++NM7L00ERatHDV
 LXwWObedtkMwzLPmOSPW/RXg7LhnoO34HgbEjznJQ6vV06a9hCMyJZCcIXOrxIP+
 +M5HHbVNrEt5MpcG5Q0FCTALiEKd7cmkQDoZAM9tracl4g5NdZawi4A+PwMdX6UO
 YYnAMYEdJGlCEvcpKNjTebdHz1PDRThk/bLtNNEJ9X8k0inA2H6XIOKW2LFjfR9c
 jNVBxQyOc6o2SKTWdzJm6+F4b/Rzo7awZzXFvURSguCRnbiJ81hIOwooE/o5+IJF
 Fg1N+7cC5AQdtKCy4rh7Hy0H+132XIFPSdeknk5BxUMtXpSGjh8=
 =qLiC
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "No core changes this time

  New drivers:

   - New subdriver for the Qualcomm MSM8917 SoC TLMM

   - New subdriver for the Mediatek MT7988 SoC

   - New subdriver for the Rockchip RK3562 SoC

   - New subdriver for the Renesas RZ/G3E SoC

  Improvements:

   - Fix some missing pins in the Qualcomm IPQ5424 TLMM

   - Fix some missing LVDS pins in the Sunxi A100/A133

   - Support Sunxi V853 (simple compatible string)

   - Cleanups in the Samsung driver

   - Fix some AMD suspend behaviour

   - Cleanups"

* tag 'pinctrl-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (29 commits)
  dt-bindings: pinctrl: sunxi: add compatible for V853
  pinctrl: Use str_enable_disable-like helpers
  dt-bindings: pinctrl: Correct indentation and style in DTS example
  pinctrl: amd: Take suspend type into consideration which pins are non-wake
  pinctrl: stm32: Add check for clk_enable()
  pinctrl: renesas: rzg2l: Fix PFC_MASK for RZ/V2H and RZ/G3E
  pinctrl: sunxi: add missed lvds pins for a100/a133
  pinctrl: mediatek: Drop mtk_pinconf_bias_set_pd()
  pinctrl: renesas: rzg2l: Add support for RZ/G3E SoC
  pinctrl: renesas: rzg2l: Update r9a09g057_variable_pin_cfg table
  dt-bindings: pinctrl: renesas: Document RZ/G3E SoC
  dt-bindings: pinctrl: renesas: Add alpha-numerical port support for RZ/V2H
  pinctrl: rockchip: add rk3562 support
  dt-bindings: pinctrl: Add rk3562 pinctrl support
  pinctrl: Fix the clean up on pinconf_apply_setting failure
  dt-bindings: pinctrl: add binding for MT7988 SoC
  pinctrl: mediatek: add MT7988 pinctrl driver
  pinctrl: mediatek: add support for MTK_PULL_PD_TYPE
  pinctrl: ocelot: Constify some structures
  pinctrl: renesas: rzg2l: Add audio clock pins on RZ/G3S
  ...
2025-01-24 07:38:50 -08:00
Linus Torvalds
f1c243fc78 IOMMU Updates for Linux v6.14
Including:
 
 	- Core changes:
 	  - PASID support for the blocked_domain.
 
 	- ARM-SMMU Updates:
 	  - SMMUv2:
 	    * Implement per-client prefetcher configuration on Qualcomm SoCs.
 	    * Support for the Adreno SMMU on Qualcomm's SDM670 SOC.
 	  - SMMUv3:
 	    * Pretty-printing of event records.
 	    * Drop the ->domain_alloc_paging implementation in favour of
 	      ->domain_alloc_paging_flags(flags==0).
 	  - IO-PGTable:
 	    * Generalisation of the page-table walker to enable external walkers
 	      (e.g. for debugging unexpected page-faults from the GPU).
 	    * Minor fix for handling concatenated PGDs at stage-2 with 16KiB pages.
 	  - Misc:
 	    * Clean-up device probing and replace the crufty probe-deferral hack
 	      with a more robust implementation of arm_smmu_get_by_fwnode().
 	    * Device-tree binding updates for a bunch of Qualcomm platforms.
 
 	- Intel VT-d Updates:
 	  - Remove domain_alloc_paging().
 	  - Remove capability audit code.
 	  - Draining PRQ in sva unbind path when FPD bit set.
 	  - Link cache tags of same iommu unit together.
 
 	- AMD-Vi Updates:
 	  - Use CMPXCHG128 to update DTE.
 	  - Cleanups of the domain_alloc_paging() path.
 
 	- RiscV IOMMU:
 	  - Platform MSI support.
 	  - Shutdown support.
 
 	- Rockchip IOMMU:
 	  - Add DT bindings for Rockchip RK3576.
 
 	- More smaller fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmeQsnQACgkQK/BELZcB
 GuOHug/8DDIuKlUHU2+3U2kKxMb8o1kDxGPkfKgXBaxxpprvY9DARRv7N4mnF6Vu
 Db8P7QihXfQKnZHXEiqHE7TlA5B+IVQd3Kz96P4sY3OlVWGZYqyKv2GEHyG5CjN/
 bay7bfgeo2EVEiAio6VToFFWTm+oxFZzhoYFIlAyZAuIQUp17gHXf7YyhUwk4rOz
 8g0XMH6uldidID6BVpArxHh/bN9MOTdHzkyhwPF3FL8E94ziX6rWILH9ADYxBn2o
 wqHR1STxv398k62toPpWb78c2RdANI8treDXsYpCyDF87dygdP+SA0qkK3G6kAVA
 /IiPothAj6oNm+Gvwd04tEkuqVVrVqrmWE3VXSps33Tk+apYtcmLCtdpwY/F93D1
 EZwTVqveBKk2hSWYVDlEyj9XKmZ9dYWDGvg2Fx844gltQHoHtEgYL+azCMU/ry7k
 3+KlkUqFZZxUDbVBbbDdMF+NpxyZTtfKsaLB8f5laOP1R8Os3+dC69Gw6bhoxaMc
 xfL3v245V5kWSRy+w02TyZGmZSzRx0FKUbFLKpLOvZD6pfx8t8oqJTG9FwN1KzpL
 lvcBpPB5AUeNJQpKcVSjs/ne8WiOqdABt7ge4E9J5TwrDI8sXk0mwJaPrlDgK1V/
 0xkkLmxWnqsu04CyDay+Uv3Bls/rRBpikR/iGt9P3BbZs7Cyryw=
 =Xei0
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - PASID support for the blocked_domain

  ARM-SMMU Updates:
   - SMMUv2:
      - Implement per-client prefetcher configuration on Qualcomm SoCs
      - Support for the Adreno SMMU on Qualcomm's SDM670 SOC
   - SMMUv3:
      - Pretty-printing of event records
      - Drop the ->domain_alloc_paging implementation in favour of
        domain_alloc_paging_flags(flags==0)
   - IO-PGTable:
      - Generalisation of the page-table walker to enable external
        walkers (e.g. for debugging unexpected page-faults from the GPU)
      - Minor fix for handling concatenated PGDs at stage-2 with 16KiB
        pages
   - Misc:
      - Clean-up device probing and replace the crufty probe-deferral
        hack with a more robust implementation of
        arm_smmu_get_by_fwnode()
      - Device-tree binding updates for a bunch of Qualcomm platforms

  Intel VT-d Updates:
   - Remove domain_alloc_paging()
   - Remove capability audit code
   - Draining PRQ in sva unbind path when FPD bit set
   - Link cache tags of same iommu unit together

  AMD-Vi Updates:
   - Use CMPXCHG128 to update DTE
   - Cleanups of the domain_alloc_paging() path

  RiscV IOMMU:
   - Platform MSI support
   - Shutdown support

  Rockchip IOMMU:
   - Add DT bindings for Rockchip RK3576

  More smaller fixes and cleanups"

* tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (66 commits)
  iommu: Use str_enable_disable-like helpers
  iommu/amd: Fully decode all combinations of alloc_paging_flags
  iommu/amd: Move the nid to pdom_setup_pgtable()
  iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode
  iommu/amd: Remove type argument from do_iommu_domain_alloc() and related
  iommu/amd: Remove dev == NULL checks
  iommu/amd: Remove domain_alloc()
  iommu/amd: Remove unused amd_iommu_domain_update()
  iommu/riscv: Fixup compile warning
  iommu/arm-smmu-v3: Add missing #include of linux/string_choices.h
  iommu/arm-smmu-v3: Use str_read_write helper w/ logs
  iommu/io-pgtable-arm: Add way to debug pgtable walk
  iommu/io-pgtable-arm: Re-use the pgtable walk for iova_to_phys
  iommu/io-pgtable-arm: Make pgtable walker more generic
  iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500
  iommu/arm-smmu: Introduce ACTLR custom prefetcher settings
  iommu/arm-smmu: Add support for PRR bit setup
  iommu/arm-smmu: Refactor qcom_smmu structure to include single pointer
  iommu/arm-smmu: Re-enable context caching in smmu reset operation
  iommu/vt-d: Link cache tags of same iommu unit together
  ...
2025-01-24 07:33:46 -08:00