twx-linux

Author	SHA1	Message	Date
Guo Ren	eb87e56d65	riscv: xchg: Prefetch the destination word for sc.w The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch makes use of prefetch.w to prefetch cachelines for write prior to lr/sc loops when using the xchg_small atomic routine. This patch is inspired by commit `0ea366f5e1` ("arm64: atomics: prefetch the destination word for write prior to stxr"). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20231231082955.16516-4-guoren@kernel.org Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-5-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:39 -07:00
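The prefetch-before-exchange idea in the commit above can be sketched in portable C. This is only an illustration, not the kernel's xchg_small routine: `xchg_with_prefetchw` is a hypothetical helper, it operates on a full word, and the atomic builtin stands in for the lr.w/sc.w retry loop. It assumes a GCC- or Clang-compatible compiler; whether the hint becomes a real prefetch instruction depends on the target and compiler flags.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): hint that the cacheline holding
 * *ptr is about to be written, then exchange the value atomically.
 */
uint32_t xchg_with_prefetchw(uint32_t *ptr, uint32_t newval)
{
    /* Second argument 1 requests a prefetch for write (GCC/Clang builtin). */
    __builtin_prefetch(ptr, 1);

    /* Stand-in for the lr.w/sc.w loop described in the commit message. */
    return __atomic_exchange_n(ptr, newval, __ATOMIC_SEQ_CST);
}
```

The prefetch is purely a hint: the exchange returns the same result with or without it, so the only observable difference is the cost of acquiring the cacheline.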
Guo Ren	a5f947c731	riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop Enable Linux prefetch and prefetchw primitives using Zicbop. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20231231082955.16516-3-guoren@kernel.org Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-4-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:38 -07:00
Alexandre Ghiti	8d496b5a98	riscv: Add support for Zicbop Zicbop introduces cache blocks prefetching instructions, add the necessary support for the kernel to use it in the coming commits. Co-developed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@kernel.org> Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-3-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:37 -07:00
Alexandre Ghiti	f0f4e64b9e	riscv: Introduce Zicbop instructions The S-type instructions are first introduced and then used to define the encoding of the Zicbop prefetching instructions. Co-developed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@kernel.org> Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-2-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:36 -07:00
Yao Zi	850d7b14c8	riscv/kexec_file: Fix comment in purgatory relocator Apparently sec_base doesn't mean relocated symbol value, which seems a copy-pasting error in the comment. Assigned with the address of section indexed by sym->st_shndx, it should represent base address of the relevant section. Let's fix the comment to avoid possible confusion. Fixes: `838b3e2848` ("RISC-V: Load purgatory in kexec_file") Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250326073450.57648-2-ziyao@disroot.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:35 -07:00
Song Shuai	809a11eea8	riscv: kexec_file: Support loading Image binary file This patch creates image_kexec_ops to load Image binary file for kexec_file_load() syscall. Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250409193004.643839-3-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:34 -07:00
Song Shuai	1df45f8a9f	riscv: kexec_file: Split the loading of kernel and others This is the preparatory patch for kexec_file_load Image support. It splits elf_kexec_load() into two parts: - the first part loads the vmlinux (or Image) - the second part loads the other segments (e.g. initrd, fdt, purgatory) The second part is exported as the load_extra_segments() function, which will be used in both kexec-elf.c and kexec-image.c. No functional change intended. Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250409193004.643839-2-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:33 -07:00
Andy Chiu	d8ac85dad4	riscv: Documentation: add a description about dynamic ftrace Add a section in cmodx to describe how dynamic ftrace works on riscv, limitations, and assumptions. Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-12-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:32 -07:00
Andy Chiu	b21cdb9523	riscv: ftrace: support direct call using call_ops jump to FTRACE_ADDR if distance is out of reach Co-developed-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-11-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:31 -07:00
Puranjay Mohan	c217157bcd	riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V. This allows each ftrace callsite to provide an ftrace_ops to the common ftrace trampoline, allowing each callsite to invoke distinct tracer functions without the need to fall back to list processing or to allocate custom trampolines for each callsite. This significantly speeds up cases where multiple distinct trace functions are used and callsites are mostly traced by a single tracer. The idea and most of the implementation is taken from the ARM64's implementation of the same feature. The idea is to place a pointer to the ftrace_ops as a literal at a fixed offset from the function entry point, which can be recovered by the common ftrace trampoline. We use -fpatchable-function-entry to reserve 8 bytes above the function entry by emitting 2 4 byte or 4 2 byte nops depending on the presence of CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer to the associated ftrace_ops for that callsite. Functions are aligned to 8 bytes to make sure that the accesses to this literal are atomic. This approach allows for directly invoking ftrace_ops::func even for ftrace_ops which are dynamically-allocated (or part of a module), without going via ftrace_ops_list_func. 
We've benchmarked this with the ftrace_ops sample module on Spacemit K1 Jupiter:

Without this patch: baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29

+-----------------------+-----------------+----------------------------+
|   Number of tracers   | Total time (ns) |   Per-call average time    |
|-----------------------+-----------------+----------------------------|
| Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
|----------+------------+-----------------+------------+---------------|
|        0 |          0 |         1357958 |         13 |             - |
|        0 |          1 |         1302375 |         13 |             - |
|        0 |          2 |         1302375 |         13 |             - |
|        0 |         10 |         1379084 |         13 |             - |
|        0 |        100 |         1302458 |         13 |             - |
|        0 |        200 |         1302333 |         13 |             - |
|----------+------------+-----------------+------------+---------------|
|        1 |          0 |        13677833 |        136 |           123 |
|        1 |          1 |        18500916 |        185 |           172 |
|        1 |          2 |        22856459 |        228 |           215 |
|        1 |         10 |        58824709 |        588 |           575 |
|        1 |        100 |       505141584 |       5051 |          5038 |
|        1 |        200 |      1580473126 |      15804 |         15791 |
|----------+------------+-----------------+------------+---------------|
|        1 |          0 |        13561000 |        135 |           122 |
|        2 |          0 |        19707292 |        197 |           184 |
|       10 |          0 |        67774750 |        677 |           664 |
|      100 |          0 |       714123125 |       7141 |          7128 |
|      200 |          0 |      1918065668 |      19180 |         19167 |
+----------+------------+-----------------+------------+---------------+

Note: per-call overhead is estimated relative to the baseline case with 0 relevant tracers and 0 irrelevant tracers.
With this patch: v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29

+-----------------------+-----------------+----------------------------+
|   Number of tracers   | Total time (ns) |   Per-call average time    |
|-----------------------+-----------------+----------------------------|
| Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
|----------+------------+-----------------+------------+---------------|
|        0 |          0 |         1459917 |         14 |             - |
|        0 |          1 |         1408000 |         14 |             - |
|        0 |          2 |         1383792 |         13 |             - |
|        0 |         10 |         1430709 |         14 |             - |
|        0 |        100 |         1383791 |         13 |             - |
|        0 |        200 |         1383750 |         13 |             - |
|----------+------------+-----------------+------------+---------------|
|        1 |          0 |         5238041 |         52 |            38 |
|        1 |          1 |         5228542 |         52 |            38 |
|        1 |          2 |         5325917 |         53 |            40 |
|        1 |         10 |         5299667 |         52 |            38 |
|        1 |        100 |         5245250 |         52 |            39 |
|        1 |        200 |         5238459 |         52 |            39 |
|----------+------------+-----------------+------------+---------------|
|        1 |          0 |         5239083 |         52 |            38 |
|        2 |          0 |        19449417 |        194 |           181 |
|       10 |          0 |        67718584 |        677 |           663 |
|      100 |          0 |       709840708 |       7098 |          7085 |
|      200 |          0 |      2203580626 |      22035 |         22022 |
+----------+------------+-----------------+------------+---------------+

Note: per-call overhead is estimated relative to the baseline case with 0 relevant tracers and 0 irrelevant tracers.

As can be seen from the above:

a) Whenever there is a single relevant tracer function associated with a tracee, the overhead of invoking the tracer is constant, and does not scale with the number of tracers which are not associated with that tracee.

b) The overhead for a single relevant tracer has dropped to ~1/3 of the overhead prior to this series (from 122ns to 38ns). This is largely due to permitting calls to dynamically-allocated ftrace_ops without going through ftrace_ops_list_func.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> [update kconfig, asm, refactor] Signed-off-by: Andy Chiu <andybnac@gmail.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-10-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:30 -07:00
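The "literal at a fixed offset from the function entry point" idea in the commit above can be modelled in plain C. This is a user-space sketch under stated assumptions, not the kernel trampoline: `struct tracer_ops`, `struct callsite` and `ops_from_entry` are hypothetical names, the byte array merely stands in for a function body, and the slot size is `sizeof(void *)` rather than a hard-coded 8 bytes.

```c
#include <string.h>

/* Hypothetical stand-in for struct ftrace_ops. */
struct tracer_ops {
    int id;
};

/*
 * Hypothetical callsite layout: the ops literal sits directly before the
 * entry point, so it can be found from the entry address alone.
 */
struct callsite {
    const struct tracer_ops *ops;  /* slot patched at run time */
    unsigned char entry[16];       /* stands in for the function body */
};

/* Recover the ops pointer by stepping back from the entry address. */
const struct tracer_ops *ops_from_entry(const void *entry)
{
    const struct tracer_ops *ops;

    /* Same pointer arithmetic as a container_of()-style lookup. */
    memcpy(&ops, (const unsigned char *)entry - sizeof(ops), sizeof(ops));
    return ops;
}
```

Because the lookup needs nothing but the entry address, a single shared trampoline can serve every callsite, which is consistent with the flat 38-40ns overhead in the "1 relevant tracer" rows of the second table.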
Andy Chiu	d0262e907e	riscv: ftrace: support PREEMPT Now, we can safely enable dynamic ftrace with kernel preemption. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-9-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:29 -07:00
Andy Chiu	ca358692de	riscv: add a data fence for CMODX in the kernel mode RISC-V spec explicitly calls out that a local fence.i is not enough for the code modification to be visible from a remote hart. In fact, it states: "To make a store to instruction memory visible to all RISC-V harts, the writing hart also has to execute a data FENCE before requesting that all remote RISC-V harts execute a FENCE.I." Although current riscv drivers for IPI use ordered MMIO when sending IPIs in order to synchronize the action between previous csd writes, riscv does not restrict itself to any particular flavor of IPI. Any driver or firmware implementation that does not order data writes before the IPI may pose a risk of a code-modifying race. Thus, add a fence here to order data writes before making the IPI. Signed-off-by: Andy Chiu <andybnac@gmail.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-8-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:28 -07:00
Andy Chiu	d1049fc0de	riscv: vector: Support calling schedule() for preemptible Vector Each function entry implies a call to ftrace infrastructure. And it may call into schedule in some cases. So, it is possible for preemptible kernel-mode Vector to implicitly call into schedule. Since all V-regs are caller-saved, it is possible to drop all V context when a thread voluntarily calls schedule(). Besides, we currently don't pass arguments through vector registers, so we don't have to save/restore V-regs in ftrace trampoline. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20250407180838.42877-7-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:27 -07:00
Andy Chiu	5aa4ef9558	riscv: ftrace: do not use stop_machine to update code Now it is safe to remove dependency from stop_machine() for us to patch code in ftrace. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20250407180838.42877-6-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:26 -07:00
Andy Chiu	b2137c3b6d	riscv: ftrace: prepare ftrace for atomic code patching We use an AUIPC+JALR pair to jump into a ftrace trampoline. Since instruction fetch can be broken down to 4 bytes at a time, it is impossible to update two instructions without a race. In order to mitigate it, we initialize the patchable entry to AUIPC + NOP4. Then, the run-time code patching can change NOP4 to JALR to enable/disable ftrace from a function. This limits the reach of each ftrace entry to a +-2KB displacement from ftrace_caller. Starting from the trampoline, we add a level of indirection for it to reach the ftrace caller target. Now, it loads the target address from a memory location, then performs the jump. This enables the kernel to update the target atomically. The new don't-stop-the-world text patching changes only one RISC-V instruction:
| -8: &ftrace_ops of the associated tracer function.
| <ftrace enable>:
|  0: auipc t0, hi(ftrace_caller)
|  4: jalr t0, lo(ftrace_caller)
|
| -8: &ftrace_nop_ops
| <ftrace disable>:
|  0: auipc t0, hi(ftrace_caller)
|  4: nop
This means that f+0x0 is fixed, and should not be claimed by ftrace, e.g. kprobe should be able to put a probe in f+0x0. Thus, we adjust the offset and MCOUNT_INSN_SIZE accordingly. [ alex: Fix build errors with !CONFIG_DYNAMIC_FTRACE ] Co-developed-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20250407180838.42877-5-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:25 -07:00
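The +-2KB limit mentioned in the commit above follows from how a PC-relative offset is divided between the two instructions: AUIPC carries the upper 20 bits and JALR a signed 12-bit immediate. A minimal sketch of that split is shown here; `split_offset` and `reachable_with_fixed_auipc` are hypothetical helpers written for illustration, not kernel functions.

```c
#include <stdint.h>

/* Immediates for an AUIPC+JALR pair (hypothetical illustration type). */
struct auipc_jalr_imm {
    int64_t hi20;  /* AUIPC immediate, in units of 4 KiB */
    int64_t lo12;  /* JALR immediate, always in [-2048, 2047] */
};

/* Split a PC-relative byte offset into its AUIPC and JALR parts. */
struct auipc_jalr_imm split_offset(int64_t offset)
{
    struct auipc_jalr_imm imm;
    int64_t t = offset + 0x800;  /* round to the nearest 4 KiB multiple */

    imm.hi20 = t / 4096;
    if (t % 4096 < 0)            /* C division truncates; make it a floor */
        imm.hi20--;
    imm.lo12 = offset - imm.hi20 * 4096;
    return imm;
}

/* With the AUIPC left untouched, only targets sharing its hi20 are reachable. */
int reachable_with_fixed_auipc(int64_t old_offset, int64_t new_offset)
{
    return split_offset(old_offset).hi20 == split_offset(new_offset).hi20;
}
```

Since only the second instruction is ever rewritten, any new target has to fall inside the 4 KiB window selected by the existing AUIPC, that is, within -2048..+2047 bytes of the AUIPC result.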
Andy Chiu	500e626c4a	kernel: ftrace: export ftrace_sync_ipi The following ftrace patch for riscv uses a data store to update the ftrace function. Therefore, a remote fence is required to order it against function_trace_op updates. The mechanism is similar to the fence between function_trace_op and update_ftrace_func in the generic ftrace, so we leverage the same ftrace_sync_ipi function. [ alex: Fix build warning when !CONFIG_DYNAMIC_FTRACE ] Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-4-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:24 -07:00
Andy Chiu	c41bf4326c	riscv: ftrace: align patchable functions to 4 Byte boundary We are changing ftrace code patching in order to remove the dependency on stop_machine() and enable kernel preemption. This requires us to align function entries to a 4-byte boundary. However, -falign-functions alone on older versions of GCC was not strong enough to align all functions. In fact, cold functions are not aligned after turning on optimizations. We consider this a bug in GCC and turn off guess-branch-probability as a workaround to align all functions. GCC bug id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 The option -fmin-function-alignment is able to align all functions properly on newer versions of gcc. So, we add a cc-option to test if the toolchain supports it. Suggested-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-3-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:23 -07:00
Andy Chiu	54ecbc8d85	riscv: ftrace factor out code defined by !WITH_ARG DYNAMIC_FTRACE selects DYNAMIC_FTRACE_WITH_ARGS and mcount-dyn.S in riscv, so we can remove ifdef jargons of WITH_ARG when it is known that DYNAMIC_FTRACE is true. Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-2-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:22 -07:00
Andy Chiu	f8693f6dff	riscv: ftrace: support fastcc in Clang for WITH_ARGS Some caller-saved registers which are not defined as function arguments in the ABI can still be passed as arguments when the kernel is compiled with Clang. As a result, we must save and restore those registers to prevent ftrace from clobbering them. - [1]: https://reviews.llvm.org/D68559 Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Closes: https://lore.kernel.org/linux-riscv/7e7c7914-445d-426d-89a0-59a9199c45b1@yadro.com/ Fixes: `7caa976546` ("ftrace: riscv: move from REGS to ARGS") Acked-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-1-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-05 11:09:21 -07:00
Maciej Patelczyk	7c7c5cb5b5	drm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init() There is unmatched xe_vm_unlock() in the __xe_exec_queue_init(). Leftover from commit `fbeaad071a` ("drm/xe: Create LRC BO without VM") Fixes: `2b0a0ce0c2` ("drm/xe: Create LRC BO without VM") Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://lore.kernel.org/r/20250530135627.2821612-1-maciej.patelczyk@intel.com (cherry picked from commit `28b996ce73`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:55:46 +02:00
Niranjana Vishwanathapura	2b0a0ce0c2	drm/xe: Create LRC BO without VM Specifying VM during lrc->bo creation requires VM's reference to be held for the lifetime of lrc->bo as it will use VM's dma reservation object. Using VM's dma reservation object for lrc->bo doesn't provide any advantage. Hence do not pass VM while creating lrc->bo. v2: Use xe_bo_unpin_map_no_vm (Matthew Brost) Fixes: `264eecdba2` ("drm/xe: Decouple xe_exec_queue and xe_lrc") Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250529052031.2429120-2-niranjana.vishwanathapura@intel.com (cherry picked from commit `fbeaad071a`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:55:28 +02:00
Matthew Auld	2e824747cf	drm/xe/guc_submit: add back fix Daniele noticed that the fix in commit `2d2be279f1` ("drm/xe: fix UAF around queue destruction") looks to have been unintentionally removed as part of handling a conflict in some past merge commit. Add it back. Fixes: `ac44ff7cec` ("Merge tag 'drm-xe-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes") Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250603174213.1543579-2-matthew.auld@intel.com (cherry picked from commit `9d9fca62dc`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:08:10 +02:00
Daniele Ceraolo Spurio	69a58ef4fa	drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready The expected flow of operations when using PXP is to query the PXP status and wait for it to transition to "ready" before attempting to create an exec_queue. This flow is followed by the Mesa driver, but there is no guarantee that an incorrectly coded (or malicious) app will not attempt to create the queue first without querying the status. Therefore, we need to clarify what the expected behavior of the queue creation ioctl is in this scenario. Currently, the ioctl always fails with an -EBUSY code no matter the error, but for consistency it is better to distinguish between "failed to init" (-EIO) and "not ready" (-EBUSY), the same way the query ioctl does. Note that, while this is a change in the return code of an ioctl, the behavior of the ioctl in this particular corner case was not clearly spec'd, so no one should have been relying on it (and we know that Mesa, which is the only known userspace for this, didn't). v2: Minor rework of the doc (Rodrigo) Fixes: `72d479601d` ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-7-daniele.ceraolospurio@intel.com (cherry picked from commit `21784ca960`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:07:33 +02:00
Daniele Ceraolo Spurio	6bf4d56492	drm/xe/pxp: Use the correct define in the set_property_funcs array The define of the extension type was accidentally used instead of the one of the property itself. They're both zero, so no functional issue, but we should use the correct define for code correctness. Fixes: `41a97c4a12` ("drm/xe/pxp/uapi: Add API to mark a BO as using PXP") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-6-daniele.ceraolospurio@intel.com (cherry picked from commit `1d891ee820`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:07:25 +02:00
Matthew Auld	0ee54d5cac	drm/xe/sched: stop re-submitting signalled jobs Customer is reporting a really subtle issue where we get random DMAR faults, hangs and other nasties for kernel migration jobs when stressing stuff like s2idle/s3/s4. The explosions seem to happen somewhere after resuming the system with splats looking something like: PM: suspend exit rfkill: input handler disabled xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0 xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1] xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out The likely cause appears to be a race between suspend cancelling the worker that processes the free_job()'s, such that we still have pending jobs to be freed after the cancel. Following from this, on resume the pending_list will now contain at least one already complete job, but it looks like we call drm_sched_resubmit_jobs(), which will then call run_job() on everything still on the pending_list. But if the job was already complete, then all the resources tied to the job, like the bb itself, any memory that is being accessed, the iommu mappings etc. might be long gone since those are usually tied to the fence signalling. This scenario can be seen in ftrace when running a slightly modified xe_pm IGT (kernel was only modified to inject artificial latency into free_job to make the race easier to hit): xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ... 
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13 xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4 xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3 xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3 xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3 xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13 xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ... ..... xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13 So the job_run() is clearly triggered twice for the same job, even though the first must have already signalled to completion during suspend. We can also see a CAT error after the re-submit. To prevent this only resubmit jobs on the pending_list that have not yet signalled. v2: - Make sure to re-arm the fence callbacks with sched_start(). v3 (Matt B): - Stop using drm_sched_resubmit_jobs(), which appears to be deprecated and just open-code a simple loop such that we skip calling run_job() on anything already signalled. 
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856 Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: William Tseng <william.tseng@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20250528113328.289392-2-matthew.auld@intel.com (cherry picked from commit `38fafa9f39`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:07:15 +02:00
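The fix described in the commit above (skip anything already signalled when walking the pending list) can be sketched as a simple loop. The types and names here are hypothetical stand-ins for illustration, not the xe or drm_sched data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler job. */
struct job {
    bool signalled;  /* fence already signalled: its resources may be gone */
    int run_count;   /* number of times run_job() was invoked on this job */
};

/* Hypothetical stand-in for the backend's run_job() hook. */
void run_job(struct job *job)
{
    job->run_count++;
}

/* Resubmit only jobs that have not completed; return how many were run. */
size_t resubmit_pending(struct job *pending, size_t n)
{
    size_t resubmitted = 0;

    for (size_t i = 0; i < n; i++) {
        if (pending[i].signalled)
            continue;  /* completed before suspend: must not run again */
        run_job(&pending[i]);
        resubmitted++;
    }
    return resubmitted;
}
```

In this model, a job that signalled during suspend keeps its original run count after resume, which is the property the double xe_sched_job_run trace above shows being violated.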
Thomas Hellström	5cc3325584	drm/xe: Rework eviction rejection of bound external bos For preempt_fence mode VM's we're rejecting eviction of shared bos during VM_BIND. However, since we do this in the move() callback, we're getting an eviction failure warning from TTM. The TTM callback intended for these things is eviction_valuable(). However, the latter doesn't pass in the struct ttm_operation_ctx needed to determine whether the caller needs this. Instead, attach the needed information to the vm under the vm->resv, until we've been able to update TTM to provide the needed information. And add sufficient lockdep checks to prevent misuse and races. v2: - Fix a copy-paste error in xe_vm_clear_validating() v3: - Fix kerneldoc errors. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: `0af944f0e3` ("drm/xe: Reject BO eviction if BO is bound to current VM") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com (cherry picked from commit `9d5558649f`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:07:07 +02:00
Arnd Bergmann	2182f358fb	drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency The XE driver can be built with or without VSEC support, but fails to link as built-in if vsec is in a loadable module: x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init': (.text+0x1e83e16): undefined reference to `intel_vsec_register' The normal fix for this is to add a 'depends on INTEL_VSEC || !INTEL_VSEC', forcing XE to be a loadable module as well, but that causes a circular dependency: symbol DRM_XE depends on INTEL_VSEC symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES symbol X86_PLATFORM_DEVICES is selected by DRM_XE The problem here is selecting a symbol from another subsystem, so change that as well and rephrase the 'select' into the corresponding dependency. Since X86_PLATFORM_DEVICES is 'default y', there is no change to defconfig builds here. Fixes: `0c45e76fcc` ("drm/xe/vsec: Support BMG devices") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `e4931f8be3`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:06:59 +02:00
Raag Jadav	9411082792	drm/xe: drop redundant conversion to bool The result of integer comparison already evaluates to bool. No need for explicit conversion. No functional impact. Fixes: `0e414bf7ad` ("drm/xe: Expose PCIe link downgrade attributes") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/ Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `61761a6b57`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:06:01 +02:00
Karthik Poosa	b885ae2e9d	drm/xe/hwmon: Move card reactive critical power under channel card Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this represents the entire card critical power/current. v2: Update the date of curr1_crit also in hwmon documentation. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: `345dadc4f6` ("drm/xe/hwmon: Add infra to support card power and energy attributes") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `25e963a09e`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:05:54 +02:00
Karthik Poosa	25a2aa779f	drm/xe/hwmon: Add support to manage power limits though mailbox Add support to manage power limits using pcode mailbox commands for supported platforms. v2: - Address review comments. (Badal) - Use mailbox commands instead of registers to manage power limits for BMG. - Clamp the maximum power limit to GPU firmware default value. v3: - Clamp power limit in write also for platforms with mailbox support. v4: - Remove unnecessary debug prints. (Badal) v5: - Update description of variable pl1_on_boot to fix kernel-doc error. v6: - Improve commit message, refer to BIOS as GPU firmware. - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW. - Rectify drm_warn to drm_info. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: `e90f7a58e6` ("drm/xe/hwmon: Add HWMON support for BMG") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `7596d839f6`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:05:44 +02:00
Matthew Auld	8cf8cde41a	drm/xe/vm: move xe_svm_init() earlier In xe_vm_close_and_put() we need to be able to call xe_svm_fini(), however during vm creation we can call this on the error path, before having actually initialised the svm state, leading to various splats followed by a fatal NPD. Fixes: `6fd979c2f3` ("drm/xe: Add SVM init / close / fini to faulting VMs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com (cherry picked from commit `4f296d77cf`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:05:19 +02:00
Matthew Auld	a63e99b4d6	drm/xe/vm: move rebind_work init earlier In xe_vm_close_and_put() we need to be able to call flush_work(rebind_work), however during vm creation we can call this on the error path, before having actually set up the worker, leading to a splat from flush_work(). It looks like we can simply move the worker init step earlier to fix this. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com (cherry picked from commit `96af397aa1`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-06-05 18:05:10 +02:00
Linus Torvalds	7fdaba9129	Merge tag 'rtc-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "There are two new drivers this cycle. There is also support for a negative offset for RTCs that have been shipped with a date set using an epoch that is before 1970. This unfortunately happens with some products that ship with a vendor kernel and an out of tree driver. Core: - support negative offsets for RTCs that have shipped with an epoch earlier than 1970 New drivers: - NXP S32G2/S32G3 - Sophgo CV1800 Drivers: - loongson: fix missing alarm notifications for ACPI - m41t80: kickstart ocillator upon failure - mt6359: mt6357 support - pcf8563: fix wrong alarm register - sh: cleanups" * tag 'rtc-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (39 commits) rtc: mt6359: Add mt6357 support rtc: test: Test date conversion for dates starting in 1900 rtc: test: Also test time and wday outcome of rtc_time64_to_tm() rtc: test: Emit the seconds-since-1970 value instead of days-since-1970 rtc: Fix offset calculation for .start_secs < 0 rtc: Make rtc_time64_to_tm() support dates before 1970 rtc: pcf8563: fix wrong alarm register rtc: rzn1: support input frequencies other than 32768Hz rtc: rzn1: Disable controller before initialization dt-bindings: rtc: rzn1: add optional second clock rtc: m41t80: reduce verbosity rtc: m41t80: kickstart ocillator upon failure rtc: s32g: add NXP S32G2/S32G3 SoC support dt-bindings: rtc: add schema for NXP S32G2/S32G3 SoCs dt-bindings: at91rm9260-rtt: add microchip,sama7d65-rtt dt-bindings: rtc: at91rm9200: add microchip,sama7d65-rtc rtc: loongson: Add missing alarm notifications for ACPI RTC events rtc: sophgo: add rtc support for Sophgo CV1800 SoC rtc: stm32: drop unused module alias rtc: s3c: drop unused module alias ...	2025-06-05 08:54:47 -07:00
Thomas Zimmermann	f670b50ef5	sysfb: Fix screen_info type check for VGA Use the helper screen_info_video_type() to get the framebuffer type from struct screen_info. Handle supported values in sorted switch statement. Reading orig_video_isVGA is unreliable. On most systems it is a VIDEO_TYPE_ constant. On some systems with VGA it is simply set to 1 to signal the presence of a VGA output. See vga_probe() for an example. Retrieving the screen_info type with the helper screen_info_video_type() detects these cases and returns the appropriate VIDEO_TYPE_ constant. For VGA, sysfb creates a device named "vga-framebuffer". The sysfb code has been taken from vga16fb, where it likely didn't work correctly either. With this bugfix applied, vga16fb loads for compatible vga-framebuffer devices. Fixes: `0db5b61e0d` ("fbdev/vga16fb: Create EGA/VGA devices in sysfb code") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Tzung-Bi Shih <tzungbi@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: "Uwe Kleine-König" <u.kleine-koenig@baylibre.com> Cc: Zsolt Kajtar <soci@c64.rulez.org> Cc: <stable@vger.kernel.org> # v6.1+ Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/r/20250603154838.401882-1-tzimmermann@suse.de	2025-06-05 17:54:31 +02:00
Thomas Zimmermann	2f29b5c231	video: screen_info: Relocate framebuffers behind PCI bridges Apply PCI host-bridge window offsets to screen_info framebuffers. Fixes invalid access to I/O memory. Resources behind a PCI host bridge can be relocated by a certain offset in the kernel's CPU address range used for I/O. The framebuffer memory range stored in screen_info refers to the CPU addresses as seen during boot (where the offset is 0). During boot up, firmware may assign a different memory offset to the PCI host bridge and thereby relocating the framebuffer address of the PCI graphics device as seen by the kernel. The information in screen_info must be updated as well. The helper pcibios_bus_to_resource() performs the relocation of the screen_info's framebuffer resource (given in PCI bus addresses). The result matches the I/O-memory resource of the PCI graphics device (given in CPU addresses). As before, we store away the information necessary to later update the information in screen_info itself. Commit `78aa89d1df` ("firmware/sysfb: Update screen_info for relocated EFI framebuffers") added the code for updating screen_info. It is based on similar functionality that pre-existed in efifb. Efifb uses a pointer to the PCI resource, while the newer code does a memcpy of the region. Hence efifb sees any updates to the PCI resource and avoids the issue. v3: - Only use struct pci_bus_region for PCI bus addresses (Bjorn) - Clarify address semantics in commit messages and comments (Bjorn) v2: - Fixed tags (Takashi, Ivan) - Updated information on efifb Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reported-by: "Ivan T. Ivanov" <iivanov@suse.de> Closes: https://bugzilla.suse.com/show_bug.cgi?id=1240696 Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Fixes: `78aa89d1df` ("firmware/sysfb: Update screen_info for relocated EFI framebuffers") Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.9+ Link: https://lore.kernel.org/r/20250528080234.7380-1-tzimmermann@suse.de	2025-06-05 17:54:06 +02:00
Linus Torvalds	bfdf35c5dc	Merge tag 'dmaengine-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "A fairly small update for the dmaengine subsystem. This has a new ARM dmaengine driver and couple of new device support and few driver changes: New support: - Renesas RZ/V2H(P) dma support for r9a09g057 - Arm DMA-350 driver - Tegra Tegra264 ADMA support Updates: - AMD ptdma driver code removal and optimizations - Freescale edma error interrupt handler support" * tag 'dmaengine-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (27 commits) dmaengine: idxd: Remove unused pointer and macro arm64: dts: renesas: r9a09g057: Add DMAC nodes dmaengine: sh: rz-dmac: Add RZ/V2H(P) support dmaengine: sh: rz-dmac: Allow for multiple DMACs irqchip/renesas-rzv2h: Add rzv2h_icu_register_dma_req() dt-bindings: dma: rz-dmac: Document RZ/V2H(P) family of SoCs dt-bindings: dma: rz-dmac: Restrict properties for RZ/A1H dmaengine: idxd: Narrow the restriction on BATCH to ver. 1 only dmaengine: ti: Add NULL check in udma_probe() fsldma: Set correct dma_mask based on hw capability dmaengine: idxd: Check availability of workqueue allocated by idxd wq driver before using dmaengine: xilinx_dma: Set dma_device directions dmaengine: tegra210-adma: Add Tegra264 support dt-bindings: Document Tegra264 ADMA support dmaengine: dw-edma: Add HDMA NATIVE map check dmaegnine: fsl-edma: add edma error interrupt handler dt-bindings: dma: fsl-edma: increase maxItems of interrupts and interrupt-names dmaengine: ARM_DMA350 should depend on ARM/ARM64 dt-bindings: dma: qcom,bam: Document dma-coherent property dmaengine: Add Arm DMA-350 driver ...	2025-06-05 08:49:30 -07:00
Steve French	8e9d6efccd	cifs: update internal version number to 2.55 Signed-off-by: Steve French <stfrench@microsoft.com>	2025-06-05 10:21:17 -05:00
Paulo Alcantara	e889a450a6	MAINTAINERS, mailmap: Update Paulo Alcantara's email address Update my email address in MAINTAINERS and .mailmap files. Signed-off-by: Paulo Alcantara <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-06-05 10:21:14 -05:00
Meetakshi Setiya	1c6bbc45d8	cifs: add documentation for smbdirect setup Document steps to use SMB over RDMA using the linux SMB client and KSMBD server Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-06-05 10:20:48 -05:00
Linus Torvalds	d12ed2b7e1	Merge tag 'phy-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "As usual featuring couple of new driver and bunch of new device support and some driver changes to Freescale, rockchip driver along with couple of yaml binding conversions. New Support: - Qualcomm IPQ5424 qusb2 support, IPQ5018 uniphy-pcie driver - Rockchip usb2 support for RK3562, RK3036 usb2 phy support - Samsung exynos2200 eusb2 phy support and driver refactoring for this support, exynos7870 USBDRD support - Mediatek MT7988 xs-phy support - Broadcom BCM74110 usb phy support - Renesas RZ/V2H(P) usb2 phy support Updates: - Freescale phy rate claculation updates, i.MX95 tuning support - Better error handling for amlogic pcie phy - Rockchip color depth configuration and management support - Yaml binding conversion for RK3399 Type-C and PCIe Phy" * tag 'phy-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (77 commits) phy: tegra: p2u: Broaden architecture dependency phy: rockchip: inno-usb2: Add usb2 phy support for rk3562 dt-bindings: phy: rockchip,inno-usb2phy: add rk3562 phy: rockchip: inno-usb2: add phy definition for rk3036 dt-bindings: phy: rockchip,inno-usb2phy: add rk3036 compatible phy: freescale: fsl-samsung-hdmi: Improve LUT search for best clock phy: freescale: fsl-samsung-hdmi: Refactor finding PHY settings phy: freescale: fsl-samsung-hdmi: Rename phy_clk_round_rate phy: renesas: phy-rcar-gen3-usb2: Add USB2.0 PHY support for RZ/V2H(P) phy: renesas: phy-rcar-gen3-usb2: Sort compatible entries by SoC part number dt-bindings: phy: renesas,usb2-phy: Document RZ/V2H(P) SoC dt-bindings: phy: renesas,usb2-phy: Add clock constraint for RZ/G2L family phy: exynos5-usbdrd: support Exynos USBDRD 3.2 4nm controller phy: phy-snps-eusb2: add support for exynos2200 phy: phy-snps-eusb2: refactor reference clock init phy: phy-snps-eusb2: make reset control optional phy: phy-snps-eusb2: make repeater optional phy: phy-snps-eusb2: split phy init code phy: phy-snps-eusb2: refactor constructs names phy: move phy-qcom-snps-eusb2 out of its vendor sub-directory ...	2025-06-05 08:20:21 -07:00
Linus Torvalds	a479ebb269	Merge tag 'soundwire-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire updates from Vinod Koul: "A couple of small core changes and an Intel driver change: - sdw_assign_device_num() logic simplification, using internal slave id for irqs and optimizing computing of port params in specific stream states - Intel driver updates for ACE3+ microphone privacy status reporting and enabling the status in HDA Intel driver" * tag 'soundwire-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: only compute port params in specific stream states ASoC: SOF: Intel: hda: Set the mic_privacy flag for soundwire with ACE3+ soundwire: intel: Add awareness of ACE3+ microphone privacy soundwire: bus: Add internal slave ID and use for IRQs soundwire: bus: Simplify sdw_assign_device_num()	2025-06-05 08:07:24 -07:00
Eric Dumazet	3cae906e1a	calipso: unlock rcu before returning -EAFNOSUPPORT syzbot reported that a recent patch forgot to unlock rcu in the error path. Adopt the convention that netlbl_conn_setattr() is already using. Fixes: `6e9f2df1c5` ("calipso: Don't call calipso functions for AF_INET sk.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://patch.msgid.link/20250604133826.1667664-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:03:38 -07:00
Ido Schimmel	7632fedb26	seg6: Fix validation of nexthop addresses The kernel currently validates that the length of the provided nexthop address does not exceed the specified length. This can lead to the kernel reading uninitialized memory if user space provided a shorter length than the specified one. Fix by validating that the provided length exactly matches the specified one. Fixes: `d1df6fd8a1` ("ipv6: sr: define core operations for seg6local lightweight tunnel") Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250604113252.371528-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:03:17 -07:00
Eric Dumazet	feafc73f3e	net: prevent a NULL deref in rtnl_create_link() At the time rtnl_create_link() is running, dev->netdev_ops is NULL, we must not use netdev_lock_ops() or risk a NULL deref if CONFIG_NET_SHAPER is defined. Use netif_set_group() instead of dev_set_group(). RIP: 0010:netdev_need_ops_lock include/net/netdev_lock.h:33 [inline] RIP: 0010:netdev_lock_ops include/net/netdev_lock.h:41 [inline] RIP: 0010:dev_set_group+0xc0/0x230 net/core/dev_api.c:82 Call Trace: <TASK> rtnl_create_link+0x748/0xd10 net/core/rtnetlink.c:3674 rtnl_newlink_create+0x25c/0xb00 net/core/rtnetlink.c:3813 __rtnl_newlink net/core/rtnetlink.c:3940 [inline] rtnl_newlink+0x16d6/0x1c70 net/core/rtnetlink.c:4055 rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6944 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2534 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:712 [inline] Reported-by: syzbot+9fc858ba0312b42b577e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6840265f.a00a0220.d4325.0009.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `7e4d784f58` ("net: hold netdev instance lock during rtnetlink operations") Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250604105815.1516973-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:03:00 -07:00
Eric Dumazet	535caaca92	net: annotate data-races around cleanup_net_task from_cleanup_net() reads cleanup_net_task locklessly. Add READ_ONCE()/WRITE_ONCE() annotations to avoid a potential KCSAN warning, even if the race is harmless. Fixes: `0734d7c3d9` ("net: expedite synchronize_net() for cleanup_net()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250604093928.1323333-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:02:26 -07:00
Jakub Kicinski	e6854be4d8	selftests: drv-net: tso: make bkg() wait for socat to quit Commit `846742f7e3` ("selftests: drv-net: add a warning for bkg + shell + terminate") added a warning for bkg() used with terminate=True. The tso test was missed as we didn't have it running anywhere in NIPA. Add exit_wait=True, to avoid: # Warning: combining shell and terminate is risky! # SIGTERM may not reach the child on zsh/ksh! getting printed twice for every variant. Fixes: `0d0f4174f6` ("selftests: drv-net: add a simple TSO test") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250604012055.891431-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:01:00 -07:00
Jakub Kicinski	c68804c934	selftests: drv-net: tso: fix the GRE device name The device type for IPv4 GRE is "gre" not "ipgre", unlike for IPv6 which uses "ip6gre". Not sure how I missed this when writing the test, perhaps because all HW I have access to is on an IPv6-only network. Fixes: `0d0f4174f6` ("selftests: drv-net: add a simple TSO test") Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250604012031.891242-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:00:55 -07:00
Jakub Kicinski	7eb6b63aa3	selftests: drv-net: add configs for the TSO test Add missing config options for the tso.py test, specifically to make sure the kernel is built with vxlan and gre tunnels. I noticed this while adding a TSO-capable device QEMU to the CI. Previously we only run virtio tests and it doesn't report LSO stats on the QEMU we have. Fixes: `0d0f4174f6` ("selftests: drv-net: add a simple TSO test") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250604001653.853008-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 08:00:50 -07:00
Jakub Kicinski	4bbe2e570f	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== iavf: get rid of the crit lock Przemek Kitszel says: Fix some deadlocks in iavf, and make it less error prone for the future. Patch 1 is simple and independent from the rest. Patches 2, 3, 4 are strictly a refactor, but it enables the last patch to be much smaller. (Technically Jake given his RB tags not knowing I will send it to -net). Patch 5 just adds annotations, this also helps prove last patch to be correct. Patch 6 removes the crit lock, with its unusual try_lock()s. I have more refactoring for scheduling done for -next, to be sent soon. There is a simple test: add VF; decrease number of queueus; remove VF that was way too hard to pass without this series :) * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: get rid of the crit lock iavf: sprinkle netdev_assert_locked() annotations iavf: extract iavf_watchdog_step() out of iavf_watchdog_task() iavf: simplify watchdog_task in terms of adminq task scheduling iavf: centralize watchdog requeueing itself iavf: iavf_suspend(): take RTNL before netdev_lock() ==================== Link: https://patch.msgid.link/20250603171710.2336151-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 07:59:31 -07:00
Mirco Barone	db9ae3b6b4	wireguard: device: enable threaded NAPI Enable threaded NAPI by default for WireGuard devices in response to low performance behavior that we observed when multiple tunnels (and thus multiple wg devices) are deployed on a single host. This affects any kind of multi-tunnel deployment, regardless of whether the tunnels share the same endpoints or not (i.e., a VPN concentrator type of gateway would also be affected). The problem is caused by the fact that, in case of a traffic surge that involves multiple tunnels at the same time, the polling of the NAPI instance of all these wg devices tends to converge onto the same core, causing underutilization of the CPU and bottlenecking performance. This happens because NAPI polling is hosted by default in softirq context, but the WireGuard driver only raises this softirq after the rx peer queue has been drained, which doesn't happen during high traffic. In this case, the softirq already active on a core is reused instead of raising a new one. As a result, once two or more tunnel softirqs have been scheduled on the same core, they remain pinned there until the surge ends. In our experiments, this almost always leads to all tunnel NAPIs being handled on a single core shortly after a surge begins, limiting scalability to less than 3× the performance of a single tunnel, despite plenty of unused CPU cores being available. The proposed mitigation is to enable threaded NAPI for all WireGuard devices. This moves the NAPI polling context to a dedicated per-device kernel thread, allowing the scheduler to balance the load across all available cores. On our 32-core gateways, enabling threaded NAPI yields a ~4× performance improvement with 16 tunnels, increasing throughput from ~13 Gbps to ~48 Gbps. Meanwhile, CPU usage on the receiver (which is the bottleneck) jumps from 20% to 100%. We have found no performance regressions in any scenario we tested. Single-tunnel throughput remains unchanged. More details are available in our Netdev paper. Link: https://netdevconf.info/0x18/docs/netdev-0x18-paper23-talk-paper.pdf Signed-off-by: Mirco Barone <mirco.barone@polito.it> Fixes: `e7096c131e` ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20250605120616.2808744-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-06-05 07:53:57 -07:00

1369264 Commits