From d4a895e924b486f2a38463114509e1088ef4d7f5 Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Tue, 23 Aug 2022 08:45:32 -0700 Subject: [PATCH 01/29] seccomp: Move copy_seccomp() to no failure path. commit a1140cb215fa13dcec06d12ba0c3ee105633b7c4 upstream. Our syzbot instance reported memory leaks in do_seccomp() [0], similar to the report [1]. It shows that we miss freeing struct seccomp_filter and some objects included in it. We can reproduce the issue with the program below [2] which calls one seccomp() and two clone() syscalls. The first clone()d child exits earlier than its parent and sends a signal to kill it during the second clone(), more precisely before the fatal_signal_pending() test in copy_process(). When the parent receives the signal, it has to destroy the embryonic process and return -EINTR to user space. In the failure path, we have to call seccomp_filter_release() to decrement the filter's refcount. Initially, we called it in free_task() called from the failure path, but the commit 3a15fb6ed92c ("seccomp: release filter after task is fully dead") moved it to release_task() to notify user space as early as possible that the filter is no longer used. To keep the change and current seccomp refcount semantics, let's move copy_seccomp() just after the signal check and add a WARN_ON_ONCE() in free_task() for future debugging. [0]: unreferenced object 0xffff8880063add00 (size 256): comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.914s) hex dump (first 32 bytes): 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ backtrace: do_seccomp (./include/linux/slab.h:600 ./include/linux/slab.h:733 kernel/seccomp.c:666 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) unreferenced object 0xffffc90000035000 (size 4096): comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: __vmalloc_node_range (mm/vmalloc.c:3226) __vmalloc_node (mm/vmalloc.c:3261 (discriminator 4)) bpf_prog_alloc_no_stats (kernel/bpf/core.c:91) bpf_prog_alloc (kernel/bpf/core.c:129) bpf_prog_create_from_user (net/core/filter.c:1414) do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) unreferenced object 0xffff888003fa1000 (size 1024): comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: bpf_prog_alloc_no_stats (./include/linux/slab.h:600 ./include/linux/slab.h:733 kernel/bpf/core.c:95) bpf_prog_alloc (kernel/bpf/core.c:129) bpf_prog_create_from_user (net/core/filter.c:1414) do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) unreferenced object 0xffff888006360240 (size 16): comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s) hex dump (first 16 bytes): 01 00 37 00 76 65 72 6c e0 83 01 06 80 88 ff ff ..7.verl........ backtrace: bpf_prog_store_orig_filter (net/core/filter.c:1137) bpf_prog_create_from_user (net/core/filter.c:1428) do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) unreferenced object 0xffff8880060183e0 (size 8): comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s) hex dump (first 8 bytes): 06 00 00 00 00 00 ff 7f ........ backtrace: kmemdup (mm/util.c:129) bpf_prog_store_orig_filter (net/core/filter.c:1144) bpf_prog_create_from_user (net/core/filter.c:1428) do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [1]: https://syzkaller.appspot.com/bug?id=2809bb0ac77ad9aa3f4afe42d6a610aba594a987 [2]: #define _GNU_SOURCE #include #include #include #include #include #include void main(void) { struct sock_filter filter[] = { BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW), }; struct sock_fprog fprog = { .len = sizeof(filter) / sizeof(filter[0]), .filter = filter, }; long i, pid; syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 0, &fprog); for (i = 0; i < 2; i++) { pid = syscall(__NR_clone, CLONE_NEWNET | SIGKILL, NULL, NULL, 0); if (pid == 0) return; } } Fixes: 3a15fb6ed92c ("seccomp: release filter after task is fully dead") Reported-by: syzbot+ab17848fe269b573eb71@syzkaller.appspotmail.com Reported-by: Ayushman Dutta Suggested-by: Kees Cook Signed-off-by: Kuniyuki Iwashima Reviewed-by: Christian Brauner (Microsoft) Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20220823154532.82913-1-kuniyu@amazon.com Signed-off-by: Greg Kroah-Hartman --- kernel/fork.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index a5bc0c6a00fd..c6a289317e89 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -441,6 +441,9 @@ void put_task_stack(struct task_struct *tsk) void free_task(struct task_struct *tsk) { +#ifdef CONFIG_SECCOMP + WARN_ON_ONCE(tsk->seccomp.filter); +#endif scs_release(tsk); #ifndef CONFIG_THREAD_INFO_IN_TASK @@ -2248,12 +2251,6 @@ static __latent_entropy struct task_struct *copy_process( spin_lock(¤t->sighand->siglock); - /* - * Copy seccomp details explicitly here, in case they were changed - * before holding sighand lock. - */ - copy_seccomp(p); - rseq_fork(p, clone_flags); /* Don't start children in a dying pid namespace */ @@ -2268,6 +2265,14 @@ static __latent_entropy struct task_struct *copy_process( goto bad_fork_cancel_cgroup; } + /* No more failure paths after this point. */ + + /* + * Copy seccomp details explicitly here, in case they were changed + * before holding sighand lock. + */ + copy_seccomp(p); + init_task_pid_links(p); if (likely(p->pid)) { ptrace_init_task(p, (clone_flags & CLONE_PTRACE) || trace); From 0f29d0e8fc77db0cbf08df3aafb6206b66714277 Mon Sep 17 00:00:00 2001 From: William Breathitt Gray Date: Sun, 12 Mar 2023 19:15:49 -0400 Subject: [PATCH 02/29] counter: 104-quad-8: Fix race condition between FLAG and CNTR reads commit 4aa3b75c74603c3374877d5fd18ad9cc3a9a62ed upstream. The Counter (CNTR) register is 24 bits wide, but we can have an effective 25-bit count value by setting bit 24 to the XOR of the Borrow flag and Carry flag. The flags can be read from the FLAG register, but a race condition exists: the Borrow flag and Carry flag are instantaneous and could change by the time the count value is read from the CNTR register. Since the race condition could result in an incorrect 25-bit count value, remove support for 25-bit count values from this driver; hard-coded maximum count values are replaced by a LS7267_CNTR_MAX define for consistency and clarity. Fixes: 28e5d3bb0325 ("iio: 104-quad-8: Add IIO support for the ACCES 104-QUAD-8") Cc: # 6.1.x Cc: # 6.2.x Link: https://lore.kernel.org/r/20230312231554.134858-1-william.gray@linaro.org/ Signed-off-by: William Breathitt Gray Signed-off-by: Greg Kroah-Hartman --- drivers/counter/104-quad-8.c | 28 ++++------------------------ 1 file changed, 4 insertions(+), 24 deletions(-) diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index 21bb2bb767a1..89c9cb850a34 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -62,10 +62,6 @@ struct quad8_iio { #define QUAD8_REG_CHAN_OP 0x11 #define QUAD8_REG_INDEX_INPUT_LEVELS 0x16 #define QUAD8_DIFF_ENCODER_CABLE_STATUS 0x17 -/* Borrow Toggle flip-flop */ -#define QUAD8_FLAG_BT BIT(0) -/* Carry Toggle flip-flop */ -#define QUAD8_FLAG_CT BIT(1) /* Error flag */ #define QUAD8_FLAG_E BIT(4) /* Up/Down flag */ @@ -104,9 +100,6 @@ static int quad8_read_raw(struct iio_dev *indio_dev, { struct quad8_iio *const priv = iio_priv(indio_dev); const int base_offset = priv->base + 2 * chan->channel; - unsigned int flags; - unsigned int borrow; - unsigned int carry; int i; switch (mask) { @@ -117,12 +110,7 @@ static int quad8_read_raw(struct iio_dev *indio_dev, return IIO_VAL_INT; } - flags = inb(base_offset + 1); - borrow = flags & QUAD8_FLAG_BT; - carry = !!(flags & QUAD8_FLAG_CT); - - /* Borrow XOR Carry effectively doubles count range */ - *val = (borrow ^ carry) << 24; + *val = 0; mutex_lock(&priv->lock); @@ -643,17 +631,9 @@ static int quad8_count_read(struct counter_device *counter, { struct quad8_iio *const priv = counter->priv; const int base_offset = priv->base + 2 * count->id; - unsigned int flags; - unsigned int borrow; - unsigned int carry; int i; - flags = inb(base_offset + 1); - borrow = flags & QUAD8_FLAG_BT; - carry = !!(flags & QUAD8_FLAG_CT); - - /* Borrow XOR Carry effectively doubles count range */ - *val = (unsigned long)(borrow ^ carry) << 24; + *val = 0; mutex_lock(&priv->lock); @@ -1198,8 +1178,8 @@ static ssize_t quad8_count_ceiling_read(struct counter_device *counter, mutex_unlock(&priv->lock); - /* By default 0x1FFFFFF (25 bits unsigned) is maximum count */ - return sprintf(buf, "33554431\n"); + /* By default 0xFFFFFF (24 bits unsigned) is maximum count */ + return sprintf(buf, "16777215\n"); } static ssize_t quad8_count_ceiling_write(struct counter_device *counter, From 1dd95b2109de223d98956b54cf8990bb226c285e Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 19 Apr 2023 13:16:13 +0300 Subject: [PATCH 03/29] KVM: arm64: Fix buffer overflow in kvm_arm_set_fw_reg() commit a25bc8486f9c01c1af6b6c5657234b2eee2c39d6 upstream. The KVM_REG_SIZE() comes from the ioctl and it can be a power of two between 0-32768 but if it is more than sizeof(long) this will corrupt memory. Fixes: 99adb567632b ("KVM: arm/arm64: Add save/restore support for firmware workaround state") Signed-off-by: Dan Carpenter Reviewed-by: Steven Price Reviewed-by: Eric Auger Reviewed-by: Marc Zyngier Link: https://lore.kernel.org/r/4efbab8c-640f-43b2-8ac6-6d68e08280fe@kili.mountain Signed-off-by: Oliver Upton [will: kvm_arm_set_fw_reg() lives in psci.c not hypercalls.c] Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman --- arch/arm64/kvm/psci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c index 20ba5136ac3d..32bb26be8a9b 100644 --- a/arch/arm64/kvm/psci.c +++ b/arch/arm64/kvm/psci.c @@ -499,6 +499,8 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) u64 val; int wa_level; + if (KVM_REG_SIZE(reg->id) != sizeof(val)) + return -ENOENT; if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id))) return -EFAULT; From 549825602e3e6449927ca1ea1a08fd89868439df Mon Sep 17 00:00:00 2001 From: Jisoo Jang Date: Thu, 9 Mar 2023 19:44:57 +0900 Subject: [PATCH 04/29] wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies() commit 0da40e018fd034d87c9460123fa7f897b69fdee7 upstream. Fix a slab-out-of-bounds read that occurs in kmemdup() called from brcmf_get_assoc_ies(). The bug could occur when assoc_info->req_len, data from a URB provided by a USB device, is bigger than the size of buffer which is defined as WL_EXTRA_BUF_MAX. Add the size check for req_len/resp_len of assoc_info. Found by a modified version of syzkaller. [ 46.592467][ T7] ================================================================== [ 46.594687][ T7] BUG: KASAN: slab-out-of-bounds in kmemdup+0x3e/0x50 [ 46.596572][ T7] Read of size 3014656 at addr ffff888019442000 by task kworker/0:1/7 [ 46.598575][ T7] [ 46.599157][ T7] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G O 5.14.0+ #145 [ 46.601333][ T7] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 46.604360][ T7] Workqueue: events brcmf_fweh_event_worker [ 46.605943][ T7] Call Trace: [ 46.606584][ T7] dump_stack_lvl+0x8e/0xd1 [ 46.607446][ T7] print_address_description.constprop.0.cold+0x93/0x334 [ 46.608610][ T7] ? kmemdup+0x3e/0x50 [ 46.609341][ T7] kasan_report.cold+0x79/0xd5 [ 46.610151][ T7] ? kmemdup+0x3e/0x50 [ 46.610796][ T7] kasan_check_range+0x14e/0x1b0 [ 46.611691][ T7] memcpy+0x20/0x60 [ 46.612323][ T7] kmemdup+0x3e/0x50 [ 46.612987][ T7] brcmf_get_assoc_ies+0x967/0xf60 [ 46.613904][ T7] ? brcmf_notify_vif_event+0x3d0/0x3d0 [ 46.614831][ T7] ? lock_chain_count+0x20/0x20 [ 46.615683][ T7] ? mark_lock.part.0+0xfc/0x2770 [ 46.616552][ T7] ? lock_chain_count+0x20/0x20 [ 46.617409][ T7] ? mark_lock.part.0+0xfc/0x2770 [ 46.618244][ T7] ? lock_chain_count+0x20/0x20 [ 46.619024][ T7] brcmf_bss_connect_done.constprop.0+0x241/0x2e0 [ 46.620019][ T7] ? brcmf_parse_configure_security.isra.0+0x2a0/0x2a0 [ 46.620818][ T7] ? __lock_acquire+0x181f/0x5790 [ 46.621462][ T7] brcmf_notify_connect_status+0x448/0x1950 [ 46.622134][ T7] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 46.622736][ T7] ? brcmf_cfg80211_join_ibss+0x7b0/0x7b0 [ 46.623390][ T7] ? find_held_lock+0x2d/0x110 [ 46.623962][ T7] ? brcmf_fweh_event_worker+0x19f/0xc60 [ 46.624603][ T7] ? mark_held_locks+0x9f/0xe0 [ 46.625145][ T7] ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0 [ 46.625871][ T7] ? brcmf_cfg80211_join_ibss+0x7b0/0x7b0 [ 46.626545][ T7] brcmf_fweh_call_event_handler.isra.0+0x90/0x100 [ 46.627338][ T7] brcmf_fweh_event_worker+0x557/0xc60 [ 46.627962][ T7] ? brcmf_fweh_call_event_handler.isra.0+0x100/0x100 [ 46.628736][ T7] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 46.629396][ T7] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 46.629970][ T7] ? lockdep_hardirqs_on_prepare+0x273/0x3e0 [ 46.630649][ T7] process_one_work+0x92b/0x1460 [ 46.631205][ T7] ? pwq_dec_nr_in_flight+0x330/0x330 [ 46.631821][ T7] ? rwlock_bug.part.0+0x90/0x90 [ 46.632347][ T7] worker_thread+0x95/0xe00 [ 46.632832][ T7] ? __kthread_parkme+0x115/0x1e0 [ 46.633393][ T7] ? process_one_work+0x1460/0x1460 [ 46.633957][ T7] kthread+0x3a1/0x480 [ 46.634369][ T7] ? set_kthread_struct+0x120/0x120 [ 46.634933][ T7] ret_from_fork+0x1f/0x30 [ 46.635431][ T7] [ 46.635687][ T7] Allocated by task 7: [ 46.636151][ T7] kasan_save_stack+0x1b/0x40 [ 46.636628][ T7] __kasan_kmalloc+0x7c/0x90 [ 46.637108][ T7] kmem_cache_alloc_trace+0x19e/0x330 [ 46.637696][ T7] brcmf_cfg80211_attach+0x4a0/0x4040 [ 46.638275][ T7] brcmf_attach+0x389/0xd40 [ 46.638739][ T7] brcmf_usb_probe+0x12de/0x1690 [ 46.639279][ T7] usb_probe_interface+0x2aa/0x760 [ 46.639820][ T7] really_probe+0x205/0xb70 [ 46.640342][ T7] __driver_probe_device+0x311/0x4b0 [ 46.640876][ T7] driver_probe_device+0x4e/0x150 [ 46.641445][ T7] __device_attach_driver+0x1cc/0x2a0 [ 46.642000][ T7] bus_for_each_drv+0x156/0x1d0 [ 46.642543][ T7] __device_attach+0x23f/0x3a0 [ 46.643065][ T7] bus_probe_device+0x1da/0x290 [ 46.643644][ T7] device_add+0xb7b/0x1eb0 [ 46.644130][ T7] usb_set_configuration+0xf59/0x16f0 [ 46.644720][ T7] usb_generic_driver_probe+0x82/0xa0 [ 46.645295][ T7] usb_probe_device+0xbb/0x250 [ 46.645786][ T7] really_probe+0x205/0xb70 [ 46.646258][ T7] __driver_probe_device+0x311/0x4b0 [ 46.646804][ T7] driver_probe_device+0x4e/0x150 [ 46.647387][ T7] __device_attach_driver+0x1cc/0x2a0 [ 46.647926][ T7] bus_for_each_drv+0x156/0x1d0 [ 46.648454][ T7] __device_attach+0x23f/0x3a0 [ 46.648939][ T7] bus_probe_device+0x1da/0x290 [ 46.649478][ T7] device_add+0xb7b/0x1eb0 [ 46.649936][ T7] usb_new_device.cold+0x49c/0x1029 [ 46.650526][ T7] hub_event+0x1c98/0x3950 [ 46.650975][ T7] process_one_work+0x92b/0x1460 [ 46.651535][ T7] worker_thread+0x95/0xe00 [ 46.651991][ T7] kthread+0x3a1/0x480 [ 46.652413][ T7] ret_from_fork+0x1f/0x30 [ 46.652885][ T7] [ 46.653131][ T7] The buggy address belongs to the object at ffff888019442000 [ 46.653131][ T7] which belongs to the cache kmalloc-2k of size 2048 [ 46.654669][ T7] The buggy address is located 0 bytes inside of [ 46.654669][ T7] 2048-byte region [ffff888019442000, ffff888019442800) [ 46.656137][ T7] The buggy address belongs to the page: [ 46.656720][ T7] page:ffffea0000651000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x19440 [ 46.657792][ T7] head:ffffea0000651000 order:3 compound_mapcount:0 compound_pincount:0 [ 46.658673][ T7] flags: 0x100000000010200(slab|head|node=0|zone=1) [ 46.659422][ T7] raw: 0100000000010200 0000000000000000 dead000000000122 ffff888100042000 [ 46.660363][ T7] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000 [ 46.661236][ T7] page dumped because: kasan: bad access detected [ 46.661956][ T7] page_owner tracks the page as allocated [ 46.662588][ T7] page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 7, ts 31136961085, free_ts 0 [ 46.664271][ T7] prep_new_page+0x1aa/0x240 [ 46.664763][ T7] get_page_from_freelist+0x159a/0x27c0 [ 46.665340][ T7] __alloc_pages+0x2da/0x6a0 [ 46.665847][ T7] alloc_pages+0xec/0x1e0 [ 46.666308][ T7] allocate_slab+0x380/0x4e0 [ 46.666770][ T7] ___slab_alloc+0x5bc/0x940 [ 46.667264][ T7] __slab_alloc+0x6d/0x80 [ 46.667712][ T7] kmem_cache_alloc_trace+0x30a/0x330 [ 46.668299][ T7] brcmf_usbdev_qinit.constprop.0+0x50/0x470 [ 46.668885][ T7] brcmf_usb_probe+0xc97/0x1690 [ 46.669438][ T7] usb_probe_interface+0x2aa/0x760 [ 46.669988][ T7] really_probe+0x205/0xb70 [ 46.670487][ T7] __driver_probe_device+0x311/0x4b0 [ 46.671031][ T7] driver_probe_device+0x4e/0x150 [ 46.671604][ T7] __device_attach_driver+0x1cc/0x2a0 [ 46.672192][ T7] bus_for_each_drv+0x156/0x1d0 [ 46.672739][ T7] page_owner free stack trace missing [ 46.673335][ T7] [ 46.673620][ T7] Memory state around the buggy address: [ 46.674213][ T7] ffff888019442700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 46.675083][ T7] ffff888019442780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 46.675994][ T7] >ffff888019442800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 46.676875][ T7] ^ [ 46.677323][ T7] ffff888019442880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 46.678190][ T7] ffff888019442900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 46.679052][ T7] ================================================================== [ 46.679945][ T7] Disabling lock debugging due to kernel taint [ 46.680725][ T7] Kernel panic - not syncing: Reviewed-by: Arend van Spriel Signed-off-by: Jisoo Jang Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20230309104457.22628-1-jisoo.jang@yonsei.ac.kr Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c index 0a069bc7f156..df5970619712 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c @@ -5834,6 +5834,11 @@ static s32 brcmf_get_assoc_ies(struct brcmf_cfg80211_info *cfg, (struct brcmf_cfg80211_assoc_ielen_le *)cfg->extra_buf; req_len = le32_to_cpu(assoc_info->req_len); resp_len = le32_to_cpu(assoc_info->resp_len); + if (req_len > WL_EXTRA_BUF_MAX || resp_len > WL_EXTRA_BUF_MAX) { + bphy_err(drvr, "invalid lengths in assoc info: req %u resp %u\n", + req_len, resp_len); + return -EINVAL; + } if (req_len) { err = brcmf_fil_iovar_data_get(ifp, "assoc_req_ies", cfg->extra_buf, From dc110b20f4ce752c232ad710a428dd97cf138338 Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Tue, 4 Apr 2023 21:40:36 +0200 Subject: [PATCH 05/29] drm/fb-helper: set x/yres_virtual in drm_fb_helper_check_var commit 1935f0deb6116dd785ea64d8035eab0ff441255b upstream. Drivers are supposed to fix this up if needed if they don't outright reject it. Uncovered by 6c11df58fd1a ("fbmem: Check virtual screen sizes in fb_set_var()"). Reported-by: syzbot+20dcf81733d43ddff661@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=c5faf983bfa4a607de530cd3bb008888bf06cefc Cc: stable@vger.kernel.org # v5.4+ Cc: Daniel Vetter Cc: Javier Martinez Canillas Cc: Thomas Zimmermann Reviewed-by: Javier Martinez Canillas Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20230404194038.472803-1-daniel.vetter@ffwll.ch Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/drm_fb_helper.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index ac5d61e65124..04f2ec2254e9 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -1299,6 +1299,9 @@ int drm_fb_helper_check_var(struct fb_var_screeninfo *var, return -EINVAL; } + var->xres_virtual = fb->width; + var->yres_virtual = fb->height; + /* * Workaround for SDL 1.2, which is known to be setting all pixel format * fields values to zero in some cases. We treat this situation as a From 98cfbad52fc286c2a1a75e04bf47b98d6489db1f Mon Sep 17 00:00:00 2001 From: Ruihan Li Date: Sun, 16 Apr 2023 16:14:04 +0800 Subject: [PATCH 06/29] bluetooth: Perform careful capability checks in hci_sock_ioctl() commit 25c150ac103a4ebeed0319994c742a90634ddf18 upstream. Previously, capability was checked using capable(), which verified that the caller of the ioctl system call had the required capability. In addition, the result of the check would be stored in the HCI_SOCK_TRUSTED flag, making it persistent for the socket. However, malicious programs can abuse this approach by deliberately sharing an HCI socket with a privileged task. The HCI socket will be marked as trusted when the privileged task occasionally makes an ioctl call. This problem can be solved by using sk_capable() to check capability, which ensures that not only the current task but also the socket opener has the specified capability, thus reducing the risk of privilege escalation through the previously identified vulnerability. Cc: stable@vger.kernel.org Fixes: f81f5b2db869 ("Bluetooth: Send control open and close messages for HCI raw sockets") Signed-off-by: Ruihan Li Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman --- net/bluetooth/hci_sock.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c index 73779af2fed6..4dcc1a8a8954 100644 --- a/net/bluetooth/hci_sock.c +++ b/net/bluetooth/hci_sock.c @@ -996,7 +996,14 @@ static int hci_sock_ioctl(struct socket *sock, unsigned int cmd, if (hci_sock_gen_cookie(sk)) { struct sk_buff *skb; - if (capable(CAP_NET_ADMIN)) + /* Perform careful checks before setting the HCI_SOCK_TRUSTED + * flag. Make sure that not only the current task but also + * the socket opener has the required capability, since + * privileged programs can be tricked into making ioctl calls + * on HCI sockets, and the socket should not be marked as + * trusted simply because the ioctl caller is privileged. + */ + if (sk_capable(sk, CAP_NET_ADMIN)) hci_sock_set_flag(sk, HCI_SOCK_TRUSTED); /* Send event to monitor */ From c0e921422359f6796fc0ccf50a66739b919344dd Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sun, 1 May 2022 21:31:43 +0200 Subject: [PATCH 07/29] x86/fpu: Prevent FPU state corruption commit 59f5ede3bc0f00eb856425f636dab0c10feb06d8 upstream. The FPU usage related to task FPU management is either protected by disabling interrupts (switch_to, return to user) or via fpregs_lock() which is a wrapper around local_bh_disable(). When kernel code wants to use the FPU then it has to check whether it is possible by calling irq_fpu_usable(). But the condition in irq_fpu_usable() is wrong. It allows FPU to be used when: !in_interrupt() || interrupted_user_mode() || interrupted_kernel_fpu_idle() The latter is checking whether some other context already uses FPU in the kernel, but if that's not the case then it allows FPU to be used unconditionally even if the calling context interrupted a fpregs_lock() critical region. If that happens then the FPU state of the interrupted context becomes corrupted. Allow in kernel FPU usage only when no other context has in kernel FPU usage and either the calling context is not hard interrupt context or the hard interrupt did not interrupt a local bottomhalf disabled region. It's hard to find a proper Fixes tag as the condition was broken in one way or the other for a very long time and the eager/lazy FPU changes caused a lot of churn. Picked something remotely connected from the history. This survived undetected for quite some time as FPU usage in interrupt context is rare, but the recent changes to the random code unearthed it at least on a kernel which had FPU debugging enabled. There is probably a higher rate of silent corruption as not all issues can be detected by the FPU debugging code. This will be addressed in a subsequent change. Fixes: 5d2bd7009f30 ("x86, fpu: decouple non-lazy/eager fpu restore from xsave") Reported-by: Filipe Manana Signed-off-by: Thomas Gleixner Tested-by: Filipe Manana Reviewed-by: Borislav Petkov Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220501193102.588689270@linutronix.de Signed-off-by: Can Sun Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/fpu/core.c | 67 +++++++++++++++----------------------- 1 file changed, 26 insertions(+), 41 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 571220ac8bea..835b948095cd 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -25,17 +25,7 @@ */ union fpregs_state init_fpstate __read_mostly; -/* - * Track whether the kernel is using the FPU state - * currently. - * - * This flag is used: - * - * - by IRQ context code to potentially use the FPU - * if it's unused. - * - * - to debug kernel_fpu_begin()/end() correctness - */ +/* Track in-kernel FPU usage */ static DEFINE_PER_CPU(bool, in_kernel_fpu); /* @@ -43,42 +33,37 @@ static DEFINE_PER_CPU(bool, in_kernel_fpu); */ DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx); -static bool kernel_fpu_disabled(void) -{ - return this_cpu_read(in_kernel_fpu); -} - -static bool interrupted_kernel_fpu_idle(void) -{ - return !kernel_fpu_disabled(); -} - -/* - * Were we in user mode (or vm86 mode) when we were - * interrupted? - * - * Doing kernel_fpu_begin/end() is ok if we are running - * in an interrupt context from user mode - we'll just - * save the FPU state as required. - */ -static bool interrupted_user_mode(void) -{ - struct pt_regs *regs = get_irq_regs(); - return regs && user_mode(regs); -} - /* * Can we use the FPU in kernel mode with the * whole "kernel_fpu_begin/end()" sequence? - * - * It's always ok in process context (ie "not interrupt") - * but it is sometimes ok even from an irq. */ bool irq_fpu_usable(void) { - return !in_interrupt() || - interrupted_user_mode() || - interrupted_kernel_fpu_idle(); + if (WARN_ON_ONCE(in_nmi())) + return false; + + /* In kernel FPU usage already active? */ + if (this_cpu_read(in_kernel_fpu)) + return false; + + /* + * When not in NMI or hard interrupt context, FPU can be used in: + * + * - Task context except from within fpregs_lock()'ed critical + * regions. + * + * - Soft interrupt processing context which cannot happen + * while in a fpregs_lock()'ed critical region. + */ + if (!in_irq()) + return true; + + /* + * In hard interrupt context it's safe when soft interrupts + * are enabled, which means the interrupt did not hit in + * a fpregs_lock()'ed critical region. + */ + return !softirq_count(); } EXPORT_SYMBOL(irq_fpu_usable); From f964a00386ca0b1013357657922c00c1b5e66b07 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ar=C4=B1n=C3=A7=20=C3=9CNAL?= Date: Mon, 17 Apr 2023 18:20:03 +0300 Subject: [PATCH 08/29] USB: serial: option: add UNISOC vendor and TOZED LT70C product MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit a095edfc15f0832e046ae23964e249ef5c95af87 upstream. Add UNISOC vendor ID and TOZED LT70-C modem which is based from UNISOC SL8563. The modem supports the NCM mode. Interface 0 is used for running the AT commands. Interface 12 is the ADB interface. T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1782 ProdID=4055 Rev=04.04 S: Manufacturer=Unisoc Phone S: Product=Unisoc Phone S: SerialNumber= C: #Ifs=14 Cfg#= 1 Atr=c0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=10 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=11 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=12 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=13 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=84(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 3 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=86(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 6 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=88(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 7 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Arınç ÜNAL Link: https://lore.kernel.org/r/20230417152003.243248-1-arinc.unal@arinc9.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index da8b7bd39703..5b474efeab6a 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -595,6 +595,11 @@ static void option_instat_callback(struct urb *urb); #define SIERRA_VENDOR_ID 0x1199 #define SIERRA_PRODUCT_EM9191 0x90d3 +/* UNISOC (Spreadtrum) products */ +#define UNISOC_VENDOR_ID 0x1782 +/* TOZED LT70-C based on UNISOC SL8563 uses UNISOC's vendor ID */ +#define TOZED_PRODUCT_LT70C 0x4055 + /* Device flags */ /* Highest interface number which can be used with NCTRL() and RSVD() */ @@ -2225,6 +2230,7 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_AND_INTERFACE_INFO(OPPO_VENDOR_ID, OPPO_PRODUCT_R11, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) }, { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, option_ids); From 8aa079c2fdfc1bbf2eb8499e065ab509a2b3eea3 Mon Sep 17 00:00:00 2001 From: Stephen Boyd Date: Wed, 12 Apr 2023 15:58:42 -0700 Subject: [PATCH 09/29] driver core: Don't require dynamic_debug for initcall_debug probe timing commit e2f06aa885081e1391916367f53bad984714b4db upstream. Don't require the use of dynamic debug (or modification of the kernel to add a #define DEBUG to the top of this file) to get the printk message about driver probe timing. This printk is only emitted when initcall_debug is enabled on the kernel commandline, and it isn't immediately obvious that you have to do something else to debug boot timing issues related to driver probe. Add a comment too so it doesn't get converted back to pr_debug(). Fixes: eb7fbc9fb118 ("driver core: Add missing '\n' in log messages") Cc: stable Cc: Christophe JAILLET Cc: Brian Norris Reviewed-by: Brian Norris Acked-by: Randy Dunlap Signed-off-by: Stephen Boyd Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman --- drivers/base/dd.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 497e3d4255c4..503c01d4015d 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -677,7 +677,12 @@ static int really_probe_debug(struct device *dev, struct device_driver *drv) calltime = ktime_get(); ret = really_probe(dev, drv); rettime = ktime_get(); - pr_debug("probe of %s returned %d after %lld usecs\n", + /* + * Don't change this to pr_debug() because that requires + * CONFIG_DYNAMIC_DEBUG and we want a simple 'initcall_debug' on the + * kernel commandline to print this all the time at the debug level. + */ + printk(KERN_DEBUG "probe of %s returned %d after %lld usecs\n", dev_name(dev), ret, ktime_us_delta(rettime, calltime)); return ret; } From f5e96af71eab491a19fb3926ab42d592118ca52c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Patrik=20Dahlstr=C3=B6m?= Date: Mon, 13 Mar 2023 21:50:29 +0100 Subject: [PATCH 10/29] iio: adc: palmas_gpadc: fix NULL dereference on rmmod MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 49f76c499d38bf67803438eee88c8300d0f6ce09 ] Calling dev_to_iio_dev() on a platform device pointer is undefined and will make adc NULL. Signed-off-by: Patrik Dahlström Link: https://lore.kernel.org/r/20230313205029.1881745-1-risca@dalakolonin.se Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin --- drivers/iio/adc/palmas_gpadc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iio/adc/palmas_gpadc.c b/drivers/iio/adc/palmas_gpadc.c index f4756671cddb..6ed0d151ad21 100644 --- a/drivers/iio/adc/palmas_gpadc.c +++ b/drivers/iio/adc/palmas_gpadc.c @@ -628,7 +628,7 @@ out: static int palmas_gpadc_remove(struct platform_device *pdev) { - struct iio_dev *indio_dev = dev_to_iio_dev(&pdev->dev); + struct iio_dev *indio_dev = dev_get_drvdata(&pdev->dev); struct palmas_gpadc *adc = iio_priv(indio_dev); if (adc->wakeup1_enable || adc->wakeup2_enable) From 925cbb725367b228039c62d35f4c75937463bb44 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Wed, 22 Mar 2023 15:53:32 +0100 Subject: [PATCH 11/29] ASoC: Intel: bytcr_rt5640: Add quirk for the Acer Iconia One 7 B1-750 [ Upstream commit e38c5e80c3d293a883c6f1d553f2146ec0bda35e ] The Acer Iconia One 7 B1-750 tablet mostly works fine with the defaults for an Bay Trail CR tablet. Except for the internal mic, instead of an analog mic on IN3 a digital mic on DMIC1 is uses. Add a quirk with these settings for this tablet. Acked-by: Pierre-Louis Bossart Signed-off-by: Hans de Goede Link: https://lore.kernel.org/r/20230322145332.131525-1-hdegoede@redhat.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/intel/boards/bytcr_rt5640.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c index 8a99cb6dfcd6..9a5ab96f917d 100644 --- a/sound/soc/intel/boards/bytcr_rt5640.c +++ b/sound/soc/intel/boards/bytcr_rt5640.c @@ -393,6 +393,18 @@ static int byt_rt5640_aif1_hw_params(struct snd_pcm_substream *substream, /* Please keep this list alphabetically sorted */ static const struct dmi_system_id byt_rt5640_quirk_table[] = { + { /* Acer Iconia One 7 B1-750 */ + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "VESPA2"), + }, + .driver_data = (void *)(BYT_RT5640_DMIC1_MAP | + BYT_RT5640_JD_SRC_JD1_IN4P | + BYT_RT5640_OVCD_TH_1500UA | + BYT_RT5640_OVCD_SF_0P75 | + BYT_RT5640_SSP0_AIF1 | + BYT_RT5640_MCLK_EN), + }, { /* Acer Iconia Tab 8 W1-810 */ .matches = { DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Acer"), From 69fdbb334d6e6d33e27b0adc3234607af6135861 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Mon, 9 Jan 2023 15:11:52 +0200 Subject: [PATCH 12/29] asm-generic/io.h: suppress endianness warnings for readq() and writeq() [ Upstream commit d564fa1ff19e893e2971d66e5c8f49dc1cdc8ffc ] Commit c1d55d50139b ("asm-generic/io.h: Fix sparse warnings on big-endian architectures") missed fixing the 64-bit accessors. Arnd explains in the attached link why the casts are necessary, even if __raw_readq() and __raw_writeq() do not take endian-specific types. Link: https://lore.kernel.org/lkml/9105d6fc-880b-4734-857d-e3d30b87ccf6@app.fastmail.com/ Suggested-by: Arnd Bergmann Signed-off-by: Vladimir Oltean Reviewed-by: Jonathan Cameron Signed-off-by: Arnd Bergmann Signed-off-by: Sasha Levin --- include/asm-generic/io.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h index 9ea83d80eb6f..dcbd41048b4e 100644 --- a/include/asm-generic/io.h +++ b/include/asm-generic/io.h @@ -190,7 +190,7 @@ static inline u64 readq(const volatile void __iomem *addr) u64 val; __io_br(); - val = __le64_to_cpu(__raw_readq(addr)); + val = __le64_to_cpu((__le64 __force)__raw_readq(addr)); __io_ar(val); return val; } @@ -233,7 +233,7 @@ static inline void writel(u32 value, volatile void __iomem *addr) static inline void writeq(u64 value, volatile void __iomem *addr) { __io_bw(); - __raw_writeq(__cpu_to_le64(value), addr); + __raw_writeq((u64 __force)__cpu_to_le64(value), addr); __io_aw(); } #endif From 5434c7019d2308a474ddb8accbb7279f1a327036 Mon Sep 17 00:00:00 2001 From: "Jiri Slaby (SUSE)" Date: Tue, 13 Dec 2022 15:52:08 -0700 Subject: [PATCH 13/29] wireguard: timers: cast enum limits members to int in prints commit 2d4ee16d969c97996e80e4c9cb6de0acaff22c9f upstream. Since gcc13, each member of an enum has the same type as the enum. And that is inherited from its members. Provided "REKEY_AFTER_MESSAGES = 1ULL << 60", the named type is unsigned long. This generates warnings with gcc-13: error: format '%d' expects argument of type 'int', but argument 6 has type 'long unsigned int' Cast those particular enum members to int when printing them. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36113 Cc: Martin Liska Signed-off-by: Jiri Slaby (SUSE) Signed-off-by: Jason A. Donenfeld Link: https://lore.kernel.org/all/20221213225208.3343692-2-Jason@zx2c4.com/ Signed-off-by: Jakub Kicinski Cc: Chris Clayton Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/timers.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireguard/timers.c b/drivers/net/wireguard/timers.c index d54d32ac9bc4..91f5d6d2d4e2 100644 --- a/drivers/net/wireguard/timers.c +++ b/drivers/net/wireguard/timers.c @@ -46,7 +46,7 @@ static void wg_expired_retransmit_handshake(struct timer_list *timer) if (peer->timer_handshake_attempts > MAX_TIMER_HANDSHAKES) { pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d attempts, giving up\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, MAX_TIMER_HANDSHAKES + 2); + &peer->endpoint.addr, (int)MAX_TIMER_HANDSHAKES + 2); del_timer(&peer->timer_send_keepalive); /* We drop all packets without a keypair and don't try again, @@ -64,7 +64,7 @@ static void wg_expired_retransmit_handshake(struct timer_list *timer) ++peer->timer_handshake_attempts; pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d seconds, retrying (try %d)\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, REKEY_TIMEOUT, + &peer->endpoint.addr, (int)REKEY_TIMEOUT, peer->timer_handshake_attempts + 1); /* We clear the endpoint address src address, in case this is @@ -94,7 +94,7 @@ static void wg_expired_new_handshake(struct timer_list *timer) pr_debug("%s: Retrying handshake with peer %llu (%pISpfsc) because we stopped hearing back after %d seconds\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, KEEPALIVE_TIMEOUT + REKEY_TIMEOUT); + &peer->endpoint.addr, (int)(KEEPALIVE_TIMEOUT + REKEY_TIMEOUT)); /* We clear the endpoint address src address, in case this is the cause * of trouble. */ @@ -126,7 +126,7 @@ static void wg_queued_expired_zero_key_material(struct work_struct *work) pr_debug("%s: Zeroing out all keys for peer %llu (%pISpfsc), since we haven't received a new one in %d seconds\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, REJECT_AFTER_TIME * 3); + &peer->endpoint.addr, (int)REJECT_AFTER_TIME * 3); wg_noise_handshake_clear(&peer->handshake); wg_noise_keypairs_clear(&peer->keypairs); wg_peer_put(peer); From 2f31633da843eb0457f6212eba435ac538c28f58 Mon Sep 17 00:00:00 2001 From: Lukas Wunner Date: Tue, 11 Apr 2023 08:21:02 +0200 Subject: [PATCH 14/29] PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock commit f5eff5591b8f9c5effd25c92c758a127765f74c1 upstream. In 2013, commits 2e35afaefe64 ("PCI: pciehp: Add reset_slot() method") 608c388122c7 ("PCI: Add slot reset option to pci_dev_reset()") amended PCIe hotplug to mask Presence Detect Changed events during a Secondary Bus Reset. The reset thus no longer causes gratuitous slot bringdown and bringup. However the commits neglected to serialize reset with code paths reading slot registers. For instance, a slot bringup due to an earlier hotplug event may see the Presence Detect State bit cleared during a concurrent Secondary Bus Reset. In 2018, commit 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset") retrofitted the missing locking. It introduced a reset_lock which serializes a Secondary Bus Reset with other parts of pciehp. Unfortunately the locking turns out to be overzealous: reset_lock is held for the entire enumeration and de-enumeration of hotplugged devices, including driver binding and unbinding. Driver binding and unbinding acquires device_lock while the reset_lock of the ancestral hotplug port is held. A concurrent Secondary Bus Reset acquires the ancestral reset_lock while already holding the device_lock. The asymmetric locking order in the two code paths can lead to AB-BA deadlocks. Michael Haeuptle reports such deadlocks on simultaneous hot-removal and vfio release (the latter implies a Secondary Bus Reset): pciehp_ist() # down_read(reset_lock) pciehp_handle_presence_or_link_change() pciehp_disable_slot() __pciehp_disable_slot() remove_board() pciehp_unconfigure_device() pci_stop_and_remove_bus_device() pci_stop_bus_device() pci_stop_dev() device_release_driver() device_release_driver_internal() __device_driver_lock() # device_lock() SYS_munmap() vfio_device_fops_release() vfio_device_group_close() vfio_device_close() vfio_device_last_close() vfio_pci_core_close_device() vfio_pci_core_disable() # device_lock() __pci_reset_function_locked() pci_reset_bus_function() pci_dev_reset_slot_function() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Ian May reports the same deadlock on simultaneous hot-removal and an AER-induced Secondary Bus Reset: aer_recover_work_func() pcie_do_recovery() aer_root_reset() pci_bus_error_reset() pci_slot_reset() pci_slot_lock() # device_lock() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Fix by releasing the reset_lock during driver binding and unbinding, thereby splitting and shrinking the critical section. Driver binding and unbinding is protected by the device_lock() and thus serialized with a Secondary Bus Reset. There's no need to additionally protect it with the reset_lock. However, pciehp does not bind and unbind devices directly, but rather invokes PCI core functions which also perform certain enumeration and de-enumeration steps. The reset_lock's purpose is to protect slot registers, not enumeration and de-enumeration of hotplugged devices. That would arguably be the job of the PCI core, not the PCIe hotplug driver. After all, an AER-induced Secondary Bus Reset may as well happen during boot-time enumeration of the PCI hierarchy and there's no locking to prevent that either. Exempting *de-enumeration* from the reset_lock is relatively harmless: A concurrent Secondary Bus Reset may foil config space accesses such as PME interrupt disablement. But if the device is physically gone, those accesses are pointless anyway. If the device is physically present and only logically removed through an Attention Button press or the sysfs "power" attribute, PME interrupts as well as DMA cannot come through because pciehp_unconfigure_device() disables INTx and Bus Master bits. That's still protected by the reset_lock in the present commit. Exempting *enumeration* from the reset_lock also has limited impact: The exempted call to pci_bus_add_device() may perform device accesses through pcibios_bus_add_device() and pci_fixup_device() which are now no longer protected from a concurrent Secondary Bus Reset. Otherwise there should be no impact. In essence, the present commit seeks to fix the AB-BA deadlocks while still retaining a best-effort reset protection for enumeration and de-enumeration of hotplugged devices -- until a general solution is implemented in the PCI core. Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical.com Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.com Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huawei.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset") Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.de Reported-by: Michael Haeuptle Reported-by: Ian May Reported-by: Andrey Grodzovsky Reported-by: Rahul Kumar Reported-by: Jialin Zhang Tested-by: Anatoli Antonovitch Signed-off-by: Lukas Wunner Signed-off-by: Bjorn Helgaas Cc: stable@vger.kernel.org # v4.19+ Cc: Dan Stein Cc: Ashok Raj Cc: Alex Michon Cc: Xiongfeng Wang Cc: Alex Williamson Cc: Mika Westerberg Cc: Sathyanarayanan Kuppuswamy Signed-off-by: Greg Kroah-Hartman --- drivers/pci/hotplug/pciehp_pci.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c index d17f3bf36f70..ad12515a4a12 100644 --- a/drivers/pci/hotplug/pciehp_pci.c +++ b/drivers/pci/hotplug/pciehp_pci.c @@ -63,7 +63,14 @@ int pciehp_configure_device(struct controller *ctrl) pci_assign_unassigned_bridge_resources(bridge); pcie_bus_configure_settings(parent); + + /* + * Release reset_lock during driver binding + * to avoid AB-BA deadlock with device_lock. + */ + up_read(&ctrl->reset_lock); pci_bus_add_devices(parent); + down_read_nested(&ctrl->reset_lock, ctrl->depth); out: pci_unlock_rescan_remove(); @@ -104,7 +111,15 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence) list_for_each_entry_safe_reverse(dev, temp, &parent->devices, bus_list) { pci_dev_get(dev); + + /* + * Release reset_lock during driver unbinding + * to avoid AB-BA deadlock with device_lock. + */ + up_read(&ctrl->reset_lock); pci_stop_and_remove_bus_device(dev); + down_read_nested(&ctrl->reset_lock, ctrl->depth); + /* * Ensure that no new Requests will be generated from * the device. From b978269ddad4ee2d5d6788eeb0e7e344a7d480e1 Mon Sep 17 00:00:00 2001 From: Manivannan Sadhasivam Date: Thu, 16 Mar 2023 13:40:59 +0530 Subject: [PATCH 15/29] PCI: qcom: Fix the incorrect register usage in v2.7.0 config commit 2542e16c392508800f1d9037feee881a9c444951 upstream. Qcom PCIe IP version v2.7.0 and its derivatives don't contain the PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT register. Instead, they have the new PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2 register. So fix the incorrect register usage which is modifying a different register. Also in this IP version, this register change doesn't depend on MSI being enabled. So remove that check also. Link: https://lore.kernel.org/r/20230316081117.14288-2-manivannan.sadhasivam@linaro.org Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller") Signed-off-by: Manivannan Sadhasivam Signed-off-by: Lorenzo Pieralisi Cc: # 5.6+ Signed-off-by: Greg Kroah-Hartman --- drivers/pci/controller/dwc/pcie-qcom.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 5fbd80908a99..c68e14271c02 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -1210,11 +1210,9 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie) val |= BIT(4); writel(val, pcie->parf + PCIE20_PARF_MHI_CLOCK_RESET_CTRL); - if (IS_ENABLED(CONFIG_PCI_MSI)) { - val = readl(pcie->parf + PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT); - val |= BIT(31); - writel(val, pcie->parf + PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT); - } + val = readl(pcie->parf + PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2); + val |= BIT(31); + writel(val, pcie->parf + PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2); return 0; err_disable_clocks: From 27dc207c386eb6719e9cb695f7afb15e52b07d28 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Tue, 4 Apr 2023 09:25:14 +0200 Subject: [PATCH 16/29] USB: dwc3: fix runtime pm imbalance on probe errors commit 9a8ad10c9f2e0925ff26308ec6756b93fc2f4977 upstream. Make sure not to suspend the device when probe fails to avoid disabling clocks and phys multiple times. Fixes: 328082376aea ("usb: dwc3: fix runtime PM in error path") Cc: stable@vger.kernel.org # 4.8 Cc: Roger Quadros Acked-by: Thinh Nguyen Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20230404072524.19014-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/core.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index d73f624ed42a..fc84677635a3 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1567,13 +1567,11 @@ static int dwc3_probe(struct platform_device *pdev) spin_lock_init(&dwc->lock); mutex_init(&dwc->mutex); + pm_runtime_get_noresume(dev); pm_runtime_set_active(dev); pm_runtime_use_autosuspend(dev); pm_runtime_set_autosuspend_delay(dev, DWC3_DEFAULT_AUTOSUSPEND_DELAY); pm_runtime_enable(dev); - ret = pm_runtime_get_sync(dev); - if (ret < 0) - goto err1; pm_runtime_forbid(dev); @@ -1633,12 +1631,10 @@ err3: dwc3_free_event_buffers(dwc); err2: - pm_runtime_allow(&pdev->dev); - -err1: - pm_runtime_put_sync(&pdev->dev); - pm_runtime_disable(&pdev->dev); - + pm_runtime_allow(dev); + pm_runtime_disable(dev); + pm_runtime_set_suspended(dev); + pm_runtime_put_noidle(dev); disable_clks: clk_bulk_disable_unprepare(dwc->num_clks, dwc->clks); assert_reset: From a71cb92ec4315b7b61983bae35ec00be62901b25 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Tue, 4 Apr 2023 09:25:15 +0200 Subject: [PATCH 17/29] USB: dwc3: fix runtime pm imbalance on unbind commit 44d257e9012ee8040e41d224d0e5bfb5ef5427ea upstream. Make sure to balance the runtime PM usage count on driver unbind by adding back the pm_runtime_allow() call that had been erroneously removed. Fixes: 266d0493900a ("usb: dwc3: core: don't trigger runtime pm when remove driver") Cc: stable@vger.kernel.org # 5.9 Cc: Li Jun Acked-by: Thinh Nguyen Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20230404072524.19014-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index fc84677635a3..5709b959b1d9 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1655,6 +1655,7 @@ static int dwc3_remove(struct platform_device *pdev) dwc3_core_exit(dwc); dwc3_ulpi_exit(dwc); + pm_runtime_allow(&pdev->dev); pm_runtime_disable(&pdev->dev); pm_runtime_put_noidle(&pdev->dev); pm_runtime_set_suspended(&pdev->dev); From b009006887e32bf62ff5c0da73b9bafd18dd996b Mon Sep 17 00:00:00 2001 From: Babu Moger Date: Thu, 13 Apr 2023 16:39:58 -0500 Subject: [PATCH 18/29] hwmon: (k10temp) Check range scale when CUR_TEMP register is read-write MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 0c072385348e3ac5229145644055d3e2afb5b3db upstream. Spec says, when CUR_TEMP_TJ_SEL == 3 and CUR_TEMP_RANGE_SEL == 0, it should use RangeUnadjusted is 0, which is (CurTmp*0.125 -49) C. The CUR_TEMP register is read-write when CUR_TEMP_TJ_SEL == 3 (bit 17-16). Add the check to detect it. Sensors command's output before the patch. $sensors k10temp-pci-00c3 Adapter: PCI adapter Tctl: +76.6°C <- Wrong value Tccd1: +26.5°C Tccd2: +27.5°C Tccd3: +27.2°C Tccd4: +27.5°C Tccd5: +26.0°C Tccd6: +26.2°C Tccd7: +25.0°C Tccd8: +26.5°C Sensors command's output after the patch. $sensors k10temp-pci-00c3 Adapter: PCI adapter Tctl: +28.8°C <- corrected value Tccd1: +27.5°C Tccd2: +28.5°C Tccd3: +28.5°C Tccd4: +28.5°C Tccd5: +27.0°C Tccd6: +27.5°C Tccd7: +27.0°C Tccd8: +27.5°C Signed-off-by: Babu Moger Fixes: 1b59788979ac ("hwmon: (k10temp) Add temperature offset for Ryzen 2700X") Link: https://lore.kernel.org/r/20230413213958.847634-1-babu.moger@amd.com Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman --- drivers/hwmon/k10temp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c index 3bc2551577a3..74a67089ad07 100644 --- a/drivers/hwmon/k10temp.c +++ b/drivers/hwmon/k10temp.c @@ -74,6 +74,7 @@ static DEFINE_MUTEX(nb_smu_ind_mutex); #define ZEN_CUR_TEMP_SHIFT 21 #define ZEN_CUR_TEMP_RANGE_SEL_MASK BIT(19) +#define ZEN_CUR_TEMP_TJ_SEL_MASK GENMASK(17, 16) #define ZEN_SVI_BASE 0x0005A000 @@ -173,7 +174,8 @@ static long get_raw_temp(struct k10temp_data *data) data->read_tempreg(data->pdev, ®val); temp = (regval >> ZEN_CUR_TEMP_SHIFT) * 125; - if (regval & data->temp_adjust_mask) + if ((regval & data->temp_adjust_mask) || + (regval & ZEN_CUR_TEMP_TJ_SEL_MASK) == ZEN_CUR_TEMP_TJ_SEL_MASK) temp -= 49000; return temp; } From aed39acf7ed6beb8f671d2b23709a863aed565e8 Mon Sep 17 00:00:00 2001 From: Chris Packham Date: Wed, 19 Apr 2023 11:36:55 +1200 Subject: [PATCH 19/29] hwmon: (adt7475) Use device_property APIs when configuring polarity MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 2a8e41ad337508fc5d598c0f9288890214f8e318 upstream. On DT unaware platforms of_property_read_u32_array() returns -ENOSYS which wasn't handled by the code treating adi,pwm-active-state as optional. Update the code to use device_property_read_u32_array() which deals gracefully with DT unaware platforms. Fixes: 86da28eed4fb ("hwmon: (adt7475) Add support for inverting pwm output") Reported-by: Mariusz Białończyk Link: https://lore.kernel.org/linux-hwmon/52e26a67-9131-2dc0-40cb-db5c07370027@alliedtelesis.co.nz/T/#mdd0505801e0a4e72340de009a47c0fca4f771ed3 Signed-off-by: Chris Packham Link: https://lore.kernel.org/r/20230418233656.869055-2-chris.packham@alliedtelesis.co.nz Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman --- drivers/hwmon/adt7475.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index 6b84822e7d93..22e314725def 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -1515,9 +1515,9 @@ static int adt7475_set_pwm_polarity(struct i2c_client *client) int ret, i; u8 val; - ret = of_property_read_u32_array(client->dev.of_node, - "adi,pwm-active-state", states, - ARRAY_SIZE(states)); + ret = device_property_read_u32_array(&client->dev, + "adi,pwm-active-state", states, + ARRAY_SIZE(states)); if (ret) return ret; From 7c5811b95c573d0e080f9fd8d708fc2a706cf784 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 17 Apr 2023 15:37:55 +0200 Subject: [PATCH 20/29] posix-cpu-timers: Implement the missing timer_wait_running callback commit f7abf14f0001a5a47539d9f60bbdca649e43536b upstream. For some unknown reason the introduction of the timer_wait_running callback missed to fixup posix CPU timers, which went unnoticed for almost four years. Marco reported recently that the WARN_ON() in timer_wait_running() triggers with a posix CPU timer test case. Posix CPU timers have two execution models for expiring timers depending on CONFIG_POSIX_CPU_TIMERS_TASK_WORK: 1) If not enabled, the expiry happens in hard interrupt context so spin waiting on the remote CPU is reasonably time bound. Implement an empty stub function for that case. 2) If enabled, the expiry happens in task work before returning to user space or guest mode. The expired timers are marked as firing and moved from the timer queue to a local list head with sighand lock held. Once the timers are moved, sighand lock is dropped and the expiry happens in fully preemptible context. That means the expiring task can be scheduled out, migrated, interrupted etc. So spin waiting on it is more than suboptimal. The timer wheel has a timer_wait_running() mechanism for RT, which uses a per CPU timer-base expiry lock which is held by the expiry code and the task waiting for the timer function to complete blocks on that lock. This does not work in the same way for posix CPU timers as there is no timer base and expiry for process wide timers can run on any task belonging to that process, but the concept of waiting on an expiry lock can be used too in a slightly different way: - Add a mutex to struct posix_cputimers_work. This struct is per task and used to schedule the expiry task work from the timer interrupt. - Add a task_struct pointer to struct cpu_timer which is used to store a the task which runs the expiry. That's filled in when the task moves the expired timers to the local expiry list. That's not affecting the size of the k_itimer union as there are bigger union members already - Let the task take the expiry mutex around the expiry function - Let the waiter acquire a task reference with rcu_read_lock() held and block on the expiry mutex This avoids spin-waiting on a task which might not even be on a CPU and works nicely for RT too. Fixes: ec8f954a40da ("posix-timers: Use a callback for cancel synchronization on PREEMPT_RT") Reported-by: Marco Elver Signed-off-by: Thomas Gleixner Tested-by: Marco Elver Tested-by: Sebastian Andrzej Siewior Reviewed-by: Frederic Weisbecker Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87zg764ojw.ffs@tglx Signed-off-by: Greg Kroah-Hartman --- include/linux/posix-timers.h | 17 ++++--- kernel/time/posix-cpu-timers.c | 81 ++++++++++++++++++++++++++++------ kernel/time/posix-timers.c | 4 ++ 3 files changed, 82 insertions(+), 20 deletions(-) diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h index 913aa60228b1..8e284161b65e 100644 --- a/include/linux/posix-timers.h +++ b/include/linux/posix-timers.h @@ -4,6 +4,7 @@ #include #include +#include #include #include #include @@ -63,16 +64,18 @@ static inline int clockid_to_fd(const clockid_t clk) * cpu_timer - Posix CPU timer representation for k_itimer * @node: timerqueue node to queue in the task/sig * @head: timerqueue head on which this timer is queued - * @task: Pointer to target task + * @pid: Pointer to target task PID * @elist: List head for the expiry list * @firing: Timer is currently firing + * @handling: Pointer to the task which handles expiry */ struct cpu_timer { - struct timerqueue_node node; - struct timerqueue_head *head; - struct pid *pid; - struct list_head elist; - int firing; + struct timerqueue_node node; + struct timerqueue_head *head; + struct pid *pid; + struct list_head elist; + int firing; + struct task_struct __rcu *handling; }; static inline bool cpu_timer_enqueue(struct timerqueue_head *head, @@ -129,10 +132,12 @@ struct posix_cputimers { /** * posix_cputimers_work - Container for task work based posix CPU timer expiry * @work: The task work to be scheduled + * @mutex: Mutex held around expiry in context of this task work * @scheduled: @work has been scheduled already, no further processing */ struct posix_cputimers_work { struct callback_head work; + struct mutex mutex; unsigned int scheduled; }; diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c index 5d76edd0ad9c..bede1e608d95 100644 --- a/kernel/time/posix-cpu-timers.c +++ b/kernel/time/posix-cpu-timers.c @@ -782,6 +782,8 @@ static u64 collect_timerqueue(struct timerqueue_head *head, return expires; ctmr->firing = 1; + /* See posix_cpu_timer_wait_running() */ + rcu_assign_pointer(ctmr->handling, current); cpu_timer_dequeue(ctmr); list_add_tail(&ctmr->elist, firing); } @@ -1097,7 +1099,49 @@ static void handle_posix_cpu_timers(struct task_struct *tsk); #ifdef CONFIG_POSIX_CPU_TIMERS_TASK_WORK static void posix_cpu_timers_work(struct callback_head *work) { + struct posix_cputimers_work *cw = container_of(work, typeof(*cw), work); + + mutex_lock(&cw->mutex); handle_posix_cpu_timers(current); + mutex_unlock(&cw->mutex); +} + +/* + * Invoked from the posix-timer core when a cancel operation failed because + * the timer is marked firing. The caller holds rcu_read_lock(), which + * protects the timer and the task which is expiring it from being freed. + */ +static void posix_cpu_timer_wait_running(struct k_itimer *timr) +{ + struct task_struct *tsk = rcu_dereference(timr->it.cpu.handling); + + /* Has the handling task completed expiry already? */ + if (!tsk) + return; + + /* Ensure that the task cannot go away */ + get_task_struct(tsk); + /* Now drop the RCU protection so the mutex can be locked */ + rcu_read_unlock(); + /* Wait on the expiry mutex */ + mutex_lock(&tsk->posix_cputimers_work.mutex); + /* Release it immediately again. */ + mutex_unlock(&tsk->posix_cputimers_work.mutex); + /* Drop the task reference. */ + put_task_struct(tsk); + /* Relock RCU so the callsite is balanced */ + rcu_read_lock(); +} + +static void posix_cpu_timer_wait_running_nsleep(struct k_itimer *timr) +{ + /* Ensure that timr->it.cpu.handling task cannot go away */ + rcu_read_lock(); + spin_unlock_irq(&timr->it_lock); + posix_cpu_timer_wait_running(timr); + rcu_read_unlock(); + /* @timr is on stack and is valid */ + spin_lock_irq(&timr->it_lock); } /* @@ -1113,6 +1157,7 @@ void clear_posix_cputimers_work(struct task_struct *p) sizeof(p->posix_cputimers_work.work)); init_task_work(&p->posix_cputimers_work.work, posix_cpu_timers_work); + mutex_init(&p->posix_cputimers_work.mutex); p->posix_cputimers_work.scheduled = false; } @@ -1191,6 +1236,18 @@ static inline void __run_posix_cpu_timers(struct task_struct *tsk) lockdep_posixtimer_exit(); } +static void posix_cpu_timer_wait_running(struct k_itimer *timr) +{ + cpu_relax(); +} + +static void posix_cpu_timer_wait_running_nsleep(struct k_itimer *timr) +{ + spin_unlock_irq(&timr->it_lock); + cpu_relax(); + spin_lock_irq(&timr->it_lock); +} + static inline bool posix_cpu_timers_work_scheduled(struct task_struct *tsk) { return false; @@ -1299,6 +1356,8 @@ static void handle_posix_cpu_timers(struct task_struct *tsk) */ if (likely(cpu_firing >= 0)) cpu_timer_fire(timer); + /* See posix_cpu_timer_wait_running() */ + rcu_assign_pointer(timer->it.cpu.handling, NULL); spin_unlock(&timer->it_lock); } } @@ -1434,23 +1493,16 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags, expires = cpu_timer_getexpires(&timer.it.cpu); error = posix_cpu_timer_set(&timer, 0, &zero_it, &it); if (!error) { - /* - * Timer is now unarmed, deletion can not fail. - */ + /* Timer is now unarmed, deletion can not fail. */ posix_cpu_timer_del(&timer); + } else { + while (error == TIMER_RETRY) { + posix_cpu_timer_wait_running_nsleep(&timer); + error = posix_cpu_timer_del(&timer); + } } - spin_unlock_irq(&timer.it_lock); - while (error == TIMER_RETRY) { - /* - * We need to handle case when timer was or is in the - * middle of firing. In other cases we already freed - * resources. - */ - spin_lock_irq(&timer.it_lock); - error = posix_cpu_timer_del(&timer); - spin_unlock_irq(&timer.it_lock); - } + spin_unlock_irq(&timer.it_lock); if ((it.it_value.tv_sec | it.it_value.tv_nsec) == 0) { /* @@ -1560,6 +1612,7 @@ const struct k_clock clock_posix_cpu = { .timer_del = posix_cpu_timer_del, .timer_get = posix_cpu_timer_get, .timer_rearm = posix_cpu_timer_rearm, + .timer_wait_running = posix_cpu_timer_wait_running, }; const struct k_clock clock_process = { diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c index 724ca7eb1a6e..d089627f2f2b 100644 --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -846,6 +846,10 @@ static struct k_itimer *timer_wait_running(struct k_itimer *timer, rcu_read_lock(); unlock_timer(timer, *flags); + /* + * kc->timer_wait_running() might drop RCU lock. So @timer + * cannot be touched anymore after the function returns! + */ if (!WARN_ON_ONCE(!kc->timer_wait_running)) kc->timer_wait_running(timer); From 68494eb75f1fd1366a969a8374fd72cfb9dacadb Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Wed, 14 Jul 2021 13:06:38 -0300 Subject: [PATCH 21/29] perf sched: Cast PTHREAD_STACK_MIN to int as it may turn into sysconf(__SC_THREAD_STACK_MIN_VALUE) commit d08c84e01afa7a7eee6badab25d5420fa847f783 upstream. In fedora rawhide the PTHREAD_STACK_MIN define may end up expanded to a sysconf() call, and that will return 'long int', breaking the build: 45 fedora:rawhide : FAIL gcc version 11.1.1 20210623 (Red Hat 11.1.1-6) (GCC) builtin-sched.c: In function 'create_tasks': /git/perf-5.14.0-rc1/tools/include/linux/kernel.h:43:24: error: comparison of distinct pointer types lacks a cast [-Werror] 43 | (void) (&_max1 == &_max2); \ | ^~ builtin-sched.c:673:34: note: in expansion of macro 'max' 673 | (size_t) max(16 * 1024, PTHREAD_STACK_MIN)); | ^~~ cc1: all warnings being treated as errors $ grep __sysconf /usr/include/*/*.h /usr/include/bits/pthread_stack_min-dynamic.h:extern long int __sysconf (int __name) __THROW; /usr/include/bits/pthread_stack_min-dynamic.h:# define PTHREAD_STACK_MIN __sysconf (__SC_THREAD_STACK_MIN_VALUE) /usr/include/bits/time.h:extern long int __sysconf (int); /usr/include/bits/time.h:# define CLK_TCK ((__clock_t) __sysconf (2)) /* 2 is _SC_CLK_TCK */ $ So cast it to int to cope with that. Signed-off-by: Arnaldo Carvalho de Melo Cc: Guenter Roeck Signed-off-by: Greg Kroah-Hartman --- tools/perf/builtin-sched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index d3b5f5faf8c1..02e5774cabb6 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -670,7 +670,7 @@ static void create_tasks(struct perf_sched *sched) err = pthread_attr_init(&attr); BUG_ON(err); err = pthread_attr_setstacksize(&attr, - (size_t) max(16 * 1024, PTHREAD_STACK_MIN)); + (size_t) max(16 * 1024, (int)PTHREAD_STACK_MIN)); BUG_ON(err); err = pthread_mutex_lock(&sched->start_work_mutex); BUG_ON(err); From 874bdf43b4a7dc5463c31508f62b3e42eb237b08 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Wed, 3 May 2023 21:09:39 -0700 Subject: [PATCH 22/29] blk-mq: release crypto keyslot before reporting I/O complete commit 9cd1e566676bbcb8a126acd921e4e194e6339603 upstream. Once all I/O using a blk_crypto_key has completed, filesystems can call blk_crypto_evict_key(). However, the block layer currently doesn't call blk_crypto_put_keyslot() until the request is being freed, which happens after upper layers have been told (via bio_endio()) the I/O has completed. This causes a race condition where blk_crypto_evict_key() can see 'slot_refs != 0' without there being an actual bug. This makes __blk_crypto_evict_key() hit the 'WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)' and return without doing anything, eventually causing a use-after-free in blk_crypto_reprogram_all_keys(). (This is a very rare bug and has only been seen when per-file keys are being used with fscrypt.) There are two options to fix this: either release the keyslot before bio_endio() is called on the request's last bio, or make __blk_crypto_evict_key() ignore slot_refs. Let's go with the first solution, since it preserves the ability to report bugs (via WARN_ON_ONCE) where a key is evicted while still in-use. Fixes: a892c8d52c02 ("block: Inline encryption support for blk-mq") Cc: stable@vger.kernel.org Reviewed-by: Nathan Huckleberry Reviewed-by: Christoph Hellwig Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- block/blk-core.c | 7 +++++++ block/blk-crypto-internal.h | 25 +++++++++++++++++++++---- block/blk-crypto.c | 24 ++++++++++++------------ block/blk-merge.c | 2 ++ block/blk-mq.c | 2 +- 5 files changed, 43 insertions(+), 17 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 9afb79b322fb..d0d0dd8151f7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1444,6 +1444,13 @@ bool blk_update_request(struct request *req, blk_status_t error, req->q->integrity.profile->complete_fn(req, nr_bytes); #endif + /* + * Upper layers may call blk_crypto_evict_key() anytime after the last + * bio_endio(). Therefore, the keyslot must be released before that. + */ + if (blk_crypto_rq_has_keyslot(req) && nr_bytes >= blk_rq_bytes(req)) + __blk_crypto_rq_put_keyslot(req); + if (unlikely(error && !blk_rq_is_passthrough(req) && !(req->rq_flags & RQF_QUIET))) print_req_error(req, error, __func__); diff --git a/block/blk-crypto-internal.h b/block/blk-crypto-internal.h index 0d36aae538d7..8e0834557620 100644 --- a/block/blk-crypto-internal.h +++ b/block/blk-crypto-internal.h @@ -60,6 +60,11 @@ static inline bool blk_crypto_rq_is_encrypted(struct request *rq) return rq->crypt_ctx; } +static inline bool blk_crypto_rq_has_keyslot(struct request *rq) +{ + return rq->crypt_keyslot; +} + #else /* CONFIG_BLK_INLINE_ENCRYPTION */ static inline bool bio_crypt_rq_ctx_compatible(struct request *rq, @@ -93,6 +98,11 @@ static inline bool blk_crypto_rq_is_encrypted(struct request *rq) return false; } +static inline bool blk_crypto_rq_has_keyslot(struct request *rq) +{ + return false; +} + #endif /* CONFIG_BLK_INLINE_ENCRYPTION */ void __bio_crypt_advance(struct bio *bio, unsigned int bytes); @@ -127,14 +137,21 @@ static inline bool blk_crypto_bio_prep(struct bio **bio_ptr) return true; } -blk_status_t __blk_crypto_init_request(struct request *rq); -static inline blk_status_t blk_crypto_init_request(struct request *rq) +blk_status_t __blk_crypto_rq_get_keyslot(struct request *rq); +static inline blk_status_t blk_crypto_rq_get_keyslot(struct request *rq) { if (blk_crypto_rq_is_encrypted(rq)) - return __blk_crypto_init_request(rq); + return __blk_crypto_rq_get_keyslot(rq); return BLK_STS_OK; } +void __blk_crypto_rq_put_keyslot(struct request *rq); +static inline void blk_crypto_rq_put_keyslot(struct request *rq) +{ + if (blk_crypto_rq_has_keyslot(rq)) + __blk_crypto_rq_put_keyslot(rq); +} + void __blk_crypto_free_request(struct request *rq); static inline void blk_crypto_free_request(struct request *rq) { @@ -173,7 +190,7 @@ static inline blk_status_t blk_crypto_insert_cloned_request(struct request *rq) { if (blk_crypto_rq_is_encrypted(rq)) - return blk_crypto_init_request(rq); + return blk_crypto_rq_get_keyslot(rq); return BLK_STS_OK; } diff --git a/block/blk-crypto.c b/block/blk-crypto.c index 5ffa9aab49de..0506adfd9ca6 100644 --- a/block/blk-crypto.c +++ b/block/blk-crypto.c @@ -216,26 +216,26 @@ static bool bio_crypt_check_alignment(struct bio *bio) return true; } -blk_status_t __blk_crypto_init_request(struct request *rq) +blk_status_t __blk_crypto_rq_get_keyslot(struct request *rq) { return blk_ksm_get_slot_for_key(rq->q->ksm, rq->crypt_ctx->bc_key, &rq->crypt_keyslot); } -/** - * __blk_crypto_free_request - Uninitialize the crypto fields of a request. - * - * @rq: The request whose crypto fields to uninitialize. - * - * Completely uninitializes the crypto fields of a request. If a keyslot has - * been programmed into some inline encryption hardware, that keyslot is - * released. The rq->crypt_ctx is also freed. - */ -void __blk_crypto_free_request(struct request *rq) +void __blk_crypto_rq_put_keyslot(struct request *rq) { blk_ksm_put_slot(rq->crypt_keyslot); + rq->crypt_keyslot = NULL; +} + +void __blk_crypto_free_request(struct request *rq) +{ + /* The keyslot, if one was needed, should have been released earlier. */ + if (WARN_ON_ONCE(rq->crypt_keyslot)) + __blk_crypto_rq_put_keyslot(rq); + mempool_free(rq->crypt_ctx, bio_crypt_ctx_pool); - blk_crypto_rq_set_defaults(rq); + rq->crypt_ctx = NULL; } /** diff --git a/block/blk-merge.c b/block/blk-merge.c index fbba277364f0..f3b016b31af8 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -801,6 +801,8 @@ static struct request *attempt_merge(struct request_queue *q, if (!blk_discard_mergable(req)) elv_merge_requests(q, req, next); + blk_crypto_rq_put_keyslot(next); + /* * 'next' is going away, so update stats accordingly */ diff --git a/block/blk-mq.c b/block/blk-mq.c index cf66de0f00fd..e153a36c9ba3 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2193,7 +2193,7 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) blk_mq_bio_to_request(rq, bio, nr_segs); - ret = blk_crypto_init_request(rq); + ret = blk_crypto_rq_get_keyslot(rq); if (ret != BLK_STS_OK) { bio->bi_status = ret; bio_endio(bio); From 5072008bef2386237dab112dd0770bd82b7a46c1 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Wed, 3 May 2023 21:09:40 -0700 Subject: [PATCH 23/29] blk-crypto: make blk_crypto_evict_key() return void commit 70493a63ba04f754f7a7dd53a4fcc82700181490 upstream. blk_crypto_evict_key() is only called in contexts such as inode eviction where failure is not an option. So there is nothing the caller can do with errors except log them. (dm-table.c does "use" the error code, but only to pass on to upper layers, so it doesn't really count.) Just make blk_crypto_evict_key() return void and log errors itself. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers Reviewed-by: Christoph Hellwig Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- block/blk-crypto.c | 22 ++++++++++------------ include/linux/blk-crypto.h | 4 ++-- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/block/blk-crypto.c b/block/blk-crypto.c index 0506adfd9ca6..d8c48ee44ba6 100644 --- a/block/blk-crypto.c +++ b/block/blk-crypto.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include "blk-crypto-internal.h" @@ -393,19 +394,16 @@ int blk_crypto_start_using_key(const struct blk_crypto_key *key, * Upper layers (filesystems) must call this function to ensure that a key is * evicted from any hardware that it might have been programmed into. The key * must not be in use by any in-flight IO when this function is called. - * - * Return: 0 on success or if key is not present in the q's ksm, -err on error. */ -int blk_crypto_evict_key(struct request_queue *q, - const struct blk_crypto_key *key) +void blk_crypto_evict_key(struct request_queue *q, + const struct blk_crypto_key *key) { - if (blk_ksm_crypto_cfg_supported(q->ksm, &key->crypto_cfg)) - return blk_ksm_evict_key(q->ksm, key); + int err; - /* - * If the request queue's associated inline encryption hardware didn't - * have support for the key, then the key might have been programmed - * into the fallback keyslot manager, so try to evict from there. - */ - return blk_crypto_fallback_evict_key(key); + if (blk_ksm_crypto_cfg_supported(q->ksm, &key->crypto_cfg)) + err = blk_ksm_evict_key(q->ksm, key); + else + err = blk_crypto_fallback_evict_key(key); + if (err) + pr_warn_ratelimited("error %d evicting key\n", err); } diff --git a/include/linux/blk-crypto.h b/include/linux/blk-crypto.h index 69b24fe92cbf..5e96bad54804 100644 --- a/include/linux/blk-crypto.h +++ b/include/linux/blk-crypto.h @@ -97,8 +97,8 @@ int blk_crypto_init_key(struct blk_crypto_key *blk_key, const u8 *raw_key, int blk_crypto_start_using_key(const struct blk_crypto_key *key, struct request_queue *q); -int blk_crypto_evict_key(struct request_queue *q, - const struct blk_crypto_key *key); +void blk_crypto_evict_key(struct request_queue *q, + const struct blk_crypto_key *key); bool blk_crypto_config_supported(struct request_queue *q, const struct blk_crypto_config *cfg); From 701a8220762ff90615dc91d3543f789391b63298 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Wed, 3 May 2023 21:09:41 -0700 Subject: [PATCH 24/29] blk-crypto: make blk_crypto_evict_key() more robust commit 5c7cb94452901a93e90c2230632e2c12a681bc92 upstream. If blk_crypto_evict_key() sees that the key is still in-use (due to a bug) or that ->keyslot_evict failed, it currently just returns while leaving the key linked into the keyslot management structures. However, blk_crypto_evict_key() is only called in contexts such as inode eviction where failure is not an option. So actually the caller proceeds with freeing the blk_crypto_key regardless of the return value of blk_crypto_evict_key(). These two assumptions don't match, and the result is that there can be a use-after-free in blk_crypto_reprogram_all_keys() after one of these errors occurs. (Note, these errors *shouldn't* happen; we're just talking about what happens if they do anyway.) Fix this by making blk_crypto_evict_key() unlink the key from the keyslot management structures even on failure. Also improve some comments. Fixes: 1b2628397058 ("block: Keyslot Manager for Inline Encryption") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers Reviewed-by: Christoph Hellwig Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman --- block/blk-crypto.c | 29 +++++++++++++++++++-------- block/keyslot-manager.c | 43 ++++++++++++++++++++--------------------- 2 files changed, 42 insertions(+), 30 deletions(-) diff --git a/block/blk-crypto.c b/block/blk-crypto.c index d8c48ee44ba6..87ec55d4354f 100644 --- a/block/blk-crypto.c +++ b/block/blk-crypto.c @@ -385,15 +385,20 @@ int blk_crypto_start_using_key(const struct blk_crypto_key *key, } /** - * blk_crypto_evict_key() - Evict a key from any inline encryption hardware - * it may have been programmed into - * @q: The request queue who's associated inline encryption hardware this key - * might have been programmed into - * @key: The key to evict + * blk_crypto_evict_key() - Evict a blk_crypto_key from a request_queue + * @q: a request_queue on which I/O using the key may have been done + * @key: the key to evict * - * Upper layers (filesystems) must call this function to ensure that a key is - * evicted from any hardware that it might have been programmed into. The key - * must not be in use by any in-flight IO when this function is called. + * For a given request_queue, this function removes the given blk_crypto_key + * from the keyslot management structures and evicts it from any underlying + * hardware keyslot(s) or blk-crypto-fallback keyslot it may have been + * programmed into. + * + * Upper layers must call this before freeing the blk_crypto_key. It must be + * called for every request_queue the key may have been used on. The key must + * no longer be in use by any I/O when this function is called. + * + * Context: May sleep. */ void blk_crypto_evict_key(struct request_queue *q, const struct blk_crypto_key *key) @@ -404,6 +409,14 @@ void blk_crypto_evict_key(struct request_queue *q, err = blk_ksm_evict_key(q->ksm, key); else err = blk_crypto_fallback_evict_key(key); + /* + * An error can only occur here if the key failed to be evicted from a + * keyslot (due to a hardware or driver issue) or is allegedly still in + * use by I/O (due to a kernel bug). Even in these cases, the key is + * still unlinked from the keyslot management structures, and the caller + * is allowed and expected to free it right away. There's nothing + * callers can do to handle errors, so just log them and return void. + */ if (err) pr_warn_ratelimited("error %d evicting key\n", err); } diff --git a/block/keyslot-manager.c b/block/keyslot-manager.c index 86f8195d8039..17a1f1ba44ef 100644 --- a/block/keyslot-manager.c +++ b/block/keyslot-manager.c @@ -305,44 +305,43 @@ bool blk_ksm_crypto_cfg_supported(struct blk_keyslot_manager *ksm, return true; } -/** - * blk_ksm_evict_key() - Evict a key from the lower layer device. - * @ksm: The keyslot manager to evict from - * @key: The key to evict - * - * Find the keyslot that the specified key was programmed into, and evict that - * slot from the lower layer device. The slot must not be in use by any - * in-flight IO when this function is called. - * - * Context: Process context. Takes and releases ksm->lock. - * Return: 0 on success or if there's no keyslot with the specified key, -EBUSY - * if the keyslot is still in use, or another -errno value on other - * error. +/* + * This is an internal function that evicts a key from an inline encryption + * device that can be either a real device or the blk-crypto-fallback "device". + * It is used only by blk_crypto_evict_key(); see that function for details. */ int blk_ksm_evict_key(struct blk_keyslot_manager *ksm, const struct blk_crypto_key *key) { struct blk_ksm_keyslot *slot; - int err = 0; + int err; blk_ksm_hw_enter(ksm); slot = blk_ksm_find_keyslot(ksm, key); - if (!slot) - goto out_unlock; + if (!slot) { + /* + * Not an error, since a key not in use by I/O is not guaranteed + * to be in a keyslot. There can be more keys than keyslots. + */ + err = 0; + goto out; + } if (WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)) { + /* BUG: key is still in use by I/O */ err = -EBUSY; - goto out_unlock; + goto out_remove; } err = ksm->ksm_ll_ops.keyslot_evict(ksm, key, blk_ksm_get_slot_idx(slot)); - if (err) - goto out_unlock; - +out_remove: + /* + * Callers free the key even on error, so unlink the key from the hash + * table and clear slot->key even on error. + */ hlist_del(&slot->hash_node); slot->key = NULL; - err = 0; -out_unlock: +out: blk_ksm_hw_exit(ksm); return err; } From c8714ddf3ccf94a50574ba5db2e2128eb6e77b75 Mon Sep 17 00:00:00 2001 From: Harshad Shirwadkar Date: Thu, 23 Dec 2021 12:21:37 -0800 Subject: [PATCH 25/29] ext4: use ext4_journal_start/stop for fast commit transactions commit 2729cfdcfa1cc49bef5a90d046fa4a187fdfcc69 upstream. This patch drops all calls to ext4_fc_start_update() and ext4_fc_stop_update(). To ensure that there are no ongoing journal updates during fast commit, we also make jbd2_fc_begin_commit() lock journal for updates. This way we don't have to maintain two different transaction start stop APIs for fast commit and full commit. This patch doesn't remove the functions altogether since in future we want to have inode level locking for fast commits. Signed-off-by: Harshad Shirwadkar Link: https://lore.kernel.org/r/20211223202140.2061101-2-harshads@google.com Signed-off-by: Theodore Ts'o Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman --- fs/ext4/acl.c | 2 -- fs/ext4/extents.c | 2 -- fs/ext4/file.c | 4 ---- fs/ext4/inode.c | 7 +------ fs/ext4/ioctl.c | 8 +------- fs/jbd2/journal.c | 2 ++ 6 files changed, 4 insertions(+), 21 deletions(-) diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c index 68aaed48315f..76f634d185f1 100644 --- a/fs/ext4/acl.c +++ b/fs/ext4/acl.c @@ -242,7 +242,6 @@ retry: handle = ext4_journal_start(inode, EXT4_HT_XATTR, credits); if (IS_ERR(handle)) return PTR_ERR(handle); - ext4_fc_start_update(inode); if ((type == ACL_TYPE_ACCESS) && acl) { error = posix_acl_update_mode(inode, &mode, &acl); @@ -260,7 +259,6 @@ retry: } out_stop: ext4_journal_stop(handle); - ext4_fc_stop_update(inode); if (error == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) goto retry; return error; diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index bf0872bb34f6..6c06ce9dd6bd 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4694,7 +4694,6 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) FALLOC_FL_INSERT_RANGE)) return -EOPNOTSUPP; - ext4_fc_start_update(inode); inode_lock(inode); ret = ext4_convert_inline_data(inode); inode_unlock(inode); @@ -4764,7 +4763,6 @@ out: inode_unlock(inode); trace_ext4_fallocate_exit(inode, offset, max_blocks, ret); exit: - ext4_fc_stop_update(inode); return ret; } diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 0f61e0aa85d6..f42cc1fe0ba1 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -260,7 +260,6 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, if (iocb->ki_flags & IOCB_NOWAIT) return -EOPNOTSUPP; - ext4_fc_start_update(inode); inode_lock(inode); ret = ext4_write_checks(iocb, from); if (ret <= 0) @@ -272,7 +271,6 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, out: inode_unlock(inode); - ext4_fc_stop_update(inode); if (likely(ret > 0)) { iocb->ki_pos += ret; ret = generic_write_sync(iocb, ret); @@ -559,9 +557,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) goto out; } - ext4_fc_start_update(inode); ret = ext4_orphan_add(handle, inode); - ext4_fc_stop_update(inode); if (ret) { ext4_journal_stop(handle); goto out; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 9bd5f8b0511b..a93b93de5a60 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5437,7 +5437,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) if (error) return error; } - ext4_fc_start_update(inode); + if ((ia_valid & ATTR_UID && !uid_eq(attr->ia_uid, inode->i_uid)) || (ia_valid & ATTR_GID && !gid_eq(attr->ia_gid, inode->i_gid))) { handle_t *handle; @@ -5461,7 +5461,6 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) if (error) { ext4_journal_stop(handle); - ext4_fc_stop_update(inode); return error; } /* Update corresponding info in inode so that everything is in @@ -5473,7 +5472,6 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) error = ext4_mark_inode_dirty(handle, inode); ext4_journal_stop(handle); if (unlikely(error)) { - ext4_fc_stop_update(inode); return error; } } @@ -5488,12 +5486,10 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); if (attr->ia_size > sbi->s_bitmap_maxbytes) { - ext4_fc_stop_update(inode); return -EFBIG; } } if (!S_ISREG(inode->i_mode)) { - ext4_fc_stop_update(inode); return -EINVAL; } @@ -5619,7 +5615,6 @@ err_out: ext4_std_error(inode->i_sb, error); if (!error) error = rc; - ext4_fc_stop_update(inode); return error; } diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 53bdc67a815f..1171618f6549 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -1322,13 +1322,7 @@ out: long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { - long ret; - - ext4_fc_start_update(file_inode(filp)); - ret = __ext4_ioctl(filp, cmd, arg); - ext4_fc_stop_update(file_inode(filp)); - - return ret; + return __ext4_ioctl(filp, cmd, arg); } #ifdef CONFIG_COMPAT diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 6689d235de8a..fee325d62bfd 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -757,6 +757,7 @@ int jbd2_fc_begin_commit(journal_t *journal, tid_t tid) } journal->j_flags |= JBD2_FAST_COMMIT_ONGOING; write_unlock(&journal->j_state_lock); + jbd2_journal_lock_updates(journal); return 0; } @@ -768,6 +769,7 @@ EXPORT_SYMBOL(jbd2_fc_begin_commit); */ static int __jbd2_fc_end_commit(journal_t *journal, tid_t tid, bool fallback) { + jbd2_journal_unlock_updates(journal); if (journal->j_fc_cleanup_callback) journal->j_fc_cleanup_callback(journal, 0); write_lock(&journal->j_state_lock); From a863ac03fae0c07e9eba8129fc4b8dcabe0ad670 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Nuno=20S=C3=A1?= Date: Mon, 27 Mar 2023 16:54:14 +0200 Subject: [PATCH 26/29] staging: iio: resolver: ads1210: fix config mode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 16313403d873ff17a587818b61f84c8cb4971cef upstream. As stated in the device datasheet [1], bits a0 and a1 have to be set to 1 for the configuration mode. [1]: https://www.analog.com/media/en/technical-documentation/data-sheets/ad2s1210.pdf Fixes: b19e9ad5e2cb9 ("staging:iio:resolver:ad2s1210 general driver cleanup") Cc: stable Signed-off-by: Nuno Sá Link: https://lore.kernel.org/r/20230327145414.1505537-1-nuno.sa@analog.com Signed-off-by: Greg Kroah-Hartman --- drivers/staging/iio/resolver/ad2s1210.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/iio/resolver/ad2s1210.c b/drivers/staging/iio/resolver/ad2s1210.c index 74adb82f37c3..a19cfb2998c9 100644 --- a/drivers/staging/iio/resolver/ad2s1210.c +++ b/drivers/staging/iio/resolver/ad2s1210.c @@ -101,7 +101,7 @@ struct ad2s1210_state { static const int ad2s1210_mode_vals[4][2] = { [MOD_POS] = { 0, 0 }, [MOD_VEL] = { 0, 1 }, - [MOD_CONFIG] = { 1, 0 }, + [MOD_CONFIG] = { 1, 1 }, }; static inline void ad2s1210_set_mode(enum ad2s1210_mode mode, From 29b89908fdd94127e9b6e2f29ecceae82a1314bc Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Wed, 5 Apr 2023 11:03:42 +0200 Subject: [PATCH 27/29] xhci: fix debugfs register accesses while suspended commit 735baf1b23458f71a8b15cb924af22c9ff9cd125 upstream. Wire up the debugfs regset device pointer so that the controller is resumed before accessing registers to avoid crashing or locking up if it happens to be runtime suspended. Fixes: 02b6fdc2a153 ("usb: xhci: Add debugfs interface for xHCI driver") Cc: stable@vger.kernel.org # 4.15: 30332eeefec8: debugfs: regset32: Add Runtime PM support Cc: stable@vger.kernel.org # 4.15 Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20230405090342.7363-1-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-debugfs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/host/xhci-debugfs.c b/drivers/usb/host/xhci-debugfs.c index dc832ddf7033..bd40caeeb21c 100644 --- a/drivers/usb/host/xhci-debugfs.c +++ b/drivers/usb/host/xhci-debugfs.c @@ -133,6 +133,7 @@ static void xhci_debugfs_regset(struct xhci_hcd *xhci, u32 base, regset->regs = regs; regset->nregs = nregs; regset->base = hcd->regs + base; + regset->dev = hcd->self.controller; debugfs_create_regset32((const char *)rgs->name, 0444, parent, regset); } From 2884595932ea07481590b2c10e20fcb729eaaad0 Mon Sep 17 00:00:00 2001 From: "Joel Fernandes (Google)" Date: Tue, 24 Jan 2023 17:31:26 +0000 Subject: [PATCH 28/29] tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem commit 58d7668242647e661a20efe065519abd6454287e upstream. For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined. However, cpu_is_hotpluggable() still returns true for those CPUs. This causes torture tests that do offlining to end up trying to offline this CPU causing test failures. Such failure happens on all architectures. Fix the repeated error messages thrown by this (even if the hotplug errors are harmless) by asking the opinion of the nohz subsystem on whether the CPU can be hotplugged. [ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ] For drivers/base/ portion: Acked-by: Greg Kroah-Hartman Acked-by: Frederic Weisbecker Cc: Frederic Weisbecker Cc: "Paul E. McKenney" Cc: Zhouyi Zhou Cc: Will Deacon Cc: Marc Zyngier Cc: rcu Cc: stable@vger.kernel.org Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel") Signed-off-by: Paul E. McKenney Signed-off-by: Joel Fernandes (Google) Signed-off-by: Greg Kroah-Hartman --- drivers/base/cpu.c | 3 ++- include/linux/tick.h | 2 ++ kernel/time/tick-sched.c | 11 ++++++++--- 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 24dd532c8e5e..33e0526907eb 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -489,7 +489,8 @@ static const struct attribute_group *cpu_root_attr_groups[] = { bool cpu_is_hotpluggable(unsigned cpu) { struct device *dev = get_cpu_device(cpu); - return dev && container_of(dev, struct cpu, dev)->hotpluggable; + return dev && container_of(dev, struct cpu, dev)->hotpluggable + && tick_nohz_cpu_hotpluggable(cpu); } EXPORT_SYMBOL_GPL(cpu_is_hotpluggable); diff --git a/include/linux/tick.h b/include/linux/tick.h index 7340613c7eff..a90a8f7759a2 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -211,6 +211,7 @@ extern void tick_nohz_dep_set_signal(struct signal_struct *signal, enum tick_dep_bits bit); extern void tick_nohz_dep_clear_signal(struct signal_struct *signal, enum tick_dep_bits bit); +extern bool tick_nohz_cpu_hotpluggable(unsigned int cpu); /* * The below are tick_nohz_[set,clear]_dep() wrappers that optimize off-cases @@ -275,6 +276,7 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { } static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { } static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { } +static inline bool tick_nohz_cpu_hotpluggable(unsigned int cpu) { return true; } static inline void tick_dep_set(enum tick_dep_bits bit) { } static inline void tick_dep_clear(enum tick_dep_bits bit) { } diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 92fb738813f3..e4e0d032126b 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -426,7 +426,7 @@ void __init tick_nohz_full_setup(cpumask_var_t cpumask) tick_nohz_full_running = true; } -static int tick_nohz_cpu_down(unsigned int cpu) +bool tick_nohz_cpu_hotpluggable(unsigned int cpu) { /* * The tick_do_timer_cpu CPU handles housekeeping duty (unbound @@ -434,8 +434,13 @@ static int tick_nohz_cpu_down(unsigned int cpu) * CPUs. It must remain online when nohz full is enabled. */ if (tick_nohz_full_running && tick_do_timer_cpu == cpu) - return -EBUSY; - return 0; + return false; + return true; +} + +static int tick_nohz_cpu_down(unsigned int cpu) +{ + return tick_nohz_cpu_hotpluggable(cpu) ? 0 : -EBUSY; } void __init tick_nohz_init(void) From 47e61cadc7a5f3dffd42d2d6fda81be163f1ab82 Mon Sep 17 00:00:00 2001 From: Jiaxun Yang Date: Tue, 11 Apr 2023 12:14:26 +0100 Subject: [PATCH 29/29] MIPS: fw: Allow firmware to pass a empty env commit ee1809ed7bc456a72dc8410b475b73021a3a68d5 upstream. fw_getenv will use env entry to determine style of env, however it is legal for firmware to just pass a empty list. Check if first entry exist before running strchr to avoid null pointer dereference. Cc: stable@vger.kernel.org Link: https://github.com/clbr/n64bootloader/issues/5 Signed-off-by: Jiaxun Yang Signed-off-by: Thomas Bogendoerfer Signed-off-by: Greg Kroah-Hartman --- arch/mips/fw/lib/cmdline.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/mips/fw/lib/cmdline.c b/arch/mips/fw/lib/cmdline.c index f24cbb4a39b5..892765b742bb 100644 --- a/arch/mips/fw/lib/cmdline.c +++ b/arch/mips/fw/lib/cmdline.c @@ -53,7 +53,7 @@ char *fw_getenv(char *envname) { char *result = NULL; - if (_fw_envp != NULL) { + if (_fw_envp != NULL && fw_envp(0) != NULL) { /* * Return a pointer to the given environment variable. * YAMON uses "name", "value" pairs, while U-Boot uses