TWx Linux Repository
Go to file
Douglas Anderson ed62ca2f4f USB: core: Avoid race of async_completed() w/ usbdev_release()
While running reboot tests w/ a specific set of USB devices (and
slub_debug enabled), I found that once every few hours my device would
be crashed with a stack that looked like this:

[   14.012445] BUG: spinlock bad magic on CPU#0, modprobe/2091
[   14.012460]  lock: 0xffffffc0cb055978, .magic: ffffffc0, .owner: cryption contexts: %lu/%lu
[   14.012460] /1025536097, .owner_cpu: 0
[   14.012466] CPU: 0 PID: 2091 Comm: modprobe Not tainted 4.4.79 #352
[   14.012468] Hardware name: Google Kevin (DT)
[   14.012471] Call trace:
[   14.012483] [<....>] dump_backtrace+0x0/0x160
[   14.012487] [<....>] show_stack+0x20/0x28
[   14.012494] [<....>] dump_stack+0xb4/0xf0
[   14.012500] [<....>] spin_dump+0x8c/0x98
[   14.012504] [<....>] spin_bug+0x30/0x3c
[   14.012508] [<....>] do_raw_spin_lock+0x40/0x164
[   14.012515] [<....>] _raw_spin_lock_irqsave+0x64/0x74
[   14.012521] [<....>] __wake_up+0x2c/0x60
[   14.012528] [<....>] async_completed+0x2d0/0x300
[   14.012534] [<....>] __usb_hcd_giveback_urb+0xc4/0x138
[   14.012538] [<....>] usb_hcd_giveback_urb+0x54/0xf0
[   14.012544] [<....>] xhci_irq+0x1314/0x1348
[   14.012548] [<....>] usb_hcd_irq+0x40/0x50
[   14.012553] [<....>] handle_irq_event_percpu+0x1b4/0x3f0
[   14.012556] [<....>] handle_irq_event+0x4c/0x7c
[   14.012561] [<....>] handle_fasteoi_irq+0x158/0x1c8
[   14.012564] [<....>] generic_handle_irq+0x30/0x44
[   14.012568] [<....>] __handle_domain_irq+0x90/0xbc
[   14.012572] [<....>] gic_handle_irq+0xcc/0x18c

Investigation using kgdb() found that the wait queue that was passed
into wake_up() had been freed (it was filled with slub_debug poison).

I analyzed and instrumented the code and reproduced.  My current
belief is that this is happening:

1. async_completed() is called (from IRQ).  Moves "as" onto the
   completed list.
2. On another CPU, proc_reapurbnonblock_compat() calls
   async_getcompleted().  Blocks on spinlock.
3. async_completed() releases the lock; keeps running; gets blocked
   midway through wake_up().
4. proc_reapurbnonblock_compat() => async_getcompleted() gets the
   lock; removes "as" from completed list and frees it.
5. usbdev_release() is called.  Frees "ps".
6. async_completed() finally continues running wake_up().  ...but
   wake_up() has a pointer to the freed "ps".

The instrumentation that led me to believe this was based on adding
some trace_printk() calls in a select few functions and then using
kdb's "ftdump" at crash time.  The trace follows (NOTE: in the trace
below I cheated a little bit and added a udelay(1000) in
async_completed() after releasing the spinlock because I wanted it to
trigger quicker):

<...>-2104   0d.h2 13759034us!: async_completed at start: as=ffffffc0cc638200
mtpd-2055    3.... 13759356us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3d..1 13759362us : async_getcompleted after list_del_init: as=ffffffc0cc638200
mtpd-2055    3.... 13759371us+: proc_reapurbnonblock_compat: free_async(ffffffc0cc638200)
mtpd-2055    3.... 13759422us+: async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759479us : usbdev_release at start: ps=ffffffc0cc042080
mtpd-2055    3.... 13759487us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759497us!: usbdev_release after kfree(ps): ps=ffffffc0cc042080
<...>-2104   0d.h2 13760294us : async_completed before wake_up(): as=ffffffc0cc638200

To fix this problem we can just move the wake_up() under the ps->lock.
There should be no issues there that I'm aware of.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28 11:17:57 +02:00
arch Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2017-08-13 15:34:28 -07:00
block block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time 2017-08-09 20:24:38 -06:00
certs modsign: add markers to endif-statements in certs/Makefile 2017-07-14 11:01:37 +10:00
crypto crypto: authencesn - Fix digest_null crash 2017-07-18 17:01:11 +08:00
Documentation phy: for 4.14 2017-08-22 13:20:12 -07:00
drivers USB: core: Avoid race of async_completed() w/ usbdev_release() 2017-08-28 11:17:57 +02:00
firmware firmware/Makefile: force recompilation if makefile changes 2017-05-08 17:15:10 -07:00
fs Some more NFS client bugfixes for 4.13 2017-08-11 13:54:09 -07:00
include Chipidea changes for v4.14-rc1 2017-08-28 10:50:22 +02:00
init random: do not ignore early device randomness 2017-07-12 16:26:00 -07:00
ipc ipc: add missing container_of()s for randstruct 2017-08-02 17:16:12 -07:00
kernel mm: migrate: prevent racy access to tlb_flush_pending 2017-08-10 15:54:07 -07:00
lib fault-inject: fix wrong should_fail() decision in task context 2017-08-10 15:54:06 -07:00
mm rmap: do not call mmu_notifier_invalidate_page() under ptl 2017-08-10 15:54:07 -07:00
net packet: fix tp_reserve race in packet_set_ring 2017-08-10 09:52:12 -07:00
samples samples/bpf: fix bpf tunnel cleanup 2017-07-31 22:02:47 -07:00
scripts parse-maintainers: Move matching sections from MAINTAINERS 2017-08-08 11:16:14 -07:00
security Now that IPC and other changes have landed, enable manual markings for 2017-07-19 08:55:18 -07:00
sound ASoC: Fixes for v4.13 2017-08-02 17:11:45 +02:00
tools bpf: fix selftest/bpf/test_pkt_md_access on s390x 2017-08-07 10:06:27 -07:00
usr ramfs: clarify help text that compression applies to ramfs as well as legacy ramdisk. 2017-07-06 16:24:30 -07:00
virt KVM/ARM Fixes for v4.13-rc4 2017-08-03 17:59:58 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore kbuild: Add support to generate LLVM assembly files 2017-04-25 08:13:52 +09:00
.mailmap power supply and reset changes for the v4.12 series (part 2) 2017-05-12 12:02:21 -07:00
COPYING
CREDITS avr32: remove support for AVR32 architecture 2017-05-01 09:27:15 +02:00
Kbuild kbuild: Consolidate header generation from ASM offset information 2017-04-13 05:43:37 +09:00
Kconfig
MAINTAINERS phy: for 4.14 2017-08-22 13:20:12 -07:00
Makefile Linux 4.13-rc5 2017-08-13 16:01:32 -07:00
README README: add a new README file, pointing to the Documentation/ 2016-10-24 08:12:35 -02:00

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.