twx-linux/drivers
Vladimir Oltean ce31e6ca68 iommu/arm-smmu: Don't unregister on shutdown
Michael Walle says he noticed the following stack trace while performing
a shutdown with "reboot -f". He suggests he got "lucky" and just hit the
correct spot for the reboot while there was a packet transmission in
flight.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 6.1.0-rc5-00088-gf3600ff8e322 #1930
Hardware name: Kontron KBox A-230-LS (DT)
pc : iommu_get_dma_domain+0x14/0x20
lr : iommu_dma_map_page+0x9c/0x254
Call trace:
 iommu_get_dma_domain+0x14/0x20
 dma_map_page_attrs+0x1ec/0x250
 enetc_start_xmit+0x14c/0x10b0
 enetc_xmit+0x60/0xdc
 dev_hard_start_xmit+0xb8/0x210
 sch_direct_xmit+0x11c/0x420
 __dev_queue_xmit+0x354/0xb20
 ip6_finish_output2+0x280/0x5b0
 __ip6_finish_output+0x15c/0x270
 ip6_output+0x78/0x15c
 NF_HOOK.constprop.0+0x50/0xd0
 mld_sendpack+0x1bc/0x320
 mld_ifc_work+0x1d8/0x4dc
 process_one_work+0x1e8/0x460
 worker_thread+0x178/0x534
 kthread+0xe0/0xe4
 ret_from_fork+0x10/0x20
Code: d503201f f9416800 d503233f d50323bf (f9404c00)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt

This appears to be reproducible when the board has a fixed IP address,
is ping flooded from another host, and "reboot -f" is used.

The following is one more manifestation of the issue:

$ reboot -f
kvm: exiting hardware virtualization
cfg80211: failed to load regulatory.db
arm-smmu 5000000.iommu: disabling translation
sdhci-esdhc 2140000.mmc: Removing from iommu group 11
sdhci-esdhc 2150000.mmc: Removing from iommu group 12
fsl-edma 22c0000.dma-controller: Removing from iommu group 17
dwc3 3100000.usb: Removing from iommu group 9
dwc3 3110000.usb: Removing from iommu group 10
ahci-qoriq 3200000.sata: Removing from iommu group 2
fsl-qdma 8380000.dma-controller: Removing from iommu group 20
platform f080000.display: Removing from iommu group 0
etnaviv-gpu f0c0000.gpu: Removing from iommu group 1
etnaviv etnaviv: Removing from iommu group 1
caam_jr 8010000.jr: Removing from iommu group 13
caam_jr 8020000.jr: Removing from iommu group 14
caam_jr 8030000.jr: Removing from iommu group 15
caam_jr 8040000.jr: Removing from iommu group 16
fsl_enetc 0000:00:00.0: Removing from iommu group 4
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu:         GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000
fsl_enetc 0000:00:00.1: Removing from iommu group 5
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu:         GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu:         GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000
fsl_enetc 0000:00:00.2: Removing from iommu group 6
fsl_enetc_mdio 0000:00:00.3: Removing from iommu group 8
mscc_felix 0000:00:00.5: Removing from iommu group 3
fsl_enetc 0000:00:00.6: Removing from iommu group 7
pcieport 0001:00:00.0: Removing from iommu group 18
arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
arm-smmu 5000000.iommu:         GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000
pcieport 0002:00:00.0: Removing from iommu group 19
Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8
pc : iommu_get_dma_domain+0x14/0x20
lr : iommu_dma_unmap_page+0x38/0xe0
Call trace:
 iommu_get_dma_domain+0x14/0x20
 dma_unmap_page_attrs+0x38/0x1d0
 enetc_unmap_tx_buff.isra.0+0x6c/0x80
 enetc_poll+0x170/0x910
 __napi_poll+0x40/0x1e0
 net_rx_action+0x164/0x37c
 __do_softirq+0x128/0x368
 run_ksoftirqd+0x68/0x90
 smpboot_thread_fn+0x14c/0x190
Code: d503201f f9416800 d503233f d50323bf (f9405400)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

The problem seems to be that iommu_group_remove_device() is allowed to
run with no coordination whatsoever with the shutdown procedure of the
enetc PCI device. In fact, it almost seems as if it implies that the
pci_driver :: shutdown() method is mandatory if DMA is used with an
IOMMU, otherwise this is inevitable. That was never the case; shutdown
methods are optional in device drivers.

This is the call stack that leads to iommu_group_remove_device() during
reboot:

kernel_restart
-> device_shutdown
   -> platform_shutdown
      -> arm_smmu_device_shutdown
         -> arm_smmu_device_remove
            -> iommu_device_unregister
               -> bus_for_each_dev
                  -> remove_iommu_group
                     -> iommu_release_device
                        -> iommu_group_remove_device

I don't know much about the arm_smmu driver, but
arm_smmu_device_shutdown() invoking arm_smmu_device_remove() looks
suspicious, since it causes the IOMMU device to unregister and that's
where everything starts to unravel. It forces all other devices which
depend on IOMMU groups to also point their ->shutdown() to ->remove(),
which will make reboot slower overall.

There are 2 moments relevant to this behavior. First was commit
b06c076ea962 ("Revert "iommu/arm-smmu: Make arm-smmu explicitly
non-modular"") when arm_smmu_device_shutdown() was made to run the exact
same thing as arm_smmu_device_remove(). Prior to that, there was no
iommu_device_unregister() call in arm_smmu_device_shutdown(). However,
that was benign until commit 57365a04c921 ("iommu: Move bus setup to
IOMMU device registration"), which made iommu_device_unregister() call
remove_iommu_group().

Restore the old shutdown behavior by making remove() call shutdown(),
but shutdown() does not call the remove() specific bits.

Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration")
Reported-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20221215141251.3688780-1-vladimir.oltean@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-13 13:46:20 +01:00
..
accel Fix mismerge due to devnode now taking a 'const *' device 2022-12-16 13:04:15 -06:00
accessibility
acpi ACPI fixes for 6.2-rc2 2022-12-30 10:47:25 -08:00
amba ARM updates for 6.2 2022-12-13 15:22:14 -08:00
android
ata ata: ahci: Fix PCS quirk application for suspend 2022-12-27 11:06:57 +09:00
atm treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
auxdisplay
base Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
bcma
block block-2023-01-06 2023-01-06 13:12:42 -08:00
bluetooth treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
bus Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
cdrom
char tpm: Allow system suspend to continue when TPM suspend fails 2023-01-06 14:25:19 -08:00
clk A pile of clk driver updates with a small tracepoint patch to the clk core this 2022-12-13 13:46:07 -08:00
clocksource Updates for timers, timekeeping and drivers: 2022-12-12 12:52:02 -08:00
comedi
connector
counter
cpufreq linux-kselftest-next-6.2-rc1 2022-12-12 16:39:38 -08:00
cpuidle powerpc updates for 6.2 2022-12-19 07:13:33 -06:00
crypto This push fixes a CFI crash in arm64/sm4 as well as a regression 2023-01-06 11:14:11 -08:00
cxl cxl/region: Fix memdev reuse check 2022-12-08 13:03:47 -08:00
dax
dca
devfreq PM / devfreq: event: use devm_platform_get_and_ioremap_resource() 2022-12-05 21:57:20 +09:00
dio
dma dmaengine updates for v6.2 2022-12-19 08:54:17 -06:00
dma-buf Merge drm/drm-fixes into drm-misc-fixes 2023-01-03 08:32:12 +01:00
edac Merge branches 'edac-ghes' and 'edac-misc' into edac-updates-for-v6.2 2022-12-12 15:40:03 +01:00
eisa
extcon Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
firewire
firmware remoteproc updates for v6.2 2022-12-21 09:37:14 -08:00
fpga Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
fsi
gnss
gpio gpio: sifive: Fix refcount leak in sifive_gpio_probe 2023-01-02 13:01:14 +01:00
gpu Only gvt-fixes: 2023-01-06 10:16:49 +01:00
greybus
hid treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
hsi
hte
hv Networking changes for 6.2. 2022-12-13 15:47:48 -08:00
hwmon hwmon updates for v6.2 merge window 2022-12-13 13:09:38 -08:00
hwspinlock
hwtracing
i2c Core got a new helper 'i2c_client_get_device_id', designware got some 2022-12-15 14:47:10 -08:00
i3c i3c: export SETDASA method 2022-12-11 21:25:58 +01:00
idle
iio Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
infiniband RDMA/mlx5: Fix validation of max_rd_atomic caps for DC 2023-01-01 09:40:35 +02:00
input treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
interconnect
iommu iommu/arm-smmu: Don't unregister on shutdown 2023-01-13 13:46:20 +01:00
ipack
irqchip RISC-V Patches for the 6.2 Merge Window, Part 1 2022-12-14 15:23:49 -08:00
isdn treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
leds treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
macintosh
mailbox - qcom: enable sc8280xp, sm8550 and sm4250 support 2022-12-21 09:31:18 -08:00
mcb
md block: handle bio_split_to_limits() NULL return 2023-01-04 09:05:23 -07:00
media treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
memory ARM updates for 6.2 2022-12-13 15:22:14 -08:00
memstick memstick/mspro_block: Convert to use sysfs_emit()/sysfs_emit_at() APIs 2022-12-09 10:29:58 +01:00
message
mfd - New Drivers 2022-12-21 09:19:24 -08:00
misc kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
mmc MMC core: 2022-12-13 13:41:26 -08:00
most
mtd MTD core changes: 2022-12-13 12:32:07 -08:00
mux
net Including fixes from bpf, wifi, and netfilter. 2023-01-05 12:40:50 -08:00
nfc treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
ntb
nubus
nvdimm
nvme block-2023-01-06 2023-01-06 13:12:42 -08:00
nvmem Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
of of: fdt: Honor CONFIG_CMDLINE* even without /chosen node, take 2 2023-01-04 21:31:59 -06:00
opp
parisc parisc: led: Fix potential null-ptr-deref in start_task() 2022-12-17 23:19:38 +01:00
parport
pci phy-for-6.2 2022-12-19 08:40:58 -06:00
pcmcia treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
peci
perf RISC-V Patches for the 6.2 Merge Window, Part 1 2022-12-14 15:23:49 -08:00
phy phy-for-6.2 2022-12-19 08:40:58 -06:00
pinctrl Pin control changes for the v6.2 kernel cycle: 2022-12-13 13:03:06 -08:00
platform USB/Thunderbolt driver changes for 6.2-rc1 2022-12-16 03:22:53 -08:00
pnp
power power supply and reset changes for the v6.2 series 2022-12-17 08:39:31 -06:00
powercap
pps
ps3
ptp Networking changes for 6.2. 2022-12-13 15:47:48 -08:00
pwm pwm: Changes for v6.2-rc1 2022-12-21 09:41:28 -08:00
rapidio rapidio: devices: fix missing put_device in mport_cdev_open 2022-12-11 19:30:20 -08:00
ras
regulator regulator: Fixes for v6.2 2022-12-23 14:38:00 -08:00
remoteproc remoteproc: core: Do pm_relax when in RPROC_OFFLINE state 2022-12-07 11:20:55 -07:00
reset
rpmsg
rtc - New Drivers 2022-12-21 09:19:24 -08:00
s390 block-2023-01-06 2023-01-06 13:12:42 -08:00
sbus
scsi treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
sh
siox
slimbus
soc ARM: SoC fixes for 6.2 2022-12-19 16:07:59 -06:00
soundwire soundwire updates for 6.2 2022-12-19 08:47:33 -06:00
spi spi: Fix for v6.2 2022-12-23 14:44:08 -08:00
spmi
ssb
staging treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
target SCSI misc on 20221213 2022-12-14 08:58:51 -08:00
tc
tee SoC driver updates for 6.2 2022-12-12 10:17:08 -08:00
thermal thermal: int340x: Add missing attribute for data rate base 2022-12-30 19:48:37 +01:00
thunderbolt
tty treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
ufs SCSI misc on 20221213 2022-12-14 08:58:51 -08:00
uio
usb usb: dwc3: gadget: Ignore End Transfer delay on teardown 2023-01-06 16:32:10 +01:00
vdpa vdpa_sim_net: should not drop the multicast/broadcast packet 2022-12-28 05:28:11 -05:00
vfio Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
vhost vhost_vdpa: fix the crash in unmap a large memory 2022-12-28 05:28:11 -05:00
video fbdev: omapfb: avoid stack overflow warning 2023-01-05 11:43:27 +01:00
virt Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
virtio virtio: Implementing attribute show with sysfs_emit 2022-12-28 05:28:11 -05:00
vlynq
w1
watchdog linux-watchdog 6.2-rc1 tag 2022-12-17 08:34:01 -06:00
xen drm for 6.2: 2022-12-13 11:59:58 -08:00
zorro
Kconfig
Makefile