PCI: rockchip: dw: Use handle_level_irq for legacy irq

We got a report that a wireless ethernet device using legacy interrupts exposes
buggy behaviour once the RT patches are applied. It can be observed on an RK3588
EVB1 with NVMe in an RT environment when adding pci=nomsi to the cmdline. The
backtrace looks like below:

echo 3 > /proc/sys/vm/drop_caches && dd if=/dev/nvme0n1 of=/dev/null bs=1M count=50000

[   10.826850] irq 155: nobody cared (try booting with the "irqpoll" option)
[   10.826862] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.160-rt89 #505
[   10.826867] Hardware name: Rockchip RK3588 EVB1 LP4 V10 Board (DT)
[   10.826870] Call trace:
[   10.826871]  dump_backtrace+0x0/0x1e0
[   10.826881]  show_stack+0x18/0x24
[   10.826886]  dump_stack_lvl+0xcc/0xf8
[   10.826891]  dump_stack+0x18/0x54
[   10.826895]  __report_bad_irq+0x4c/0xdc
[   10.826899]  note_interrupt+0x2cc/0x380
[   10.826905]  handle_irq_event+0x10c/0x180
[   10.826909]  handle_simple_irq+0xac/0x120
[   10.826913]  generic_handle_irq+0x30/0x50
[   10.826917]  rk_pcie_legacy_int_handler+0xa8/0x160
[   10.826923]  __handle_domain_irq+0xb8/0x140
[   10.826927]  gic_handle_irq+0xd8/0x2e4
[   10.826932]  el1_irq+0xcc/0x180
[   10.826935]  arch_cpu_idle+0x18/0x3c
[   10.826940]  default_idle_call+0x2c/0x9c
[   10.826944]  do_idle+0x21c/0x2a0
[   10.826949]  cpu_startup_entry+0x24/0x70
[   10.826952]  rest_init+0xd0/0xe0
[   10.826956]  arch_call_rest_init+0x10/0x1c
[   10.826960]  start_kernel+0x50c/0x544
[   10.826963] handlers:
[   10.826965] [<0000000015317c1f>] irq_default_primary_handler threaded [<00000000edb1561e>] pcie_pme_irq
[   10.826977] [<0000000015317c1f>] irq_default_primary_handler threaded [<000000000065643b>] nvme_irq
[   10.826988] [<0000000015317c1f>] irq_default_primary_handler threaded [<000000000065643b>] nvme_irq
[   10.826996] Disabling IRQ #155

After that, NVMe can't work anymore since its irq is disabled. The actual problem is a race between two nvme_irq invocations:

nvme_irq                                    nvme_irq
 // process the previous request
 -> nvme_process_cq(nvmeq)                  // the previous one is still processing
                                            -> if (nvme_process_cq(nvmeq))
 -> return IRQ_HANDLED                         -> return IRQ_NONE

So a spurious irq is counted, and if interrupts arrive quickly enough for the
unhandled count to exceed the limit, __report_bad_irq is triggered and the irq
line is disabled. This is why the bug was only observed in an RT environment,
where the irq is dispatched more quickly than before; the bug was always there.

root@linaro-alip:/# cat /proc/irq/155/spurious
count 8990
unhandled 24339
last_unhandled 189829 ms

This could be fixed in the drivers, as the many patches for "irq xxx: nobody
cared" show, and that is the right place if we don't allow nvme_irq to nest
itself or postpone the handler. However, the legacy interrupt support is also
buggy: a legacy interrupt is level-triggered, not edge-triggered, and
handle_simple_irq only happened to work because the Rockchip PCIe RC generates
a oneshot irq instead of a level one, preventing an irq storm. So switching to
handle_level_irq is correct: it masks the irq while it is being handled and
unmasks it afterwards. That's a decent solution for all.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Change-Id: Ie9499b3dbd19ac053500b4c726294296be537ffd
Commit da03114d74 (parent 66f223674f), authored by Shawn Lin, 2023-09-16 00:33:09 +08:00.
@@ -1682,7 +1682,7 @@ static struct irq_chip rk_pcie_legacy_irq_chip = {
 static int rk_pcie_intx_map(struct irq_domain *domain, unsigned int irq,
			    irq_hw_number_t hwirq)
 {
-	irq_set_chip_and_handler(irq, &rk_pcie_legacy_irq_chip, handle_simple_irq);
+	irq_set_chip_and_handler(irq, &rk_pcie_legacy_irq_chip, handle_level_irq);
 	irq_set_chip_data(irq, domain->host_data);
 	return 0;