TWx Linux Repository
Michael Liang ec00ea5645 nvme-tcp: fix premature queue removal and I/O failover
[ Upstream commit 77e40bbce93059658aee02786a32c5c98a240a8a ]

This patch addresses a data corruption issue observed in nvme-tcp during
testing.

In an NVMe native multipath setup, when an I/O timeout occurs, all
inflight I/Os are canceled almost immediately after the kernel socket is
shut down. These canceled I/Os are reported as host path errors,
triggering a failover that succeeds on a different path.

However, at this point, the original I/O may still be outstanding in the
host's network transmission path (e.g., the NIC's TX queue). From the
user-space app's perspective, the buffer associated with the I/O is
considered completed, because the I/O was acknowledged on the other path,
and the buffer may be reused for new I/O requests.

Because nvme-tcp enables zero-copy by default in the transmission path,
this can lead to corrupted data being sent to the original target,
ultimately causing data corruption.
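
The hazard described above can be illustrated with a small user-space
simulation. This is only a sketch under stated assumptions: `pending_tx`,
`transmit` and `simulate_stale_zero_copy_send` are hypothetical names made up
for illustration, and none of this is nvme-tcp or socket code. It shows only
that a queued send which references the caller's buffer (instead of copying
it) transmits whatever the buffer holds at transmission time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a send that is queued but not yet on the wire. */
struct pending_tx {
	const char *data;	/* points at the caller's buffer; nothing is copied */
	size_t len;
};

/* Simulates the delayed transmission: reads the buffer as it is *now*. */
static void transmit(const struct pending_tx *tx, char *wire, size_t wire_len)
{
	size_t n;

	if (wire_len == 0)
		return;
	n = tx->len < wire_len - 1 ? tx->len : wire_len - 1;
	memcpy(wire, tx->data, n);
	wire[n] = '\0';
}

/* Replays the failing sequence and stores what the first target receives. */
void simulate_stale_zero_copy_send(char *wire, size_t wire_len)
{
	char buf[16] = "write-A";
	struct pending_tx tx = { .data = buf, .len = strlen(buf) };

	/*
	 * The I/O is canceled on the first path and completes on the second,
	 * so the application considers buf free and reuses it.
	 */
	strcpy(buf, "write-B");

	/* Only now does the stale send on the first path leave the queue. */
	transmit(&tx, wire, wire_len);
}
```

Calling `simulate_stale_zero_copy_send()` with a 16-byte output buffer leaves
"write-B" in it: the first target receives the contents of the later write,
not the data the original I/O was meant to carry.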

We can reproduce this data corruption by injecting delay on one path and
triggering an I/O timeout.

To prevent this issue, this change ensures that all inflight
transmissions are fully completed from the host's perspective before
returning from queue stop. To handle concurrent I/O timeouts from multiple
namespaces under the same controller, always wait in queue stop
regardless of the queue's state.
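
The ordering this paragraph asks for can be sketched in user space as follows.
This is a hedged illustration, not the kernel change itself: `sim_queue`,
`sim_queue_send`, `sim_queue_drain`, `sim_queue_stop` and
`sim_stop_then_reuse` are hypothetical names, and "waiting" is modeled by
synchronously flushing the pending sends. The point is only that stop does not
return while a queued send still references a caller buffer, and that it
drains even if the queue was already marked as not live.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 4

/* Hypothetical queued send that references the caller's buffer (no copy). */
struct pending_tx {
	const char *data;
	size_t len;
};

/* Hypothetical queue model; not the nvme-tcp queue structure. */
struct sim_queue {
	bool live;
	struct pending_tx pending[MAX_PENDING];
	size_t nr_pending;
	char wire[64];		/* bytes the target has received so far */
	size_t wire_len;
};

static int sim_queue_send(struct sim_queue *q, const char *data, size_t len)
{
	if (q->nr_pending == MAX_PENDING)
		return -1;
	q->pending[q->nr_pending++] = (struct pending_tx){ data, len };
	return 0;
}

/* Models "wait for inflight transmissions": flush every pending send. */
static void sim_queue_drain(struct sim_queue *q)
{
	for (size_t i = 0; i < q->nr_pending; i++) {
		size_t room = sizeof(q->wire) - 1 - q->wire_len;
		size_t n = q->pending[i].len < room ? q->pending[i].len : room;

		memcpy(q->wire + q->wire_len, q->pending[i].data, n);
		q->wire_len += n;
	}
	q->wire[q->wire_len] = '\0';
	q->nr_pending = 0;
}

static void sim_queue_stop(struct sim_queue *q)
{
	q->live = false;
	/* Drain unconditionally, whatever the queue state was on entry. */
	sim_queue_drain(q);
}

/* Stops the queue before the buffer is reused; stores what was received. */
void sim_stop_then_reuse(char *out, size_t out_len)
{
	struct sim_queue q = { 0 };
	char buf[16] = "write-A";

	if (out_len == 0)
		return;

	q.live = true;
	sim_queue_send(&q, buf, strlen(buf));

	sim_queue_stop(&q);		/* returns only after the send is flushed */
	strcpy(buf, "write-B");		/* reuse is now safe */

	strncpy(out, q.wire, out_len - 1);
	out[out_len - 1] = '\0';
}
```

With this ordering, `sim_stop_then_reuse()` reports "write-A": the pending send
is flushed while the buffer still holds the original data, so the later reuse
cannot change what the target receives.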

This aligns with the behavior of queue stopping in other NVMe fabric
transports.

Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Michael Liang <mliang@purestorage.com>
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-09 09:44:01 +02:00
arch powerpc/boot: Fix dash warning 2025-05-09 09:43:57 +02:00
block block: fix resource leak in blk_register_queue() error path 2025-04-25 10:45:42 +02:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2025-04-25 10:45:58 +02:00
crypto crypto: null - Use spin lock instead of mutex 2025-05-02 07:50:52 +02:00
Documentation sched/topology: Consolidate and clean up access to a CPU's max compute capacity 2025-05-02 07:50:41 +02:00
drivers nvme-tcp: fix premature queue removal and I/O failover 2025-05-09 09:44:01 +02:00
fs smb: client: fix zero length for mkdir POSIX create context 2025-05-09 09:43:53 +02:00
include ALSA: ump: Fix buffer overflow at UMP SysEx message conversion 2025-05-09 09:44:00 +02:00
init sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP 2025-05-02 07:50:57 +02:00
io_uring io_uring: always do atomic put from iowq 2025-05-02 07:50:57 +02:00
ipc ipc: fix memleak if msg_init_ns failed in create_ipc_ns 2024-12-09 10:32:54 +01:00
kernel bpf: fix null dereference when computing changes_pkt_data of prog w/o subprogs 2025-05-09 09:43:55 +02:00
lib ubsan: Fix panic from test_ubsan_out_of_bounds 2025-05-02 07:51:02 +02:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm mm/memblock: repeat setting reserved region nid if array is doubled 2025-05-09 09:43:51 +02:00
net net: ipv6: fix UDPv6 GSO segmentation with NAT 2025-05-09 09:44:01 +02:00
rust rust: lockdep: Remove support for dynamically allocated LockClassKeys 2025-03-22 12:50:50 -07:00
samples tracing: Verify event formats that have "%*p.." 2025-05-02 07:50:37 +02:00
scripts objtool: Silence more KCOV warnings, part 2 2025-05-02 07:51:04 +02:00
security landlock: Add the errata interface 2025-04-25 10:45:57 +02:00
sound ASoC: soc-pcm: Fix hw_params() and DAPM widget sequence 2025-05-09 09:43:56 +02:00
tools selftests/bpf: extend changes_pkt_data with cases w/o subprograms 2025-05-09 09:43:55 +02:00
usr kbuild: hdrcheck: fix cross build with clang 2025-03-13 12:58:38 +01:00
virt KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock 2024-10-04 16:29:47 +02:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Remove *.orig pattern from .gitignore 2024-10-04 16:29:44 +02:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml
COPYING
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig
MAINTAINERS sign-file,extract-cert: move common SSL helper functions to a header 2025-04-25 10:45:57 +02:00
Makefile Linux 6.6.89 2025-05-02 07:51:05 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.