RDMA/rxe: Handle remote errors in the midst of a Read reply sequence

Requesting nodes do not handle a reported error correctly if it is
generated in the middle of multi-packet Read responses, and the node tries
to resend the request endlessly. Let completer terminate the connection in
that case.

Link: https://lore.kernel.org/r/20221013014724.3786212-2-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
This commit is contained in:
Daisuke Matsuda
2022-10-13 10:47:24 +09:00
committed by Leon Romanovsky
parent 5ebc548f4f
commit 5ac814e02e
+8
View File
@@ -200,6 +200,10 @@ static inline enum comp_state check_psn(struct rxe_qp *qp,
*/
if (pkt->psn == wqe->last_psn)
return COMPST_COMP_ACK;
else if (pkt->opcode == IB_OPCODE_RC_ACKNOWLEDGE &&
(qp->comp.opcode == IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST ||
qp->comp.opcode == IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE))
return COMPST_CHECK_ACK;
else
return COMPST_DONE;
} else if ((diff > 0) && (wqe->mask & WR_ATOMIC_OR_READ_MASK)) {
@@ -228,6 +232,10 @@ static inline enum comp_state check_ack(struct rxe_qp *qp,
case IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST:
case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:
/* Check NAK code to handle a remote error */
if (pkt->opcode == IB_OPCODE_RC_ACKNOWLEDGE)
break;
if (pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE &&
pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST) {
/* read retries of partial data may restart from