Merge branch 'net-use-skb_attempt_defer_free-in-napi_consume_skb'

Eric Dumazet says:

====================
net: use skb_attempt_defer_free() in napi_consume_skb()

The TX completion path lacks NUMA awareness and, more generally,
slab cache affinity.

Modern drivers use napi_consume_skb(), hoping to cache sk_buffs
in per-cpu caches so that they can be recycled in the RX path.

Only use the per-cpu cache if the skb was allocated on the same cpu;
otherwise use skb_attempt_defer_free() so that the skb
is freed on the original cpu.

This removes contention on SLUB spinlocks and data structures,
and ensures that recycled sk_buffs have correct NUMA locality.

After this series, I get ~50% improvement for a UDP tx workload
on an AMD EPYC 9B45 (IDPF 200Gbit NIC with 32 TX queues).

I will later refactor skb_attempt_defer_free()
so that it no longer has to care about skb_shared() and skb_release_head_state().
====================

Link: https://patch.msgid.link/20251106202935.1776179-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski
2025-11-07 19:02:42 -08:00
3 changed files with 11 additions and 7 deletions
+2 -2
@@ -355,9 +355,9 @@ skb_defer_max
-------------
Max size (in skbs) of the per-cpu list of skbs being freed
-by the cpu which allocated them. Used by TCP stack so far.
+by the cpu which allocated them.
-Default: 64
+Default: 128
optmem_max
----------
+1 -1
@@ -20,7 +20,7 @@ struct net_hotdata net_hotdata __cacheline_aligned = {
.dev_tx_weight = 64,
.dev_rx_weight = 64,
.sysctl_max_skb_frags = MAX_SKB_FRAGS,
-.sysctl_skb_defer_max = 64,
+.sysctl_skb_defer_max = 128,
.sysctl_mem_pcpu_rsv = SK_MEMORY_PCPU_RESERVE
};
EXPORT_SYMBOL(net_hotdata);
+8 -4
@@ -1149,11 +1149,10 @@ void skb_release_head_state(struct sk_buff *skb)
skb);
#endif
skb->destructor = NULL;
}
-#if IS_ENABLED(CONFIG_NF_CONNTRACK)
-nf_conntrack_put(skb_nfct(skb));
-#endif
-skb_ext_put(skb);
+nf_reset_ct(skb);
+skb_ext_reset(skb);
}
/* Free everything but the sk_buff shell. */
@@ -1477,6 +1476,11 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+if (skb->alloc_cpu != smp_processor_id() && !skb_shared(skb)) {
+skb_release_head_state(skb);
+return skb_attempt_defer_free(skb);
+}
if (!skb_unref(skb))
return;
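
The check added to napi_consume_skb() can be modelled in user space to make the routing rule easier to follow. The sketch below is illustrative only and is not kernel code: struct model_skb, enum free_path, model_skb_shared() and model_pick_free_path() are hypothetical names, and the reference count is a plain int standing in for the kernel's refcount. It assumes that "shared" means a reference count other than 1.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of sk_buff state the new check reads. */
struct model_skb {
	int alloc_cpu;	/* CPU that allocated the buffer */
	int users;	/* reference count (assumption: 1 means not shared) */
};

/* Hypothetical labels for the two outcomes of the new check. */
enum free_path {
	FREE_PATH_LOCAL,	/* fall through to the rest of napi_consume_skb() */
	FREE_PATH_DEFER_REMOTE,	/* hand the buffer back to the allocating CPU */
};

/* Model of skb_shared(): more than one holder of the buffer. */
static bool model_skb_shared(const struct model_skb *skb)
{
	return skb->users != 1;
}

/*
 * Mirrors the condition in the hunk above: defer only when the buffer
 * was allocated on another CPU and nobody else holds a reference.
 */
static enum free_path model_pick_free_path(const struct model_skb *skb,
					   int this_cpu)
{
	if (skb->alloc_cpu != this_cpu && !model_skb_shared(skb))
		return FREE_PATH_DEFER_REMOTE;
	return FREE_PATH_LOCAL;
}
```

In this model, a buffer allocated on CPU 2 and completed on CPU 5 takes the deferred path, while the same buffer completed on CPU 2, or a shared buffer completed anywhere, stays on the local path.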