Commit Graph

1309683 Commits

Author SHA1 Message Date
Menglong Dong 03483dbde8 net: vxlan: use kfree_skb_reason() in vxlan_mdb_xmit()
Replace kfree_skb() with kfree_skb_reason() in vxlan_mdb_xmit. No drop
reasons are introduced in this commit.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:09 +01:00
Menglong Dong e7c700aaa6 net: vxlan: add drop reasons support to vxlan_xmit_one()
Replace kfree_skb/dev_kfree_skb with kfree_skb_reason in vxlan_xmit_one.
No drop reasons are introduced in this commit.

The only concern of mine is replacing dev_kfree_skb with
kfree_skb_reason. The dev_kfree_skb is equal to consume_skb, and I'm not
sure if we can change it to kfree_skb here. In my option, the skb is
"dropped" here, isn't it?

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:09 +01:00
Menglong Dong b71a576e45 net: vxlan: use kfree_skb_reason() in vxlan_xmit()
Replace kfree_skb() with kfree_skb_reason() in vxlan_xmit(). Following
new skb drop reasons are introduced for vxlan:

/* no remote found for xmit */
SKB_DROP_REASON_VXLAN_NO_REMOTE
/* packet without necessary metadata reached a device which is
 * in "external" mode
 */
SKB_DROP_REASON_TUNNEL_TXINFO

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:09 +01:00
Menglong Dong d209706f56 net: vxlan: make vxlan_set_mac() return drop reasons
Change the return type of vxlan_set_mac() from bool to enum
skb_drop_reason. In this commit, the drop reason
"SKB_DROP_REASON_LOCAL_MAC" is introduced for the case that the source
mac of the packet is a local mac.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:09 +01:00
Menglong Dong 289fd4e752 net: vxlan: make vxlan_snoop() return drop reasons
Change the return type of vxlan_snoop() from bool to enum
skb_drop_reason. In this commit, two drop reasons are introduced:

  SKB_DROP_REASON_MAC_INVALID_SOURCE
  SKB_DROP_REASON_VXLAN_ENTRY_EXISTS

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:09 +01:00
Menglong Dong 7b3e018f3e net: vxlan: make vxlan_remcsum() return drop reasons
Make vxlan_remcsum() support skb drop reasons by changing the return
value type of it from bool to enum skb_drop_reason.

The only drop reason in vxlan_remcsum() comes from pskb_may_pull_reason(),
so we just return it.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:09 +01:00
Menglong Dong 4c06d9daf8 net: vxlan: add skb drop reasons to vxlan_rcv()
Introduce skb drop reasons to the function vxlan_rcv(). Following new
drop reasons are added:

  SKB_DROP_REASON_VXLAN_INVALID_HDR
  SKB_DROP_REASON_VXLAN_VNI_NOT_FOUND
  SKB_DROP_REASON_IP_TUNNEL_ECN

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:08 +01:00
Menglong Dong 9990ddf47d net: tunnel: make skb_vlan_inet_prepare() return drop reasons
Make skb_vlan_inet_prepare return the skb drop reasons, which is just
what pskb_may_pull_reason() returns. Meanwhile, adjust all the call of
it.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:08 +01:00
Menglong Dong 7f20dbd7de net: tunnel: add pskb_inet_may_pull_reason() helper
Introduce the function pskb_inet_may_pull_reason() and make
pskb_inet_may_pull a simple inline call to it. The drop reasons of it just
come from pskb_may_pull_reason().

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:08 +01:00
Menglong Dong 454bbde8f0 net: skb: add pskb_network_may_pull_reason() helper
Introduce the function pskb_network_may_pull_reason() and make
pskb_network_may_pull() a simple inline call to it. The drop reasons of
it just come from pskb_may_pull_reason.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-13 11:33:08 +01:00
Justin Chen c531f2269a net: bcmasp: enable SW timestamping
Add skb_tx_timestamp() call and enable support for SW
timestamping.

Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20241010221506.802730-1-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 16:03:31 -07:00
Justin Chen ea22f8eabb net: broadcom: remove select MII from brcmstb Ethernet drivers
The MII driver isn't used by brcmstb Ethernet drivers. Remove it
from the BCMASP, GENET, and SYSTEMPORT drivers.

Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20241010191332.1074642-1-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 16:00:17 -07:00
Jakub Kicinski 44f90d7cf5 Merge branch 'microchip_t1s-update-on-microchip-10base-t1s-phy-driver'
Parthiban Veerasooran says:

====================
microchip_t1s: Update on Microchip 10BASE-T1S PHY driver

This patch series contain the below updates:
- Restructured lan865x_write_cfg_params() and lan865x_read_cfg_params()
  functions arguments to more generic.
- Updated new/improved initial settings of LAN865X Rev.B0 from latest
  AN1760.
- Added support for LAN865X Rev.B1 from latest AN1760.
- Moved LAN867X reset handling to a new function for flexibility.
- Added support for LAN867X Rev.C1/C2 from latest AN1699.
- Disabled/enabled collision detection based on PLCA setting.
====================

Link: https://patch.msgid.link/20241010082205.221493-1-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:42 -07:00
Parthiban Veerasooran 78341049fb net: phy: microchip_t1s: configure collision detection based on PLCA mode
As per LAN8650/1 Rev.B0/B1 AN1760 (Revision F (DS60001760G - June 2024))
and LAN8670/1/2 Rev.C1/C2 AN1699 (Revision E (DS60001699F - June 2024)),
under normal operation, the device should be operated in PLCA mode.
Disabling collision detection is recommended to allow the device to
operate in noisy environments or when reflections and other inherent
transmission line distortion cause poor signal quality. Collision
detection must be re-enabled if the device is configured to operate in
CSMA/CD mode.

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-8-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:40 -07:00
Parthiban Veerasooran 6b079d8f7b net: phy: microchip_t1s: add support for Microchip's LAN867X Rev.C2
Add support for LAN8670/1/2 Rev.C2 as per the latest configuration note
AN1699 released (Revision E (DS60001699F - June 2024)) for Rev.C1 is also
applicable for Rev.C2. Refer hardware revisions list in the latest AN1699
Revision E (DS60001699F - June 2024).
https://www.microchip.com/en-us/application-notes/an1699

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-7-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:40 -07:00
Parthiban Veerasooran 662d9c5fe1 net: phy: microchip_t1s: add support for Microchip's LAN867X Rev.C1
Add support for LAN8670/1/2 Rev.C1 as per the latest configuration note
AN1699 released (Revision E (DS60001699F - June 2024)).
https://www.microchip.com/en-us/application-notes/an1699

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-6-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:39 -07:00
Parthiban Veerasooran 117b70e4c6 net: phy: microchip_t1s: move LAN867X reset handling to a new function
Move LAN867X reset handling code to a new function called
lan867x_check_reset_complete() which will be useful for the next patch
which also uses the same code to handle the reset functionality.

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-5-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:39 -07:00
Parthiban Veerasooran 7a0414fdcc net: phy: microchip_t1s: add support for Microchip's LAN865X Rev.B1
Add support for LAN8650/1 Rev.B1. As per the latest configuration note
AN1760 released (Revision F (DS60001760G - June 2024)) for Rev.B0 is also
applicable for Rev.B1. Refer hardware revisions list in the latest AN1760
Revision F (DS60001760G - June 2024).
https://www.microchip.com/en-us/application-notes/an1760

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-4-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:39 -07:00
Parthiban Veerasooran d793beee2d net: phy: microchip_t1s: update new initial settings for LAN865X Rev.B0
Update the new/improved initial settings from the latest configuration
application note AN1760 released for LAN8650/1 Rev.B0 Revision F
(DS60001760G - June 2024).
https://www.microchip.com/en-us/application-notes/an1760

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-3-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:39 -07:00
Parthiban Veerasooran 9826b9a08b net: phy: microchip_t1s: restructure cfg read/write functions arguments
Restructure lan865x_write_cfg_params() and lan865x_read_cfg_params()
functions arguments to more generic which will be useful for the next
patch which updates the improved initial configuration for LAN8650/1
Rev.B0 published in the Configuration Note.

Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20241010082205.221493-2-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:54:39 -07:00
Jakub Kicinski ec35b0c53c selftests: drv-net: add missing trailing backslash
Commit b3ea416419 ("testing: net-drv: add basic shaper test")
removed the trailing backslash from the last entry. We have
a terminating comment here to avoid having to modify the last
line when adding at the end.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20241010211857.2193076-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:53:40 -07:00
Jakub Kicinski c4708bf3cb Merge branch 'netdevsim-better-ipsec-output-format'
Hangbin Liu says:

====================
netdevsim: better ipsec output format

The first 2 patches improve the netdevsim ipsec debug output with better
format. The 3rd patch update the selftests.

v2: update rtnetlink selftest with new output format (Stanislav Fomichev, Jakub Kicinski)
====================

Link: https://patch.msgid.link/20241010040027.21440-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:44:29 -07:00
Hangbin Liu 3ec920bb97 selftests: rtnetlink: update netdevsim ipsec output format
After the netdevsim update to use human-readable IP address formats for
IPsec, we can now use the source and destination IPs directly in testing.
Here is the result:
  # ./rtnetlink.sh -t kci_test_ipsec_offload
  PASS: ipsec_offload

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20241010040027.21440-4-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:44:27 -07:00
Hangbin Liu 2cf567f421 netdevsim: copy addresses for both in and out paths
The current code only copies the address for the in path, leaving the out
path address set to 0. This patch corrects the issue by copying the addresses
for both the in and out paths. Before this patch:

  # cat /sys/kernel/debug/netdevsim/netdevsim0/ports/0/ipsec
  SA count=2 tx=20
  sa[0] tx ipaddr=0.0.0.0
  sa[0]    spi=0x00000100 proto=0x32 salt=0x0adecc3a crypt=1
  sa[0]    key=0x3167608a ca4f1397 43565909 941fa627
  sa[1] rx ipaddr=192.168.0.1
  sa[1]    spi=0x00000101 proto=0x32 salt=0x0adecc3a crypt=1
  sa[1]    key=0x3167608a ca4f1397 43565909 941fa627

After this patch:

  = cat /sys/kernel/debug/netdevsim/netdevsim0/ports/0/ipsec
  SA count=2 tx=20
  sa[0] tx ipaddr=192.168.0.2
  sa[0]    spi=0x00000100 proto=0x32 salt=0x0adecc3a crypt=1
  sa[0]    key=0x3167608a ca4f1397 43565909 941fa627
  sa[1] rx ipaddr=192.168.0.1
  sa[1]    spi=0x00000101 proto=0x32 salt=0x0adecc3a crypt=1
  sa[1]    key=0x3167608a ca4f1397 43565909 941fa627

Fixes: 7699353da8 ("netdevsim: add ipsec offload testing")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20241010040027.21440-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:44:27 -07:00
Hangbin Liu c71bc6da61 netdevsim: print human readable IP address
Currently, IPSec addresses are printed in hexadecimal format, which is
not user-friendly. e.g.

  # cat /sys/kernel/debug/netdevsim/netdevsim0/ports/0/ipsec
  SA count=2 tx=20
  sa[0] rx ipaddr=0x00000000 00000000 00000000 0100a8c0
  sa[0]    spi=0x00000101 proto=0x32 salt=0x0adecc3a crypt=1
  sa[0]    key=0x3167608a ca4f1397 43565909 941fa627
  sa[1] tx ipaddr=0x00000000 00000000 00000000 00000000
  sa[1]    spi=0x00000100 proto=0x32 salt=0x0adecc3a crypt=1
  sa[1]    key=0x3167608a ca4f1397 43565909 941fa627

This patch updates the code to print the IPSec address in a human-readable
format for easier debug. e.g.

 # cat /sys/kernel/debug/netdevsim/netdevsim0/ports/0/ipsec
 SA count=4 tx=40
 sa[0] tx ipaddr=0.0.0.0
 sa[0]    spi=0x00000100 proto=0x32 salt=0x0adecc3a crypt=1
 sa[0]    key=0x3167608a ca4f1397 43565909 941fa627
 sa[1] rx ipaddr=192.168.0.1
 sa[1]    spi=0x00000101 proto=0x32 salt=0x0adecc3a crypt=1
 sa[1]    key=0x3167608a ca4f1397 43565909 941fa627
 sa[2] tx ipaddr=::
 sa[2]    spi=0x00000100 proto=0x32 salt=0x0adecc3a crypt=1
 sa[2]    key=0x3167608a ca4f1397 43565909 941fa627
 sa[3] rx ipaddr=2000::1
 sa[3]    spi=0x00000101 proto=0x32 salt=0x0adecc3a crypt=1
 sa[3]    key=0x3167608a ca4f1397 43565909 941fa627

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20241010040027.21440-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:44:27 -07:00
Aryan Srivastava 5e7e69baad net: dsa: mv88e6xxx: Fix uninitialised err value
The err value in mv88e6xxx_region_atu_snapshot is now potentially
uninitialised on return. Initialise err as 0.

Fixes: ada5c3229b ("net: dsa: mv88e6xxx: Add FID map cache")
Signed-off-by: Aryan Srivastava <aryan.srivastava@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241009212319.1045176-1-aryan.srivastava@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:43:24 -07:00
Jakub Kicinski f7cb403e9a Merge branch 'net-xilinx-emaclite-adopt-clock-support'
Radhey Shyam Pandey says:

====================
net: xilinx: emaclite: Adopt clock support

This patchset adds emaclite clock support. AXI Ethernet Lite IP can also
be used on SoC platforms like Zynq UltraScale+ MPSoC which combines
powerful processing system (PS) and user-programmable logic (PL) into
the same device. On these platforms it is mandatory to explicitly enable
IP clocks for proper functionality.
====================

Link: https://patch.msgid.link/1728491303-1456171-1-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:41:37 -07:00
Abin Joseph 76d46d766a net: emaclite: Adopt clock support
Adapt to use the clock framework. Add s_axi_aclk clock from the processor
bus clock domain and make clk optional to keep DTB backward compatibility.

Signed-off-by: Abin Joseph <abin.joseph@amd.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1728491303-1456171-4-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:41:34 -07:00
Abin Joseph 130fbea551 net: emaclite: Replace alloc_etherdev() with devm_alloc_etherdev()
Use device managed ethernet device allocation to simplify the error
handling logic. No functional change.

Signed-off-by: Abin Joseph <abin.joseph@amd.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1728491303-1456171-3-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:41:34 -07:00
Abin Joseph 60dbdc6e08 dt-bindings: net: emaclite: Add clock support
Add s_axi_aclk AXI4 clock support. Traditionally this IP was used on
microblaze platforms which had fixed clocks enabled all the time. But
since its a PL IP, it can also be used on SoC platforms like Zynq
UltraScale+ MPSoC which combines processing system (PS) and user
programmable logic (PL) into the same device. On these platforms instead
of fixed enabled clocks it is mandatory to explicitly enable IP clocks
for proper functionality.

So make clock a required property and also define max supported clock
constraints.

Signed-off-by: Abin Joseph <abin.joseph@amd.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/1728491303-1456171-2-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:41:33 -07:00
Jakub Kicinski 356d568fa9 Merge branch 'net-remove-rtnl-from-fib_seq_sum'
Eric Dumazet says:

====================
net: remove RTNL from fib_seq_sum()

This series is inspired by a syzbot report showing
rtnl contention and one thread blocked in:

7 locks held by syz-executor/10835:
  #0: ffff888033390420 (sb_writers#8){.+.+}-{0:0}, at: file_start_write include/linux/fs.h:2931 [inline]
  #0: ffff888033390420 (sb_writers#8){.+.+}-{0:0}, at: vfs_write+0x224/0xc90 fs/read_write.c:679
  #1: ffff88806df6bc88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1ea/0x500 fs/kernfs/file.c:325
  #2: ffff888026fcf3c8 (kn->active#50){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x20e/0x500 fs/kernfs/file.c:326
  #3: ffffffff8f56f848 (nsim_bus_dev_list_lock){+.+.}-{3:3}, at: new_device_store+0x1b4/0x890 drivers/net/netdevsim/bus.c:166
  #4: ffff88805e0140e8 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:1014 [inline]
  #4: ffff88805e0140e8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x8e/0x520 drivers/base/dd.c:1005
  #5: ffff88805c5fb250 (&devlink->lock_key#55){+.+.}-{3:3}, at: nsim_drv_probe+0xcb/0xb80 drivers/net/netdevsim/dev.c:1534
  #6: ffffffff8fcd1748 (rtnl_mutex){+.+.}-{3:3}, at: fib_seq_sum+0x31/0x290 net/core/fib_notifier.c:46
====================

Link: https://patch.msgid.link/20241009184405.3752829-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:35:16 -07:00
Eric Dumazet 2698acd6ea net: do not acquire rtnl in fib_seq_sum()
After we made sure no fib_seq_read() handlers needs RTNL anymore,
we can remove RTNL from fib_seq_sum().

Note that after RTNL was dropped, fib_seq_sum() result was possibly
outdated anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241009184405.3752829-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:35:05 -07:00
Eric Dumazet 055202b16c ipmr: use READ_ONCE() to read net->ipv[46].ipmr_seq
mr_call_vif_notifiers() and mr_call_mfc_notifiers() already
uses WRITE_ONCE() on the write side.

Using RTNL to protect the reads seems a big hammer.

Constify 'struct net' argument of ip6mr_rules_seq_read()
and ipmr_rules_seq_read().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241009184405.3752829-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:35:05 -07:00
Eric Dumazet e60ea45447 ipv6: use READ_ONCE()/WRITE_ONCE() on fib6_table->fib_seq
Using RTNL to protect ops->fib_rules_seq reads seems a big hammer.

Writes are protected by RTNL.
We can use READ_ONCE() when reading it.

Constify 'struct net' argument of fib6_tables_seq_read() and
fib6_rules_seq_read().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241009184405.3752829-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:35:05 -07:00
Eric Dumazet 16207384d2 ipv4: use READ_ONCE()/WRITE_ONCE() on net->ipv4.fib_seq
Using RTNL to protect ops->fib_rules_seq reads seems a big hammer.

Writes are protected by RTNL.
We can use READ_ONCE() when reading it.

Constify 'struct net' argument of fib4_rules_seq_read()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241009184405.3752829-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:35:05 -07:00
Eric Dumazet a716ff52be fib: rules: use READ_ONCE()/WRITE_ONCE() on ops->fib_rules_seq
Using RTNL to protect ops->fib_rules_seq reads seems a big hammer.

Writes are protected by RTNL.
We can use READ_ONCE() on readers.

Constify 'struct net' argument of fib_rules_seq_read()
and lookup_rules_ops().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241009184405.3752829-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 15:35:05 -07:00
Eric Dumazet d677aebd66 tcp: move sysctl_tcp_l3mdev_accept to netns_ipv4_read_rx
sysctl_tcp_l3mdev_accept is read from TCP receive fast path from
tcp_v6_early_demux(),
 __inet6_lookup_established,
  inet_request_bound_dev_if().

Move it to netns_ipv4_read_rx.

Remove the '#ifdef CONFIG_NET_L3_MASTER_DEV' that was guarding
its definition.

Note this adds a hole of three bytes that could be filled later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Cc: Wei Wang <weiwan@google.com>
Cc: Coco Li <lixiaoyan@google.com>
Link: https://patch.msgid.link/20241010034100.320832-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 08:45:24 -07:00
Aryan Srivastava 7e5b547cac net: phy: aquantia: poll status register
The system interface connection status register is not immediately
correct upon line side link up. This results in the status being read as
OFF and then transitioning to the correct host side link mode with a
short delay. This causes the phylink framework passing the OFF status
down to all MAC config drivers, resulting in the host side link being
misconfigured, which in turn can lead to link flapping or complete
packet loss in some cases.

Mitigate this by periodically polling the register until it not showing
the OFF state. This will be done every 1ms for 10ms, using the same
poll/timeout as the processor intensive operation reads.

If the phy is still expressing the OFF state after the timeout, then set
the link to false and pass the NA interface mode onto the phylink
framework.

Signed-off-by: Aryan Srivastava <aryan.srivastava@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241010004935.1774601-1-aryan.srivastava@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 08:43:47 -07:00
Jakub Kicinski 8401a108a6 eth: remove the DLink/Sundance (ST201) driver
Konstantin reports the maintainer's address bounces.
There is no other maintainer and the driver is quite old.
There is a good chance nobody is using this driver any more.
Let's try to remove it completely, we can revert it back in
if someone complains.

Link: https://lore.kernel.org/20240925-bizarre-earwig-from-pluto-1484aa@lemu/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Denis Kirjanov <dkirjanov@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-11 13:08:23 +01:00
Jakub Kicinski 59ae83dcf1 Merge branch 'tg3-link-irqs-napis-and-queues'
Joe Damato says:

====================
tg3: Link IRQs, NAPIs, and queues

This follows from a previous RFC (wherein I botched the subject lines of
all the messages) [1].

I've taken Michael Chan's suggestion on modifying patch 2 and I've
updated the commit messages of both patches to test and show the output
for the default 1 TX 4 RX queues and the 4 TX and 4 RX queues cases.

Reviewers: please check the commit messages carefully to ensure the
output is correct (or on your own systems to verify, if you like). I am
not a tg3 expert and it's possible that I got something wrong.

[1]: https://lore.kernel.org/all/20241005145717.302575-3-jdamato@fastly.com/T/
====================

Link: https://patch.msgid.link/20241009175509.31753-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 18:40:34 -07:00
Joe Damato aec5514d73 tg3: Link queues to NAPIs
Link queues to NAPIs using the netdev-genl API so this information is
queryable.

First, test with the default setting on my tg3 NIC at boot with 1 TX
queue:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 2}'

[{'id': 0, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8197, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'}]

Now, adjust the number of TX queues to be 4 via ethtool:

$ sudo ethtool -L eth0 tx 4
$ sudo ethtool -l eth0 | tail -5
Current hardware settings:
RX:		4
TX:		4
Other:		n/a
Combined:	n/a

Despite "Combined: n/a" in the ethtool output, /proc/interrupts shows
the tg3 has renamed the IRQs to be combined:

343: [...] eth0-0
344: [...] eth0-txrx-1
345: [...] eth0-txrx-2
346: [...] eth0-txrx-3
347: [...] eth0-txrx-4

Now query this via netlink to ensure the queues are linked properly to
their NAPIs:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8960, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8961, 'type': 'rx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8962, 'type': 'rx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8963, 'type': 'rx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 8960, 'type': 'tx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 8961, 'type': 'tx'},
 {'id': 2, 'ifindex': 2, 'napi-id': 8962, 'type': 'tx'},
 {'id': 3, 'ifindex': 2, 'napi-id': 8963, 'type': 'tx'}]

As you can see above, id 0 for both TX and RX share a NAPI, NAPI ID
8960, and so on for each queue index up to 3.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241009175509.31753-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 18:40:29 -07:00
Joe Damato 25118cce66 tg3: Link IRQs to NAPI instances
Link IRQs to NAPI instances with netif_napi_set_irq. This information
can be queried with the netdev-genl API.

Begin by testing my tg3 device in its default state: 1 TX queue and 4 RX
queues.

Compare the output of /proc/interrupts for my tg3 device with the output of
netdev-genl after applying this patch:

$ cat /proc/interrupts | grep eth0
343: [...] eth0-tx-0
344: [...] eth0-rx-1
345: [...] eth0-rx-2
346: [...] eth0-rx-3
347: [...] eth0-rx-4

As you can see above, tg3 has named the IRQs such that there is a
dedicated tx IRQ and 4 dedicated rx IRQs, for a total of 5 IRQs.

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
			 --dump napi-get --json='{"ifindex": 2}'

[{'id': 8197, 'ifindex': 2, 'irq': 347},
 {'id': 8196, 'ifindex': 2, 'irq': 346},
 {'id': 8195, 'ifindex': 2, 'irq': 345},
 {'id': 8194, 'ifindex': 2, 'irq': 344},
 {'id': 8193, 'ifindex': 2, 'irq': 343}]

Netlink displays the same IRQs as above, noting that each is mapped to a
unique NAPI instance.

Now, reconfigure the NIC to have 4 TX queues and 4 RX queues:

$ sudo ethtool -L eth0 rx 4 tx 4
$ sudo ethtool -l eth0 | tail -5
Current hardware settings:
RX:		4
TX:		4
Other:		n/a
Combined:	n/a

Examine /proc/interrupts once again, noting that tg3 will now rename the
IRQs to suggest that they are combined tx and rx without allocating
additional IRQs, so the total IRQ count in /proc/interrupts is
unchanged:

343: [...] eth0-0
344: [...] eth0-txrx-1
345: [...] eth0-txrx-2
346: [...] eth0-txrx-3
347: [...] eth0-txrx-4

Check the output from netlink again:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 2}'
[{'id': 8973, 'ifindex': 2, 'irq': 347},
 {'id': 8972, 'ifindex': 2, 'irq': 346},
 {'id': 8971, 'ifindex': 2, 'irq': 345},
 {'id': 8970, 'ifindex': 2, 'irq': 344},
 {'id': 8969, 'ifindex': 2, 'irq': 343}]

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241009175509.31753-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 18:40:29 -07:00
Heiner Kallweit 854d71c555 r8169: remove original workaround for RTL8125 broken rx issue
Now that we have b9c7ac4fe2 ("r8169: disable ALDPS per default for
RTL8125"), the first attempt to fix the issue shouldn't be needed
any longer. So let's effectively revert 621735f590 ("r8169: fix
rare issue with broken rx after link-down on RTL8125") and see
whether anybody complains.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/382d8c88-cbce-400f-ad62-fda0181c7e38@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 18:36:28 -07:00
Heiner Kallweit 87e26448db r8169: don't apply UDP padding quirk on RTL8126A
Vendor drivers r8125/r8126 indicate that this quirk isn't needed
any longer for RTL8126A. Mimic this in r8169.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/d1317187-aa81-4a69-b831-678436e4de62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 18:36:06 -07:00
Jakub Kicinski 9c0fc36ec4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc3).

No conflicts and no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 13:13:33 -07:00
Linus Torvalds 1d227fcc72 Merge tag 'net-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth and netfilter.

  Current release - regressions:

   - dsa: sja1105: fix reception from VLAN-unaware bridges

   - Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is
     enabled"

   - eth: fec: don't save PTP state if PTP is unsupported

  Current release - new code bugs:

   - smc: fix lack of icsk_syn_mss with IPPROTO_SMC, prevent null-deref

   - eth: airoha: update Tx CPU DMA ring idx at the end of xmit loop

   - phy: aquantia: AQR115c fix up PMA capabilities

  Previous releases - regressions:

   - tcp: 3 fixes for retrans_stamp and undo logic

  Previous releases - always broken:

   - net: do not delay dst_entries_add() in dst_release()

   - netfilter: restrict xtables extensions to families that are safe,
     syzbot found a way to combine ebtables with extensions that are
     never used by userspace tools

   - sctp: ensure sk_state is set to CLOSED if hashing fails in
     sctp_listen_start

   - mptcp: handle consistently DSS corruption, and prevent corruption
     due to large pmtu xmit"

* tag 'net-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  MAINTAINERS: Add headers and mailing list to UDP section
  MAINTAINERS: consistently exclude wireless files from NETWORKING [GENERAL]
  slip: make slhc_remember() more robust against malicious packets
  net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC
  ppp: fix ppp_async_encode() illegal access
  docs: netdev: document guidance on cleanup patches
  phonet: Handle error of rtnl_register_module().
  mpls: Handle error of rtnl_register_module().
  mctp: Handle error of rtnl_register_module().
  bridge: Handle error of rtnl_register_module().
  vxlan: Handle error of rtnl_register_module().
  rtnetlink: Add bulk registration helpers for rtnetlink message handlers.
  net: do not delay dst_entries_add() in dst_release()
  mptcp: pm: do not remove closing subflows
  mptcp: fallback when MPTCP opts are dropped after 1st data
  tcp: fix mptcp DSS corruption due to large pmtu xmit
  mptcp: handle consistently DSS corruption
  net: netconsole: fix wrong warning
  net: dsa: refuse cross-chip mirroring operations
  net: fec: don't save PTP state if PTP is unsupported
  ...
2024-10-10 12:36:35 -07:00
Linus Torvalds 0edab8d132 Merge tag 'trace-ringbuffer-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
 "Ring-buffer fix: do not have boot-mapped buffers use CPU hotplug
  callbacks

  When a ring buffer is mapped to memory assigned at boot, it also
  splits it up evenly between the possible CPUs. But the allocation code
  still attached a CPU notifier callback to this ring buffer. When a CPU
  is added, the callback will happen and another per-cpu buffer is
  created for the ring buffer.

  But for boot mapped buffers, there is no room to add another one (as
  they were all created already). The result of calling the CPU hotplug
  notifier on a boot mapped ring buffer is unpredictable and could lead
  to a system crash.

  If the ring buffer is boot mapped simply do not attach the CPU
  notifier to it"

* tag 'trace-ringbuffer-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Do not have boot mapped buffers hook to CPU hotplug
2024-10-10 12:25:32 -07:00
Linus Torvalds eb952c47d1 Merge tag 'for-6.12-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - update fstrim loop and add more cancellation points, fix reported
   delayed or blocked suspend if there's a huge chunk queued

 - fix error handling in recent qgroup xarray conversion

 - in zoned mode, fix warning printing device path without RCU
   protection

 - again fix invalid extent xarray state (6252690f7e), lost due to
   refactoring

* tag 'for-6.12-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix clear_dirty and writeback ordering in submit_one_sector()
  btrfs: zoned: fix missing RCU locking in error message when loading zone info
  btrfs: fix missing error handling when adding delayed ref with qgroups enabled
  btrfs: add cancellation points to trim loops
  btrfs: split remaining space to discard in chunks
2024-10-10 10:02:59 -07:00
Linus Torvalds 5870963f6c Merge tag 'nfsd-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:

 - Fix NFSD bring-up / shutdown

 - Fix a UAF when releasing a stateid

* tag 'nfsd-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: fix possible badness in FREE_STATEID
  nfsd: nfsd_destroy_serv() must call svc_destroy() even if nfsd_startup_net() failed
  NFSD: Mark filecache "down" if init fails
2024-10-10 09:52:49 -07:00
Linus Torvalds 825ec756af Merge tag 'xfs-6.12-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:

 - A few small typo fixes

 - fstests xfs/538 DEBUG-only fix

 - Performance fix on blockgc on COW'ed files, by skipping trims on
   cowblock inodes currently opened for write

 - Prevent cowblocks to be freed under dirty pagecache during unshare

 - Update MAINTAINERS file to quote the new maintainer

* tag 'xfs-6.12-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix a typo
  xfs: don't free cowblocks from under dirty pagecache on unshare
  xfs: skip background cowblock trims on inodes open for write
  xfs: support lowmode allocations in xfs_bmap_exact_minlen_extent_alloc
  xfs: call xfs_bmap_exact_minlen_extent_alloc from xfs_bmap_btalloc
  xfs: don't ifdef around the exact minlen allocations
  xfs: fold xfs_bmap_alloc_userdata into xfs_bmapi_allocate
  xfs: distinguish extra split from real ENOSPC from xfs_attr_node_try_addname
  xfs: distinguish extra split from real ENOSPC from xfs_attr3_leaf_split
  xfs: return bool from xfs_attr3_leaf_add
  xfs: merge xfs_attr_leaf_try_add into xfs_attr_leaf_addname
  xfs: Use try_cmpxchg() in xlog_cil_insert_pcp_aggregate()
  xfs: scrub: convert comma to semicolon
  xfs: Remove empty declartion in header file
  MAINTAINERS: add Carlos Maiolino as XFS release manager
2024-10-10 09:45:45 -07:00