twx-linux/include
Josh Hunt 32b293a53d IPv6: Avoid taking write lock for /proc/net/ipv6_route
During some debugging I needed to look into how /proc/net/ipv6_route
operated and in my digging I found its calling fib6_clean_all() which uses
"write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
found this on 2.6.32, but reading the code I believe the same basic idea
exists currently. Looking at the rtnetlink code they are only calling
"read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
reading from proc isn't the recommended way of fetching the ipv6 route
table; taking a write lock seems unnecessary and would probably cause
network performance issues.

To verify this I loaded up the ipv6 route table and then ran iperf in 3
cases:
  * doing nothing
  * reading ipv6 route table via proc
    (while :; do cat /proc/net/ipv6_route > /dev/null; done)
  * reading ipv6 route table via rtnetlink
    (while :; do ip -6 route show table all > /dev/null; done)

* Load the ipv6 route table up with:
  * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

* iperf commands:
  * client: iperf -i 1 -V -c <ipv6 addr>
  * server: iperf -V -s

* iperf results - 3 runs each (in Mbits/sec)
  * nothing: client: 927,927,927 server: 927,927,927
  * proc: client: 179,97,96,113 server: 142,112,133
  * iproute: client: 928,927,928 server: 927,927,927

lock_stat shows taking the write lock is causing the slowdown. Using this
info I decided to write a version of fib6_clean_all() which replaces
write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
this new function I see the same results as with my rtnetlink iperf test.

Signed-off-by: Josh Hunt <joshhunt00@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-30 17:07:33 -05:00
..
acpi
asm-generic Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-30 13:04:14 -05:00
crypto
drm drm/radeon/kms: add some new pci ids 2011-12-14 12:29:03 +00:00
keys
linux unix_diag: Fixup RQLEN extension report 2011-12-30 16:46:02 -05:00
math-emu
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2011-12-20 10:49:39 -08:00
misc
mtd
net IPv6: Avoid taking write lock for /proc/net/ipv6_route 2011-12-30 17:07:33 -05:00
pcmcia
rdma
rxrpc
scsi [SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set skb priority 2011-12-15 11:02:07 +04:00
sound
target target: remove the unused se_dev_list 2011-12-06 06:00:57 +00:00
trace writeback: show writeback reason with __print_symbolic 2011-12-18 14:20:17 +08:00
video
xen Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel" 2011-12-19 09:30:35 -05:00
Kbuild