kernel-hacking-2024-linux-s.../net/ipv4
Herbert Xu 69d1506731 [TCP]: Let skbs grow over a page on fast peers
While testing the virtio-net driver on KVM with TSO I noticed
that TSO performance with a 1500 MTU is significantly worse
compared to the performance of non-TSO with a 16436 MTU.  The
packet dump shows that most of the packets sent are smaller
than a page.

Looking at the code this is actually quite obvious, as it always
stops extending the packet if it's the first packet yet to be
sent and if it's larger than the MSS.  Since each extension is
bound by the page size, this means that (given a 1500 MTU) we're
very unlikely to construct packets greater than a page, provided
that the receiver and the path are fast enough so that packets can
always be sent immediately.

The fix is also quite obvious.  The push calls inside the loop
are just an optimisation so that we don't end up doing all the
sending at the end of the loop.  Therefore there is no specific
reason why they have to happen at MSS boundaries.  For TSO, the
most natural extension of this optimisation is to do the pushing
once the skb exceeds the TSO size goal.
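
The effect described above can be sketched with a small userspace
model.  This is not the kernel code: model_send(), struct push_stats
and MODEL_PAGE_SIZE are hypothetical names, and the 4096-byte page,
1448-byte MSS and 65160-byte size goal (45 * 1448) used in the note
below are assumed values chosen for illustration.  The model also
assumes the peer is fast enough that the skb being built is always
the first one waiting to be sent.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size, illustration only */

struct push_stats {
	unsigned int skbs;      /* number of skbs pushed */
	unsigned int max_len;   /* largest skb pushed, in bytes */
};

/*
 * Hypothetical model of the send loop: an skb grows by at most one
 * page per extension, never beyond size_goal, and is pushed as soon
 * as its length reaches push_threshold (or the data runs out).
 */
static struct push_stats model_send(unsigned int total,
				    unsigned int push_threshold,
				    unsigned int size_goal)
{
	struct push_stats st = { 0, 0 };
	unsigned int len = 0;

	assert(push_threshold > 0 && size_goal >= push_threshold);

	while (total > 0) {
		unsigned int copy = total;

		if (copy > MODEL_PAGE_SIZE)     /* extension bound by a page */
			copy = MODEL_PAGE_SIZE;
		if (copy > size_goal - len)     /* never exceed the size goal */
			copy = size_goal - len;

		len += copy;
		total -= copy;

		/* Keep extending until the threshold is reached. */
		if (len < push_threshold && total > 0)
			continue;

		/* Push the skb and start building a new one. */
		st.skbs++;
		if (len > st.max_len)
			st.max_len = len;
		len = 0;
	}
	return st;
}
```

With these assumed numbers, model_send(65536, 1448, 65160) pushes at
the MSS and yields 16 skbs of one page each, while
model_send(65536, 65160, 65160) pushes at the size goal and yields
2 skbs, the larger being 65160 bytes.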

This is what the patch does and testing with KVM shows that the
TSO performance with a 1500 MTU easily surpasses that of a 16436
MTU and indeed the packet sizes sent are generally larger than
16436.

I don't see any obvious downsides for slower peers or connections,
but it would be prudent to test this extensively to ensure that
those cases don't regress.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 15:47:05 -07:00
ipvs ipvs: Make wrr "no available servers" error message rate-limited 2008-02-05 20:00:10 -08:00
netfilter [NETFILTER]: ipt_recent: sanity check hit count 2008-03-20 15:07:10 -07:00
af_inet.c [NET] endianness noise: INADDR_ANY 2008-03-17 22:44:53 -07:00
ah4.c [IPSEC]: Fix bogus usage of u64 on input sequence number 2008-02-12 22:50:35 -08:00
arp.c Revert "[NDISC]: Fix race in generic address resolution" 2008-02-17 18:39:54 -08:00
cipso_ipv4.c NetLabel: introduce a new kernel configuration API for NetLabel 2008-02-05 09:44:20 -08:00
datagram.c
devinet.c [IPV4]: Reset scope when changing address 2008-02-26 18:42:41 -08:00
esp4.c [IPV4]: esp_output() misannotations 2008-03-17 22:50:23 -07:00
fib_frontend.c [NETNS]: Lookup in FIB semantic hashes taking into account the namespace. 2008-01-31 19:28:41 -08:00
fib_hash.c ipv4/fib_hash.c: fix NULL dereference 2008-02-19 16:28:54 -08:00
fib_lookup.h
fib_rules.c [IPV4]: Consolidate fib_select_default. 2008-01-28 15:11:02 -08:00
fib_semantics.c [NETNS]: Lookup in FIB semantic hashes taking into account the namespace. 2008-01-31 19:28:41 -08:00
fib_trie.c fib_trie: /proc/net/route performance improvement 2008-02-12 17:53:31 -08:00
icmp.c [ICMP]: Restore pskb_pull calls in receive function 2008-02-05 03:15:50 -08:00
igmp.c [IGMP]: Optimize kfree_skb in igmp_rcv. 2008-02-09 23:22:26 -08:00
inet_connection_sock.c [SOCK] proto: Add hashinfo member to struct proto 2008-02-03 04:28:52 -08:00
inet_diag.c [NETNS]: Tcp-v6 sockets per-net lookup. 2008-01-31 19:28:20 -08:00
inet_fragment.c [NETNS][FRAGS]: Make the pernet subsystem for fragments. 2008-01-28 15:10:40 -08:00
inet_hashtables.c [INET]: Unexport inet_listen_wlock 2008-02-13 17:40:25 -08:00
inet_lro.c
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c [IPV4]: Fix null dereference in ip_defrag 2008-03-21 15:01:50 -07:00
ip_gre.c [INET]: Don't create tunnels with '%' in name. 2008-02-26 23:51:04 -08:00
ip_input.c
ip_options.c
ip_output.c [NET]: Introducing socket mark socket option. 2008-01-31 19:27:19 -08:00
ip_sockglue.c [NET] endianness noise: INADDR_ANY 2008-03-17 22:44:53 -07:00
ipcomp.c [IPCOMP]: Disable BH on output when using shared tfm 2008-02-28 11:23:17 -08:00
ipconfig.c [NET] endianness noise: INADDR_ANY 2008-03-17 22:44:53 -07:00
ipip.c [INET]: Don't create tunnels with '%' in name. 2008-02-26 23:51:04 -08:00
ipmr.c [NETNS]: Add namespace parameter to ip_route_output_key. 2008-01-28 15:11:07 -08:00
Kconfig [ESP]: Add select on AUTHENC 2008-03-04 14:29:21 -08:00
Makefile
netfilter.c [NETNS]: Add namespace parameter to ip_route_output_key. 2008-01-28 15:11:07 -08:00
proc.c [NETNS][FRAGS]: Make the mem counter per-namespace. 2008-01-28 15:10:36 -08:00
protocol.c
raw.c [RAW]: Wrong content of the /proc/net/raw6. 2008-01-31 19:27:26 -08:00
route.c [IPV4]: Use proc_create() to setup ->proc_fops first 2008-02-28 14:14:25 -08:00
syncookies.c [NETNS]: Add namespace parameter to ip_route_output_key. 2008-01-28 15:11:07 -08:00
sysctl_net_ipv4.c [TCP]: Fix a bug in strategy_allowed_congestion_control 2008-01-31 19:28:23 -08:00
tcp.c [TCP]: Let skbs grow over a page on fast peers 2008-03-22 15:47:05 -07:00
tcp_bic.c [TCP]: BIC web page link is corrected. 2008-02-28 22:14:32 -08:00
tcp_cong.c
tcp_cubic.c
tcp_diag.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c [TCP]: Must count fack_count also when skipping 2008-03-03 12:10:16 -08:00
tcp_ipv4.c [TCP]: Fix tcp_v4_send_synack() comment 2008-02-17 22:29:19 -08:00
tcp_lp.c
tcp_minisocks.c
tcp_output.c [TCP]: Fix shrinking windows with window scaling 2008-03-20 16:11:27 -07:00
tcp_probe.c
tcp_scalable.c
tcp_timer.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tunnel4.c
udp.c [NETNS]: Udp sockets per-net lookup. 2008-01-31 19:28:21 -08:00
udp_impl.h
udplite.c [IPV4] UDP,UDPLITE: Sparse: {__udp4_lib,udp,udplite}_err() are of void. 2008-01-28 15:10:24 -08:00
xfrm4_input.c
xfrm4_mode_beet.c [IPSEC] xfrm4_beet_input(): fix an if() 2008-02-05 02:51:39 -08:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c
xfrm4_policy.c [NET]: should explicitely initialize atomic_t field in struct dst_ops 2008-01-31 19:27:23 -08:00
xfrm4_state.c
xfrm4_tunnel.c [IPCOMP]: Fix reception of incompressible packets 2008-01-31 19:27:24 -08:00