Linux Tune Network Stack (Buffer Sizes) To Increase Networking Performance

May 20, 2009 · Last updated July 8, 2009


I have two servers located in two different data centers. Both servers handle a lot of concurrent large file transfers, but network performance is very poor and it degrades further as the files get larger. How do I tune TCP under Linux to solve this problem?

By default, the Linux network stack is not configured for high-speed, large file transfers across WAN links; the conservative defaults save memory. You can easily tune the Linux network stack by increasing the network buffer sizes so that servers on high-speed networks can keep more data in flight.

The default maximum Linux TCP buffer sizes are far too small for this kind of workload. Overall TCP memory (tcp_mem) is calculated automatically based on system memory; you can find the actual values by typing the following commands:
$ cat /proc/sys/net/ipv4/tcp_mem
The default and maximum amount for the receive socket memory:
$ cat /proc/sys/net/core/rmem_default
$ cat /proc/sys/net/core/rmem_max

The default and maximum amount for the send socket memory:
$ cat /proc/sys/net/core/wmem_default
$ cat /proc/sys/net/core/wmem_max

The maximum amount of option memory buffers:
$ cat /proc/sys/net/core/optmem_max
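You can also read all of these limits at once with the sysctl command (the exact values printed will vary from system to system):
$ sysctl net.ipv4.tcp_mem net.core.rmem_default net.core.rmem_max \
    net.core.wmem_default net.core.wmem_max net.core.optmem_max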

Tune values

Set the maximum OS send buffer size (wmem) and receive buffer size (rmem) to 12 MB for all protocols. In other words, set the upper limit on the memory that may be allocated to each socket while transferring files; this is a ceiling, not an amount reserved for every connection:

WARNING! The default value of rmem_max and wmem_max is about 128 KB in most Linux distributions, which may be enough for a low-latency, general-purpose network environment or for apps such as a DNS or web server. However, if the latency is large, the defaults might be too small. Please note that the following settings are going to increase memory usage on your server.

# echo 'net.core.wmem_max=12582912' >> /etc/sysctl.conf
# echo 'net.core.rmem_max=12582912' >> /etc/sysctl.conf
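The 12 MB figure is not arbitrary. A common sizing rule (a general guideline, not something stated in this article) is to set the maximum buffer to roughly the bandwidth-delay product (BDP) of the path, i.e. link speed multiplied by round-trip time. For example, for a 1 Gbit/s link with a 100 ms RTT:
# BDP = (1,000,000,000 bits/s / 8) bytes/s * 0.1 s RTT
$ echo $(( 1000000000 / 8 / 10 ))
This prints 12500000, about 12 MB, which lines up with the 12582912-byte maximum used above.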

You also need to set the minimum, default (initial), and maximum TCP buffer sizes in bytes:
# echo 'net.ipv4.tcp_rmem= 10240 87380 12582912' >> /etc/sysctl.conf
# echo 'net.ipv4.tcp_wmem= 10240 87380 12582912' >> /etc/sysctl.conf
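If you want to try these limits before making them permanent, you can apply them at runtime with sysctl -w (an optional step; runtime changes are lost on reboot, which is why the lines above are appended to /etc/sysctl.conf):
# sysctl -w net.core.rmem_max=12582912
# sysctl -w net.core.wmem_max=12582912
# sysctl -w net.ipv4.tcp_rmem='10240 87380 12582912'
# sysctl -w net.ipv4.tcp_wmem='10240 87380 12582912'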

Turn on TCP window scaling, which allows the transfer window to grow beyond 64 KB on high-latency links:
# echo 'net.ipv4.tcp_window_scaling = 1' >> /etc/sysctl.conf
Enable timestamps as defined in RFC1323:
# echo 'net.ipv4.tcp_timestamps = 1' >> /etc/sysctl.conf
Enable selective acknowledgements (SACK):
# echo 'net.ipv4.tcp_sack = 1' >> /etc/sysctl.conf
By default, TCP saves various connection metrics in the route cache when a connection closes, so that connections established in the near future can use them to set initial conditions. Usually this increases overall performance, but it may sometimes cause degradation. Setting the following option to 1 stops TCP from caching metrics on closing connections:
# echo 'net.ipv4.tcp_no_metrics_save = 1' >> /etc/sysctl.conf
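As noted in the comments below, many newer kernels already enable window scaling, timestamps, and SACK by default, so it is worth checking the current values before adding duplicate lines (a quick sanity check, not a required step):
$ sysctl net.ipv4.tcp_window_scaling net.ipv4.tcp_timestamps net.ipv4.tcp_sack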
Set the maximum number of packets queued on the input side when the interface receives packets faster than the kernel can process them:
# echo 'net.core.netdev_max_backlog = 5000' >> /etc/sysctl.conf
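To judge whether the backlog really needs raising, you can watch for receive-side drops in /proc/net/softnet_stat (one line per CPU; on most kernels the second hexadecimal column counts packets dropped because this backlog queue was full):
$ cat /proc/net/softnet_stat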
Now reload the changes:
# sysctl -p
Use tcpdump to watch traffic on eth0 and verify the changes:
# tcpdump -ni eth0
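To confirm the new options are actually negotiated, you can limit the capture to SYN packets and look for the wscale and sackOK options in the handshake (a suggested filter, not part of the original command):
# tcpdump -ni eth0 'tcp[tcpflags] & tcp-syn != 0'
On systems with a reasonably recent iproute2 package, ss can also show per-socket buffer and window details for established connections:
# ss -tmi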


{ 34 comments }

1 Shoaibi May 20, 2009 at 10:21 am

great… you always impress me by your level of knowledge and great explanation skills.

2 blink4blog May 20, 2009 at 10:58 am

It appears that this tuning is not suitable for average workstation users who want a faster/more solid network connection?

3 nixCraft May 20, 2009 at 11:19 am

For the average user (desktop/workstation) the defaults are fine. These settings are for servers that generate lots of traffic while transferring large files.

@Shoaibi: no problem. I'm glad that the information helped you understand new things.

4 Samir May 20, 2009 at 1:12 pm

Great article man. Quality stuff you got here… keep em coming.

5 blink4blog May 20, 2009 at 1:48 pm

It would be great if there were another article on lightly boosting network connectivity for an Ubuntu workstation/notebook.

6 Bash May 20, 2009 at 2:28 pm

Hi

According to this page: http://www.psc.edu/networking/projects/tcptune/#Linux

“Kernels greater than 2.6.17 have full auto tuning and manual tuning is generally not recommended.”

If you are in a position to test the effects of auto vs manual tuning thoroughly, please do so.

Thanks

7 Michael May 21, 2009 at 11:27 am

Can you offer some specific information about this:
1) What kernel were you using?
2) Do you have any numbers to show the changes in performance?
3) Are there any negative effects on your systems by doing this?
4) What are the hardware configs of the systems involved?

TNX

8 nixCraft May 21, 2009 at 12:16 pm

OS RHEL 4.7 server – RHEL kernel 2.6.9.xx – 64bit system.

Hardware – Server Dual Core Xeon 53xx with 16GB RAM – RAID 10 with 15k RPM 73GB disks.

A clear performance decline is observed when low rmem_max and wmem_max limits are used: it causes small window sizes and creates a performance ceiling for large data transfers. Unfortunately, I lost my data file with the transfer rate per second by received size. However, you can easily test this yourself; try comparing socket buffers of 4 KB and 132 bytes.

I've had the same experience with RHEL 5 64-bit (kernel 2.6.18) with 32GB RAM and SAN storage. You need to tweak the kernel.

HTH.

9 Pramod Walke May 23, 2009 at 6:54 am

Good information. Is there any trick to increase a Wine application's performance? I am using Tally on Wine and it runs slowly.

10 James Pearce June 18, 2009 at 9:37 pm

Very nice article. I tweaked these parameters a little to obtain almost twice the real throughput in Samba over a high-latency VPN link.

11 Sam Hall July 8, 2009 at 2:27 am

I think this command is a typo:

# echo ‘sysctl net.ipv4.tcp_sack = 1′ >> /etc/sysctl.conf

…should not contain sysctl?

12 nixCraft July 8, 2009 at 3:11 am

@Sam,

Thanks for the heads-up!

13 Muhammad July 14, 2009 at 8:56 pm

Great info. thanks.

How can I check how much Linux TCP buffer sizes are used on my system?

14 Bill Ward March 8, 2010 at 9:21 pm

Interesting, but only close to what I'm looking for. I have a very large number of small connections (a client connects via TCP, sends a message with a maximum size of 64K, and disconnects). The client may reconnect to send an additional message, but EVERY message has its own connection. I need to handle >20K (and perhaps over 40K) connections per second over GigE; my server seems to be similar in some ways (32GB of RAM on a SAN, running vanilla Linux 5.2 x86-64, with 2 Quad Core Opterons).

Raising the various per-buffer memory settings seems futile, since I'm sending large numbers of very small messages and would be unlikely to see any gain from mostly unused memory; however,

# echo ‘net.core.netdev_max_backlog = 5000′ >> /etc/sysctl.conf

seems promising, although I think 5K may be low in my case. Any other tuning that might be helpful that you can recommend?

15 Leon C March 10, 2011 at 6:10 am

You might want to try a couple of things. Check the ring buffer sizes on your driver using ethtool -g.

You might want to make sure that your socket is opened with the NODELAY flag to turn off Nagle, and don't set timestamps=1 and sack=1; these should be turned off to increase efficiency. They add overhead and usually you don't need them.

max_backlog, depending on the line speed, could be set higher than 5000. I used 30000 on a 10gig link, but I think this may also be related to the max ring buffer size. Would be interested to know which buffer the backlog is dumped into.

Some other settings you might want to look at:
net.ipv4.tcp_slow_start_after_idle = 0

and check if there is any interrupt coalescing going on using ethtool -c, if your driver supports it (e1000 does). Turn that Rx/Tx coalescing off.

If you're sending tons of small messages, all the Nagle stuff and coalescing will hurt you.

16 Darwin May 26, 2011 at 8:21 am

Quite interesting. However, many of the settings are already the default on newer Linux kernels (I'm using 2.6.32 on Ubuntu 10.04). I still can't work out why my Linux workstation with GigE has a transfer rate of 35MBps while my new iMac (OS X) gets 65MBps downloading the same file from the same NAS using the same port & cable. Is there something I'm missing?

17 nixCraft May 26, 2011 at 9:34 am

Are you using CIFS, NFS, or something else? What about jumbo frames and
/proc/sys/net/ipv4/tcp_moderate_rcvbuf ? By default, most modern Linux kernels come with autotuning. It also depends upon your network card and driver.

18 eh June 15, 2011 at 2:12 pm

I think wmem and rmem are generic for sockets and can be used by any protocol. For tcp it is tcp_wmem and tcp_rmem.

19 anony October 19, 2011 at 8:36 pm

Very very nice article… very helpful…

20 Rocky Patel November 15, 2011 at 4:39 pm

Hi Vivek,

I have a small query. Please clarify this for me.

As per your word:
“Tune values

Set the max OS send buffer size (wmem) and receive buffer size (rmem) to 12 MB for queues on all protocols. In other words set the amount of memory that is allocated for each TCP socket when it is opened or created while transferring files:”

Suppose I set 12 MB for wmem and rmem; will each TCP socket then take 12 MB?

Meaning, suppose we have a busy server with 1000 concurrent users; will each TCP connection then take 12 MB of RAM?

Please explain how to calculate it.

And for such tuning, do we need more RAM?

Where does the system take the 12 MB for each TCP socket from? Is it from RAM?

Thanks,
Rocky

21 moa December 13, 2011 at 2:18 pm

“Suppose I set 12 MB for wmem and rmem; will each TCP socket then take 12 MB?

Meaning, suppose we have a busy server with 1000 concurrent users; will each TCP connection then take 12 MB of RAM?”

same question. anybody?

22 Jason April 3, 2012 at 5:40 pm

Won’t this just cause buffer bloat? http://www.bufferbloat.net/

23 Anish May 11, 2012 at 5:26 pm

Hi,

Duplex settings for the ethernet interface are often forgotten, causing duplex mismatches
and packet drops:
ethtool -s eth0 speed 100 duplex full autoneg off
ethtool eth0
Note: the link goes down and comes back up.
For GE links, autoneg should be left enabled.
The transmit (tx) queue length of eth0 can also be adjusted using:
ifconfig eth0 txqueuelen 3000

24 Arade May 11, 2012 at 6:23 pm

Yeah, I think it's bufferbloat; we should not increase the buffer size too much. We should consider the bottleneck on our link while doing such stuff.

25 richard May 22, 2012 at 1:24 pm

What is TX queueing actually?
Is it recommended on a server that needs low latency?
--> ifconfig eth0 txqueuelen 3000 <--

26 Kido July 15, 2012 at 11:54 pm

After modifying these settings, how can I check whether the server gets better speed?

27 Andrés Chandía September 8, 2012 at 11:52 am

Hi there, I'm not sure if I'm suffering from this problem, so I'm not sure if this is a suitable solution for me. Let me explain my case:
I have a little office network with a machine that acts as file, printer, download, and router server; I use it as storage (NFS). Through a web interface users can access an mldonkey instance where they queue things they need to download, and all the network's internet traffic goes through it as well.
I recently increased the memory from 1G to 3G and added a 2TB hard disk for storage. Since then, some workstations, especially the old ones, have problems accessing the files stored on the server: if a user watches a video, it stops sometimes and resumes after a while, and opening folders with a huge amount of data makes the Nautilus window turn gray for a while before showing the content, etc.
These kinds of things didn't happen with 1GB of RAM. I have read that large disks may affect the performance of the machine, but I suspect my problem is related to the RAM addition and the network.
I hope you can help me.
Thanks a lot.

28 vlassius November 27, 2012 at 10:05 pm

Hi, I want to share something that I spent a lot of time solving; the solution goes against any advice on doing this that I found on the internet.

Environment:
A mini network, 3 computers @ 300Mb WiFi, where one is a server (proxy and storage).
Internet access (on the server) goes through a directly connected WiFi adapter.

Effect:
During a file transfer between the server and a computer, when the transfer reaches about 50% of the bandwidth, the internet connection slows down to zero bytes transferred.
(The internet adapter is not used for the file transfer; it is another adapter.)
The load on the server is below 20%.
Server: Fedora Linux kernel 3.6.7-4.fc17.x86_64
Machine 1: Windows 7
Machine 2: Windows 8
This issue is about 18 months old, i.e. as old as the network itself.
As an experienced developer, I tried really a lot of things, got really annoyed with this problem, and found a lot of related problems on the internet while searching for a solution. (I was thinking of trying changes to the kernel source, but I was not psychologically prepared for that.)

Solution:
The solution was simply to increase the network memory parameters to very high values.
This is NOT proposed for a server without extensive testing, but it worked like a charm for me. Every network communication saw a surprising increase in performance, and no more problems.

So, the working parameters are:

# adjust memory to about 1.6 GB – endless memory :-)
net.core.rmem_max=1677721600
net.core.rmem_default=167772160
net.core.wmem_max=1677721600
net.core.wmem_default=167772160
net.core.optmem_max= 2048000

# set minimum size, initial size, and maximum size in bytes
net.ipv4.tcp_rmem= 1024000 8738000 1677721600
net.ipv4.tcp_wmem= 1024000 8738000 1677721600
net.ipv4.tcp_mem= 1024000 8738000 1677721600
net.ipv4.udp_mem= 1024000 8738000 1677721600

The server is now allocating about 500MB to network buffer and everything is running very well.

Regards
vlassius

29 TT March 1, 2013 at 2:23 am

Servers to like RAM for caching whatever rubbish you access on them.
Be sure to have a UPS and do proper shutdowns, or else you're looking at serious corruption and data loss.
Been there, done it, now I have a UPS.

30 TT March 1, 2013 at 2:24 am

There’s a typo there.
Servers to like RAM -> Servers do like RAM

31 Floyd Carp March 13, 2013 at 2:11 pm

Works Great, THANKS!

32 chekkiligili January 28, 2014 at 5:01 am

Hi,

Is there any way to reduce the network latency? We want to verify a specific issue: when there is low latency we run into a problem, so to complete the testing phase we need to simulate that scenario.

Please advise.

33 Anton February 16, 2014 at 8:20 am

Thanks for the article. Should I tune all tcp buffer settings on both sides (i.e. client and server)?

34 max July 17, 2014 at 4:42 pm

Big thanks. It fixed my server performance issues, such as TCP Dup ACK and TCP out-of-order problems.
