HOWTO: Optimize Gigabit Networking in Linux
Even if you have a gigabit networking adapter and a gigabit switch capable of jumbo frames, Linux still uses the default MTU size of 1500. To get something better, you need to configure things by hand.
The reason for this is that the IEEE, which maintains the Ethernet standard, has never standardized anything above 1500. You might very well have gigabit Ethernet equipment that does not support jumbo frames at all, or you may be very disappointed to find out that “jumbo frame” can be used to describe any frame size between 1500 and 9000 bytes.
To make matters worse, not every gigabit Ethernet switch handles mixed networking the same way. You would think a gigabit switch would guarantee a 1gb connection between two computers with 1gb networking adapters, but under various circumstances, this isn’t always the case. Ideally, you would separate your 100mb and 1gb devices onto two different switches, but even this isn’t guaranteed to work.
Now that we have all the caveats out of the way, read on if you want to start optimizing.
First, you’ll need to run ifconfig eth[n] mtu [size] on each machine capable of gigabit networking on your network segment. Start with an MTU of 9000. If you get the error message SIOCSIFMTU: Invalid argument, work your way down in increments of 1000. Once you find a setting that works, work your way back up, splitting the difference each time, to find the maximum MTU.
For example:
sudo ifconfig eth0 mtu 9000
SIOCSIFMTU: Invalid argument
sudo ifconfig eth0 mtu 8000
SIOCSIFMTU: Invalid argument
sudo ifconfig eth0 mtu 7000
sudo ifconfig eth0 mtu 7500
SIOCSIFMTU: Invalid argument
sudo ifconfig eth0 mtu 7300
SIOCSIFMTU: Invalid argument
sudo ifconfig eth0 mtu 7200
In my limited experience, max MTUs are measured in increments of 100.
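If you would rather not hunt by hand, the search above can be scripted. What follows is only a rough sketch under a few assumptions: set_mtu and find_max_mtu are names I made up, it uses ip link set (the iproute2 counterpart of the ifconfig command) instead of ifconfig, and it assumes the driver accepts every MTU up to its maximum and rejects everything above it.

```shell
#!/bin/sh
# Sketch: binary-search the largest MTU a NIC driver will accept.

# set_mtu is a hypothetical helper. It needs root and expects the
# interface name in $IFACE. It succeeds only if the driver takes the MTU.
set_mtu() {
    ip link set dev "$IFACE" mtu "$1" 2>/dev/null
}

# find_max_mtu PROBE LOW HIGH
#   PROBE: command that succeeds when the MTU passed to it is accepted
#   LOW:   an MTU already known to work (1500 is the Ethernet default)
#   HIGH:  the largest MTU worth trying
find_max_mtu() {
    probe=$1
    low=$2
    high=$3
    while [ "$low" -lt "$high" ]; do
        # Round up so the loop always makes progress.
        mid=$(( (low + high + 1) / 2 ))
        if "$probe" "$mid"; then
            low=$mid
        else
            high=$(( mid - 1 ))
        fi
    done
    echo "$low"
}

# Example (as root):
#   IFACE=eth0
#   find_max_mtu set_mtu 1500 9000
```

A failed attempt leaves the MTU unchanged, so the interface ends up at the last value that was accepted, which is the value printed.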
When you have the MTU for all of your machines, you will want to go with the lowest common denominator. On two machines, one preferably being the one with the lowest MTU, run this command:
sudo ifconfig eth[n] mtu [size]
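Picking the value to use is just a matter of taking the smallest maximum you found. As a small sketch (lowest_mtu is a made-up helper name, and the numbers in the example are only illustrative):

```shell
#!/bin/sh
# lowest_mtu prints the smallest of the MTU values given as arguments.
lowest_mtu() {
    min=$1
    for mtu in "$@"; do
        if [ "$mtu" -lt "$min" ]; then
            min=$mtu
        fi
    done
    echo "$min"
}

# Example: three machines whose maximum MTUs were 9000, 7200 and 7154
#   lowest_mtu 9000 7200 7154
```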
Now that we have our test machines, it’s time to make sure our switch supports the frame size. Run this command on either machine to reach the other one:
ping -s [MTU-28] -M do [ip-address]
We have to account for the 8-byte ICMP header and the 20-byte IP header, hence subtracting 28 from our determined MTU. This runs the ping command with a packet that exactly fills our determined MTU and forbids fragmentation (-M do). If you get responses, you’re in the clear. If not, either the pinged computer is dropping pings or your switch is refusing the frame size. If it’s the latter, you’ll have to do a bit more discovery on the packet size, in a fashion similar to what you used to discover the lowest max MTU to begin with.
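To save yourself the mental arithmetic, the payload size can be computed on the fly. This is a minimal sketch that assumes plain IPv4 with no IP options (20 bytes of IP header plus 8 bytes of ICMP header); ping_payload is a name I made up for illustration.

```shell
#!/bin/sh
# ping_payload prints the value to pass to ping -s for a given MTU:
# the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header.
ping_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Example: three probes at an MTU of 7200 with fragmentation forbidden
#   ping -c 3 -M do -s "$(ping_payload 7200)" [ip-address]
```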
Once you have an MTU size, you can run sudo ifconfig eth[n] mtu [size] on all the machines to have an immediate effect. If you’re using DHCP, add this line under the appropriate stanza for your network card in /etc/network/interfaces to make changes permanent:
pre-up /sbin/ifconfig $IFACE mtu [size]
If you’re using static addressing, you can get away with
mtu [size]
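For reference, this is roughly what a complete DHCP stanza looks like with the pre-up line in place. Treat it as a sketch: dhcp_stanza is a made-up helper name, eth0 and 7200 are placeholder values, and the auto line is my assumption about a typical setup. It only prints the stanza so you can compare it with your own file; it does not edit anything.

```shell
#!/bin/sh
# dhcp_stanza IFACE MTU
# Prints an /etc/network/interfaces stanza for a DHCP interface that
# sets the MTU before the interface comes up.
dhcp_stanza() {
    iface=$1
    mtu=$2
    printf 'auto %s\n' "$iface"
    printf 'iface %s inet dhcp\n' "$iface"
    # $IFACE is printed literally; ifupdown fills it in at run time.
    printf '    pre-up /sbin/ifconfig $IFACE mtu %s\n' "$mtu"
}

# Example:
#   dhcp_stanza eth0 7200
```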
In my experience, this gives me an increase of network speed from 15MB/s (126mb/s) to 38MB/s (318mb/s). And from my research, this is to be expected. No one has ever reached anywhere near the theoretical 119MB/s (1gb/s) limit of gigabit. Whereas we got a little over half the theoretical limit of 100mb/s (7MB/s or 59mb/s), one can only get a little over one-third of the theoretical limit of gigabit.
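If you want to convert between the two kinds of units yourself, here is a small sketch using awk. The function names are made up, and it assumes MB/s means 1024 × 1024 bytes per second and mb/s means 1,000,000 bits per second, which is how I am using them here.

```shell
#!/bin/sh
# mib_to_mbit: MB/s (1024*1024 bytes) -> mb/s (1,000,000 bits)
mib_to_mbit() {
    awk -v v="$1" 'BEGIN { printf "%.2f\n", v * 1048576 * 8 / 1000000 }'
}

# mbit_to_mib: mb/s (1,000,000 bits) -> MB/s (1024*1024 bytes)
mbit_to_mib() {
    awk -v v="$1" 'BEGIN { printf "%.2f\n", v * 1000000 / 8 / 1048576 }'
}

# Examples:
#   mib_to_mbit 15     # a measured 15MB/s expressed in mb/s
#   mbit_to_mib 1000   # the ceiling of a 1gb/s link in MB/s
```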
UPDATE 03/24/2010: I recently upgraded my desktop, and the Windows drivers for the new motherboard’s onboard NIC (Realtek PCIe GBE) gave options for 2KB – 9KB, and those KB turned out to be the binary definition of kilobyte. Well, the binary definition minus 14 bytes. Not sure what ate them up, but the 7KB setting’s MTU was 7154 bytes not 7168 bytes.
UPDATE 04/17/2010: Portuguese University IP = faculty + students + wifi freeloaders. Only the first should be taken as a professional source…that should be focused on work instead of using university resources for personal matters.
strange 4:52 pm on June 11, 2009
why do the MTU’s have to match? a workstation accessing the internet will have several devices between it and the internet, a linksys router running linux being a common one, and it will be a much lower MTU.
the lowest common denominator will be all the gear out at the boundaries, which all the machines and servers will likely have to talk to at some point. so what do you do then?
brainwreckedtech 11:11 pm on June 12, 2009
The MTUs don’t have to match unless you enjoy having your LAN speed crippled as your computers break apart packets on their own trying to reach a common denominator. While the advice here is for optimizing the speed that computers communicate on a LAN, not the Internet, keep in mind that computers with bigger MTUs will have no trouble accepting smaller packets from computers with smaller MTUs. Your Internet download speeds won’t be affected, but your upload speeds might. However, most people’s Internet connection speed (in the US, at least) doesn’t even hit 1mb/s upstream. Factor that measly speed with all the latency due to routing, server capacity, etc., and the upload speed degradation from mismatched MTUs with the Internet becomes the least of your problems.
By no means should you adjust the MTU of a machine on your LAN if its sole purpose is to upload data to the billions of anonymous users on the net. At the same time, you should consider getting that machine off your private network in the event of a security breach. A simple double-NAT will do.
BlueSherpa 2:17 pm on February 7, 2010
Corrections:
The “theoretical” limit of gigabit, also known as the wire speed, is 125MB/s (wikipedia, 2010).
“one can only get a little over one-third of the theoretical limit of gigabit” is not true. 900Mb/s can be attained at the normal 1500 byte MTU setting (Schluting, 2007).
Wikipedia http://en.wikipedia.org/wiki/Gigabit_Ethernet
Schluting http://www.enterprisenetworkingplanet.com/nethub/article.php/3485486
brainwreckedtech 4:33 pm on February 8, 2010
125mB/s (m being read as the decimal “mega” of 1000) was never in dispute, but I did botch my original 101MB/s (M being read as the computer “mega” of 1024). The correct calculation is 1,000,000,000 bits ÷ 8 bits/byte ÷ 1024 bytes/kilobyte ÷ 1024 kilobytes/megabyte = 119.21MB/s.
I’ll recant my “never,” but Schluting used server-class hardware and mem-to-mem copying. Consumers are going to be hard-pressed to find such equipment and are more apt to re-use old equipment and go by drive-to-drive copying over Ethernet. Following his advice gave my speed a bump to an average of 43MB/s, with the range anywhere from 36MB/s to 50MB/s. Nice, but far short of 119MB/s.
august 8:39 pm on February 10, 2010
In networking, “mega” was always the proper “mega” – 1000000.
The only thing that really ever used the 2^10 thing was memory sizes, because they naturally come in powers of 2.
(And please use the proper prefixes (MiB etc) if you’re going to use the binary variant.)
So 125MB/s is the right number.
Also, if you’re doing drive-to-drive copying, you’re probably measuring the speed of your disk, and not the ethernet.
brainwreckedtech 4:37 am on February 11, 2010
Now you’re just picking nits.
2^10 is used for all storage sizes, not just memory.
125mB/s is a correct number. So is 119MB/s when that number is the theoretical ceiling that will be reported by any OS transferring a file.
I’ve seen MiB, and guess what? Fuck it. Using M for 2^20 and Mi for million is great, but what do you do for the giga level? Do they use G for 2^30 and Tr for trillion? NO, THEY USE Gi FOR TRILLION. And you can’t go back and say Gi = decimal giga because, if that was their intent, they should have used Me for decimal mega. So you can try and use that system if you want. I see no harm in using caps for bigger values and lowercase for smaller values because, at least to me, it makes sense and can be made consistent.
Finally, where do you think the data is coming from that’s being served over Ethernet? The config I’m using is two 250GB Samsung SP2504C drives in software RAID 0 in a file server. These drives are rated for average random reads and writes of around 45MB/s. RAID 0 absolutely can double the average speed of random reads and writes, so that gives me a ceiling of 90MB/s. There is negligible, if any, difference between Linux software RAID and hardware RAID. I was only getting 50MB/s tops, so it wasn’t the hard drives. The NICs are on-board, so it isn’t the PCI bus. And even if it was, it was still below the 78MB/s limit of 33MHz PCI.
ah 6:35 am on March 12, 2014
when I test, I don’t copy data from and to actual disk drives, I use dd to copy /dev/zero over the network to /dev/null at the other end, which cuts out the disk drive completely. Do you not use this method?
BrainwreckedTech 4:02 am on March 19, 2014
Not a bad idea, but shortcuts (large stream of only 0’s) can be taken at the kernel level. Better to take an ISO (or file created with /dev/urandom) inside a tmpfs and copy that.
Darr247 12:02 pm on December 16, 2010
You’ve got it backwards, so no wonder you think it’s a bad idea. 😉
M=10^6; Mi=2^20
The idea is, hard drive manufacturers used (some would argue “correctly”) the SI version of mega (ergo influenced what people’s idea of MB should be on computers) to mean 1,000,000 bytes, so back at the turn of the century the IEC designated MiB to mean 1,048,576 bytes.
It’s pronounced mebibytes, short for mega binary bytes.
Likewise,
G = 10^9; Gi = 2^30 (giga/gibi)
T = 10^12; Ti = 2^40 (tera/tebi)
et cetera (P/Pi, E/Ei, Z/Zi) up to
Y = 10^24; Yi = 2^80 (yotta/yobi)
I don’t know that SI has abbreviations for values greater than 10^24… but when we get to 10^100 capacities, hard drive manufacturers will no doubt start using “googlebytes” thanks to the non-math guy who originally registered the domain name (instead of “googol” which is what the search engine’s inventors intended).
Implicitly, b=bits and B=bytes, also.
I’ve never seen anyone use G or Gi for trillion. Got a cite?
Finally, thanks for the linux gigabit tuning tips.
BrainwreckedTech 6:07 am on December 26, 2010
You’re right about me being backwards on X and Xi.
The hard drive manufacturers started using the decimal interpretation because it meant they could advertise bigger numbers. (Never attribute to malice what can be adequately explained by stupidity…or greed.) No one paid much mind back in the days of the kilobyte because the difference was small, and we’re used to a bit of fibbing from marketing departments. The excuse, “you lose some space due to formatting,” held enough truth to keep most users calm. I can’t recall exactly when the difference became a major ordeal, but fuzzy hindsight says it probably came with the introduction of the first gigabyte drives — kinda hard to chalk the drop from 1GB to 953MB to formatting.
Tangent: Networking came after hard drives. They took a page from the hard drive manufacturers’ playbook and took it a step further by never graduating beyond bits.
You can see Gi in use in Gnome and KDE. Besides, by definition, it is one trillion. 1,000,000,000 or 1,073,741,824 depending on which definition.
Pyrrhic 1:26 pm on February 23, 2011
Thank you for posting this. I had a SIOC… etc error, and your post allowed me to fix it. The MTU discussion is also very, very useful. Again…thanks!
Best,
P.
Markus Torstensson 6:19 am on June 23, 2011
Thanks dude 😀 Works like a charm. Kinda shame about the prefix-flamewar you had to put up with.