“We just upgraded from a T1 (1.544 Mbps) to a T3 (44.736 Mbps), so why is it still taking 90 minutes to copy that file?”
The simple answer is that more capacity is not the same as faster bytes. If you increase your capacity almost 29-fold (44.736 / 1.544), you may be able to copy 29 files in the same 90 minutes, but you probably will not be able to copy that one file any faster. I touched upon this last year in the blog titled “Application Performance Problems and Latency”, but I think file transfers and copy_file require a more focused blog.
The key has to do with window sizes, that is, how many bytes the sending host/application can send without having to stop and wait for an acknowledgment from the receiving host/application. Some applications, like FTP, do not have an application layer acknowledgment and rely on TCP. In this case the window size is based on what the remote host’s TCP layer advertises, up to 64K bytes, which is the maximum that STCP supports. Other applications have an application layer window; for these applications the limiting window is the smaller of the TCP and application windows.
The maximum throughput is calculated as
W / RTT = T
Where
W is the window size in bytes
RTT is the round trip time in seconds
T is throughput in bytes per second
Round trip time is the time it takes to send some data and get the acknowledgment. For large distances, e.g. cross country, that time is based primarily on the distance that the bytes must travel and the processing speed of all the network equipment between the end points. The processing speed of the receiving host, packet sizes (TCP MSS value, see “An Easy way to improve throughput across subnets”) and of course the bandwidth of the slowest link play smaller roles.
For example, if the RTT is 0.050 seconds and the window is 64K bytes, the maximum throughput will be 1,310,720 bytes per second (64 * 1024 / 0.050). As long as the unused capacity of the link is greater than that throughput, increasing the capacity will not shorten the transfer time.
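The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the W / RTT = T formula above; the function name max_throughput is my own and is not part of any OpenVOS or STCP tool.

```python
def max_throughput(window_bytes, rtt_seconds):
    """Upper bound on throughput in bytes per second: T = W / RTT."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be greater than zero")
    return window_bytes / rtt_seconds


# The example from the text: a 64K window and a 0.050 second round trip.
limit = max_throughput(64 * 1024, 0.050)
print(f"{limit:,.0f} bytes per second")

# Halving the RTT doubles the limit; adding link capacity does not change it.
print(f"{max_throughput(64 * 1024, 0.025):,.0f} bytes per second")
```

Note that the link capacity does not appear anywhere in the calculation, which is the point of the whole discussion.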
T is the maximum possible throughput. Nothing you do (short of increasing the window or reducing the RTT) can make things go faster BUT hostile network conditions may make things go considerably slower.
Estimating transfer time is just a matter of dividing the file size by the throughput:
(F / W) * RTT = Ti
Where
F is the file size in bytes
W is the window size in bytes
RTT is the round trip time in seconds
Ti is time in seconds to transfer the file
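As a rough sketch, the transfer time formula can be written as a small Python function. The name transfer_time and the 100 MiB file size are made up for illustration; the window and RTT are the values used in the earlier example.

```python
def transfer_time(file_bytes, window_bytes, rtt_seconds):
    """Best-case transfer time in seconds: Ti = (F / W) * RTT."""
    if window_bytes <= 0 or rtt_seconds <= 0:
        raise ValueError("window and RTT must be greater than zero")
    # F / W is the number of windows that must be sent; each one costs
    # a full round trip before the next can go out.
    return (file_bytes / window_bytes) * rtt_seconds


# Hypothetical 100 MiB file, 64K window, 0.050 second RTT.
seconds = transfer_time(100 * 1024 * 1024, 64 * 1024, 0.050)
print(f"{seconds:.0f} seconds")
```

Remember that this is the best case; a hostile network can only make the real number larger.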
OSL has an application window size of 4K bytes, and the OpenVOS file system reports file sizes in blocks of 4K bytes, so the window is exactly one block. To estimate the time to copy a file between two OpenVOS systems using the copy_file command, the formula reduces to
F * RTT = Ti
Where
F is the file size in blocks
RTT is the round trip time in seconds
Ti is time in seconds to transfer the file
On a link with a 0.050 second RTT it will take OSL a little over 41 minutes to copy a 50,000 block file (50,000 * 0.05 = 2,500 seconds). The maximum throughput will be 81,920 bytes per second (4096 / 0.050). As long as the unused network capacity of your link is greater than 81,920 bytes per second, adding capacity will not reduce the time it takes to copy the file.
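To tie this back to the opening question, here is a small Python sketch that computes the OSL estimate and checks whether the window or the link is the limit. The function names are hypothetical, and the link figures are the nominal T1 and T3 rates quoted at the top, ignoring framing overhead, protocol overhead, and any other traffic on the link.

```python
OSL_WINDOW_BYTES = 4096          # application window from the text (4K)
T1_BITS_PER_SECOND = 1_544_000   # nominal line rates, overhead ignored
T3_BITS_PER_SECOND = 44_736_000


def osl_copy_seconds(file_blocks, rtt_seconds):
    """Ti = F * RTT, with F in 4K blocks (one block per round trip)."""
    return file_blocks * rtt_seconds


def limiting_factor(link_bits_per_second, rtt_seconds):
    """Return 'window' or 'link', whichever caps throughput first."""
    window_limit = OSL_WINDOW_BYTES / rtt_seconds   # bytes per second
    link_limit = link_bits_per_second / 8           # bytes per second
    return "window" if window_limit < link_limit else "link"


seconds = osl_copy_seconds(50_000, 0.050)
print(f"{seconds / 60:.1f} minutes")
print("T1:", limiting_factor(T1_BITS_PER_SECOND, 0.050))
print("T3:", limiting_factor(T3_BITS_PER_SECOND, 0.050))
```

With a 0.050 second RTT the window is the limit on both links, so under these assumptions the upgrade from a T1 to a T3 changes nothing for a single copy_file.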
The simplest way to measure round trip time is with ping. Unfortunately, ping doesn’t give you one number for RTT but multiple numbers and the numbers may vary significantly, for example:
ping 172.16.1.80
Pinging host 172.16.1.80 : 172.16.1.80
ICMP Echo Reply:TTL 53 time = 418 ms
ICMP Echo Reply:TTL 53 time = 107 ms
ICMP Echo Reply:TTL 53 time = 91 ms
ICMP Echo Reply:TTL 53 time = 100 ms
Host 172.16.1.80 replied to all 4 of the 4 pings
ready 12:03:15

If you are interested in calculating the best possible throughput, use the lowest number. If you are interested in a reasonable estimate of what you should expect, use a ping count of 10, toss out the two highest and two lowest numbers, and average the rest.

ping 172.16.1.80 -count 10
Pinging host 172.16.1.80 : 172.16.1.80
ICMP Echo Reply:TTL 53 time = 89 ms
ICMP Echo Reply:TTL 53 time = 96 ms
ICMP Echo Reply:TTL 53 time = 95 ms
ICMP Echo Reply:TTL 53 time = 105 ms
ICMP Echo Reply:TTL 53 time = 186 ms
ICMP Echo Reply:TTL 53 time = 87 ms
ICMP Echo Reply:TTL 53 time = 90 ms
ICMP Echo Reply:TTL 53 time = 90 ms
ICMP Echo Reply:TTL 53 time = 89 ms
ICMP Echo Reply:TTL 53 time = 96 ms
Host 172.16.1.80 replied to all 10 of the 10 pings
ready 12:12:39
calc 96 + 95 + 90 + 90 + 89 + 96
556
ready 12:12:49
calc 556 / 6
92.6666666666667
ready 12:12:57
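If you would rather not do the tossing and averaging by hand, the procedure can be sketched in Python. This assumes the ping output has been captured as text in the "time = N ms" format shown above; the function names are my own, not part of any ping utility.

```python
import re


def parse_ping_times(output):
    """Pull the 'time = N ms' values out of captured ping output."""
    return [int(ms) for ms in re.findall(r"time\s*=\s*(\d+)\s*ms", output)]


def trimmed_mean(samples, trim=2):
    """Drop the `trim` lowest and `trim` highest samples, average the rest."""
    if len(samples) <= 2 * trim:
        raise ValueError("not enough samples to trim")
    kept = sorted(samples)[trim:len(samples) - trim]
    return sum(kept) / len(kept)


# The ten reply times from the ping output above, in milliseconds.
times_ms = [89, 96, 95, 105, 186, 87, 90, 90, 89, 96]

best_case_rtt = min(times_ms) / 1000             # for best possible throughput
expected_rtt = trimmed_mean(times_ms) / 1000     # for a reasonable estimate
print(f"best case {best_case_rtt:.3f} s, expected {expected_rtt:.4f} s")
```

The resulting RTT, in seconds, is what goes into the throughput and transfer time formulas.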
As I said above, a hostile network can slow things down significantly. Ping timeouts are an indication of this, so any timeouts mean that these calculations are probably significantly optimistic. Transfers can also be slowed by busy disks and/or CPUs. These can prevent the sender from reading and sending the data as fast as possible, or cause the receiver to read the data more slowly than it arrives, which delays application acknowledgments (increasing round trip time) and may fill the TCP receive buffers, causing the sending host to stop transmitting. The only way to know if any of these conditions apply is with a packet trace.