I would like to expand on Paul’s recent blog post, “Is your pre-production testing effective?”. Paul covered CPU utilization and code paths, but there is another very important aspect of many applications: network utilization. Many applications are tested in a LAN environment with sub-millisecond latencies and bandwidth of at least 100 Mbps. They are then deployed across a WAN with far less bandwidth and much higher latency. The results can be a disaster.
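To put rough numbers on that gap, here is a back-of-the-envelope sketch in Python. The function name, the 200-exchange workload and the WAN figures (80 ms round trip, 2 Mbps) are assumptions chosen purely for illustration, and the model ignores TCP slow start, retransmissions and protocol overhead.

```python
def transfer_time_s(round_trips, payload_bytes, rtt_ms, bandwidth_mbps):
    """Rough lower bound: time spent waiting on round trips plus time on the wire."""
    latency_s = round_trips * rtt_ms / 1000.0
    serialization_s = payload_bytes * 8 / (bandwidth_mbps * 1_000_000)
    return latency_s + serialization_s

# Hypothetical operation: 200 request/response exchanges moving 1 MB in total.
lan = transfer_time_s(200, 1_000_000, rtt_ms=0.5, bandwidth_mbps=100)
wan = transfer_time_s(200, 1_000_000, rtt_ms=80, bandwidth_mbps=2)

print(f"LAN: {lan:.2f} s, WAN: {wan:.2f} s")  # LAN: 0.18 s, WAN: 20.00 s
```

With these assumed figures, 16 of the 20 WAN seconds are spent waiting on round trips rather than moving data, which is why a chatty design that feels instant on a LAN can become unusable once deployed.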
While testing in a LAN environment will uncover most network-related bugs, some bugs, such as those triggered by dropped packets or unexpected segmentation (TCP will NOT maintain your message boundaries), are less likely to show up there. You are also much less likely to notice a suboptimal design on a fast LAN than on a slow WAN. It is therefore very important to test any network-based application under the worst network conditions you anticipate: low bandwidth, high latency and, don’t forget, packet loss.
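The message-boundary point deserves a concrete illustration. Code that assumes one recv() returns exactly one message often works on a LAN and then breaks on a WAN. A common remedy is to frame each message yourself. The sketch below uses a 4-byte length prefix; that wire format and the helper names are my own assumptions for illustration, not a standard.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian length prefix

def send_message(sock, payload: bytes) -> None:
    """Send one message as <length><payload>."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exactly(sock, n: int) -> bytes:
    """Keep calling recv() until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)  # may return fewer bytes than requested
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_message(sock) -> bytes:
    """Read the length prefix, then read exactly that many payload bytes."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)

# Local demonstration over a connected socket pair.
a, b = socket.socketpair()
send_message(a, b"hello")
send_message(a, b"world!")
first = recv_message(b)
second = recv_message(b)
a.close()
b.close()
```

Both messages come back intact no matter how the bytes were split or coalesced in transit, because the receiver relies on the length prefix rather than on how much a single recv() happens to return.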
There are two ways to do this.
The first is to use the actual environment. Put the server (or client) on a host out in the network and see how it works. This has the advantage of using the actual infrastructure. The disadvantage is that you have no control over the environment, something that is critical when you are trying to duplicate a problem or test a bug fix.
The second is to use a WAN simulator. There are hardware and software-only simulators, both commercial and open source (free). The advantage here is that you have total control over latency, packet drop rates and other network parameters, and you do not have to involve other groups (i.e. put your software on someone else’s system). The disadvantages are cost and a learning curve. Even if you use free software, you have to provide a system (typically some flavor of Unix) to run it on and learn how to use it. Several years ago I wrote a tutorial on Dummynet. At the time it was one of the few free simulators available. Now there are many more; just google “wan simulator”.
Based on my experience looking at performance and application failure issues, I believe that this kind of testing will more than pay for itself through fewer production outages caused by bugs and increased performance of both the application and its users.