Best Practices for Network Simulations

Network simulation offers an efficient, cost-effective way to analyze application and network performance, identify potential problems, understand the root cause, and assess alternative mitigation strategies. For best results, the analysis should leverage an at-scale, accurate model of the network and the operating environment. In other words, the model should represent, in adequate detail, the behavior of the network elements (protocols at different layers, mobility patterns, etc.), the communication environment (terrain, RF characteristics, etc.), and the applications running on the network. The simulation should also generate a comprehensive set of statistics that can be analyzed to answer the question(s) of interest.

Some of the questions that an analyst might consider when building a network model include: Which attributes of the target network need to be modeled? What level of detail is necessary to include in the simulation? How does one determine that the model accurately represents the real network with respect to the objectives of the simulation? While the answers to these questions vary from case to case, this article provides widely applicable guidelines for building and using network simulation models across a diverse set of use cases.

1.     Identify the objective

What questions is the simulation expected to help answer? For example, for a tactical network, the planner may want to determine whether a call-for-fire message from a warfighter to the tactical operations center (TOC) will be delivered within a tightly bounded interval under different operating conditions. For a Wi-Fi network, the planner may be interested in determining the number and layout of access points and servers needed to handle the expected volume of video calls during peak operating hours.

A clear identification of the objective at the outset will help to create an appropriate network model (that models the relevant components at the appropriate level of fidelity) and to accurately measure the statistics and metrics most likely to be useful to meet the objective of the simulation.

2.     Identify relevant components and level of detail to model

Based on the objective, identify which components need to be modeled and which can be omitted from the model. For example, if the goal is to assess the end-to-end latency among ships communicating over a satellite link, the ‘internal’ network within each ship can be ignored in the simulation model.

The operational environment can also dictate the choice of models to use in the simulation. For instance, in the case of communications over satellites, if the ground stations are located far enough apart or use different frequencies so that the communication channels do not interfere with each other, then the Abstract Satellite model, which is simpler to configure and runs faster, is sufficient. A more detailed satellite model (such as the Aloha Satellite Model with Reed-Solomon/Viterbi Support, which takes coding and modulation overhead and channel interference into account) is not needed.

3.     Use an iterative approach for model development

The simulation model of a physical network has many components and, depending on the network, can be complex. Instead of creating the highest fidelity model representing the entire network in a single step, build the model in iterative steps, adding details in each step.

    • Start with a simple model of the network. The initial model can be of a smaller size (e.g., a single subnet or access point with a small number of mobile units) and omit advanced features (e.g., handoffs).
    • Add some simple traffic and test that the model is working correctly, for example, by verifying the end-to-end throughput or visualizing the hop-by-hop paths followed by packets (see 4 below).
    • Refine the model in one or more of the following ways (these refinements can be applied in any order or combination):

        o   Increase the fidelity of the model by adding more details (see 5 below).
        o   Increase the size of the network (see 6 below).
        o   Add more realistic application traffic (see 7 below).
        o   If needed, add mobility and refine the environment model by adding terrain characteristics (see 8 below).

    • Test the correctness of the refined model using both end-to-end statistics (e.g., throughput and latency) and specific metrics focused on the refinements added in this step (see 4 below).
    • Repeat the process of refinement and testing until a complete model of the network at the desired level of fidelity has been created.

Note: Save the model before each refinement so that it can be rolled back to that state if needed.

4.     Use visualization and statistics for testing

Use visualization while the simulation is running, together with statistics, to test the correctness of the simulation model after each step in the iterative process. Statistics and visualization can also be used to troubleshoot the simulation model. For example, if messages are not being delivered, use visualization to identify where the packets are being dropped, then examine the statistics to determine whether they are being dropped because there are no routes to the destination, because of insufficient transmit power, or because of some other network configuration issue. For wireless scenarios, visualizing the signal coverage of transmitters using heat maps can be very helpful in identifying locations where packets may be dropped due to inadequate radio coverage or high interference from different transmitters. The PHY Events, Network Events, Application Events, and Network Connectivity statistics database tables can be particularly useful in debugging the scenario.

If the results are not as expected, visualization and statistics can help identify which of the following is the underlying cause:

    • The simulation or network configuration may be incorrect. For instance, the transmission power of a radio may be too low for signals to be received or the routing protocol may be incorrectly configured resulting in packets not being delivered.
    • The model may be representing the correct network operation, but the behavior is non-intuitive. In this case, the simulation provides valuable insight into the network operations.

Statistics can also help gain confidence in the simulation model. Identify ‘verification’ metrics, beyond the main objectives of the simulation, and verify that they follow the expected trends (or that any deviations can be explained). For example, if the objective is to study the impact of radio characteristics on end-to-end packet delay, examine the message completion rate statistics to verify that the delay is not artificially low because a large number of messages are being dropped.
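
The delay example above can be sketched in a few lines of Python. This is only an illustration: the record layout (a send time paired with a receive time, or None for a dropped message) and the 0.9 completion threshold are assumptions made here, not the output format or a recommended setting of any particular simulator.

```python
from statistics import mean

def summarize(records, min_completion=0.9):
    """records: list of (send_time_s, recv_time_s or None) tuples (assumed layout).

    Returns (completion_rate, mean_delay_s, trustworthy).
    """
    if not records:
        raise ValueError("no records to summarize")
    # Delay can only be measured for messages that were actually delivered.
    delays = [rx - tx for tx, rx in records if rx is not None]
    completion = len(delays) / len(records)
    mean_delay = mean(delays) if delays else None
    # A low completion rate means the mean delay describes only the survivors.
    trustworthy = completion >= min_completion
    return completion, mean_delay, trustworthy

# Hypothetical records: two messages delivered, two dropped.
records = [(0.0, 0.02), (1.0, 1.04), (2.0, None), (3.0, None)]
completion, mean_delay, ok = summarize(records)
print(f"completion={completion:.2f} mean_delay={mean_delay:.3f}s trustworthy={ok}")
```

Here the mean delay looks low, but half of the messages were dropped, so the delay figure should not be reported without the completion rate next to it.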

5.     Increase model fidelity in stages

It is often useful to add functionality to the model one layer at a time, starting from the higher layers. For example, to test the routing protocol configuration in a wireless scenario, use simplified lower-layer models that guarantee perfect or near-perfect connectivity by doing the following:

    • Place the nodes close to each other or use a higher transmission power to ensure that signals are received.
    • Use simpler propagation, MAC layer and PHY layer models (e.g., free-space propagation, TDMA MAC and Abstract PHY).
    • Use default values for protocol parameters.
    • Disable mobility and terrain.
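
When free-space propagation is used, a quick link-budget estimate can indicate whether a given node spacing and transmission power should give reliable reception. The sketch below uses the standard free-space path loss formula; the 2.4 GHz frequency, 20 dBm transmit power, and -85 dBm receiver sensitivity are hypothetical values chosen only for illustration.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0

def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S)

def rx_power_dbm(tx_power_dbm, distance_m, freq_hz, tx_gain_dbi=0.0, rx_gain_dbi=0.0):
    """Received power under free-space propagation, ignoring all other losses."""
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - fspl_db(distance_m, freq_hz)

# Hypothetical radio parameters (assumptions, not taken from any real device).
TX_POWER_DBM = 20.0
SENSITIVITY_DBM = -85.0
FREQ_HZ = 2.4e9

for distance_m in (100, 1_000, 10_000):
    power = rx_power_dbm(TX_POWER_DBM, distance_m, FREQ_HZ)
    status = "receivable" if power >= SENSITIVITY_DBM else "too weak"
    print(f"{distance_m:>6} m: {power:7.1f} dBm ({status})")
```

With these assumed values, moving nodes closer together or raising the transmission power are interchangeable ways to keep the received power above the sensitivity threshold.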

To test that the routing protocol will discover alternate routes when network conditions change, faults can be added at specific interfaces and times. This provides a predictable way to test the routing protocol.

Once upper layer functionality has been verified, more complex behavior can be introduced at the lower layers with more accurate models or by adjusting protocol parameters, until the model is a more accurate representation of the real network.

6.     Increase the scale of the model in stages

Start with a single, small subnet and test it for intra-subnet functionality, then increase its size in successive steps. Once the functionality of each subnet has been tested, connect different subnets and test that they work together correctly. For heterogeneous networks, in particular, test each type of subnet (wired or wireless) separately before connecting different types of subnets together.

7.     Refine applications

Start with a simple traffic model (such as the Constant Bit Rate (CBR) traffic generator) to test the network model. Network behavior under a simple traffic model is more predictable, making it easier to analyze and debug the simulation model. Once the other components of the network model have been refined and tested, replace the simple traffic model with one which more accurately represents the traffic in the real network.
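
Part of what makes CBR traffic easy to reason about is that its send times and offered load can be computed by hand and compared with what the simulation reports. The sketch below illustrates this; the function names and the packet size and interval values are hypothetical and do not correspond to any simulator's configuration parameters.

```python
def cbr_schedule(start_s, end_s, interval_s, packet_bytes):
    """Yield (send_time_s, packet_bytes) for a constant-bit-rate source.

    Packets are sent every interval_s seconds from start_s up to, but not
    including, end_s.
    """
    n = 0
    while True:
        send_time = start_s + n * interval_s
        if send_time >= end_s:
            break
        yield send_time, packet_bytes
        n += 1

def offered_load_bps(interval_s, packet_bytes):
    """Application-layer offered load in bits per second."""
    return packet_bytes * 8 / interval_s

# Hypothetical flow: 512-byte packets every 0.25 s for 2 s.
packets = list(cbr_schedule(0.0, 2.0, 0.25, 512))
print(f"packets sent: {len(packets)}")
print(f"offered load: {offered_load_bps(0.25, 512):.0f} bit/s")
```

If the throughput measured at the receiver falls well below the offered load, the difference points to drops or congestion in the network model rather than to variability in the traffic source.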

If applicable, use tools such as the PCAP Importer and Netflow Importer to create more accurate models of the application traffic and use these traffic models for the final analyses.

8.     Add terrain and mobility

To simplify testing and troubleshooting, terrain or node mobility may be omitted in the earlier stages of model development. Add terrain and mobility characteristics to the model only after the other components of the network have been refined and tested.

If mobility data from the target network is available, then import it into the model and use it for the final analyses.

9.     Use multiple runs for final analysis

Once an at-scale, detailed model of the network has been created:

    • Run simulations with different random number seeds.
    • Discard the outliers.
    • Analyze the statistics collected during the simulation and generate reports to answer the questions of interest (objectives of the simulation).
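
The steps above can be sketched as follows. The per-seed results are made-up numbers, and the interquartile-range (IQR) rule used to identify outliers is an assumption of this sketch; the appropriate outlier criterion depends on the metric and the study.

```python
from statistics import mean, stdev, quantiles

def split_outliers(results, k=1.5):
    """Split {seed: metric} into (kept, dropped) using the IQR rule.

    The IQR rule and k=1.5 are assumed choices, not prescribed ones.
    """
    q1, _, q3 = quantiles(list(results.values()), n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    kept = {seed: v for seed, v in results.items() if low <= v <= high}
    dropped = {seed: v for seed, v in results.items() if seed not in kept}
    return kept, dropped

# Hypothetical mean end-to-end delay (ms) from eight runs, keyed by random seed.
delay_ms = {1: 41.2, 2: 39.8, 3: 40.5, 4: 42.1, 5: 40.9, 6: 95.0, 7: 41.6, 8: 40.1}

kept, dropped = split_outliers(delay_ms)
values = list(kept.values())
print("discarded seeds:", sorted(dropped))
print(f"mean={mean(values):.2f} ms, stdev={stdev(values):.2f} ms over {len(values)} runs")
```

Recording which seeds were discarded keeps the analysis reproducible and allows those runs to be examined separately.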

10.  Miscellaneous considerations

    • The network behavior in the transient phase (i.e., while routes are converging) may be very different from the behavior in steady state. For analyses of steady state behavior, start the applications after the routes have had time to converge (unless the network behavior during the transient phase is of interest).
    • The starting time of applications should be staggered so that they do not all start at the same time and cause collisions.
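
Both considerations can be combined when generating application start times. In the sketch below, the 30-second convergence allowance, the 5-second start window, and the function name are hypothetical; the actual convergence time depends on the routing protocol and the network.

```python
import random

def staggered_starts(num_apps, convergence_s, window_s, seed=1):
    """Return one start time per application, in increasing order.

    The window after convergence_s is divided into num_apps equal slots, and
    each application starts at a random offset within the first half of its
    slot, so no two applications start at the same instant.
    """
    rng = random.Random(seed)
    slot_s = window_s / num_apps
    return [
        convergence_s + i * slot_s + rng.uniform(0, slot_s / 2)
        for i in range(num_apps)
    ]

# Hypothetical values: 10 applications, 30 s for routes to converge, 5 s window.
starts = staggered_starts(num_apps=10, convergence_s=30.0, window_s=5.0)
for app_id, start_time in enumerate(starts):
    print(f"app {app_id}: start at {start_time:.3f} s")
```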
