IP traffic monitoring and forecasting¶

This section describes the capabilities of the ENP tool for managing IP traffic monitoring traces, estimating end-to-end traffic matrices from link monitoring traces, and building long-term traffic forecasts that can feed network capacity upgrade or redesign processes.

ENP functionalities described in this section are:

Add, import, and export traffic monitoring information associated with some or all of the IP links, MPLS-TE tunnels, IP demands, and/or IP multicast flows. Monitoring information is composed of traces of traffic monitoring samples of the form (time, traffic amount).
Create forecasts of the future traffic of IP links or demands, computed manually or via built-in proprietary Machine-Learning (ML) algorithms that process the monitored information. These forecasts predict the future traffic state of the network, and can be used to assess when the different parts of the network need to be upgraded to cope with the traffic growth.
Graphically visualize and analyze the traffic growth patterns and forecasts.

These functionalities are accessible via right-click options under the 'Monitor/forecast...' menu in the different tables.

Adding and manipulating monitoring traces¶

ENP allows the user to associate monitoring traces with any of the following elements:

IP links. Each sample in the trace stores the aggregated incoming or outgoing traffic (in Gbps) on the link at a given time.
IP injection links. Each sample in the trace stores the aggregated incoming or outgoing traffic (in Gbps) on the link at a given time.
MPLS-TE tunnels. Each sample in the trace stores the injected traffic (in Gbps) in the MPLS-TE tunnel at a given time.
IP demands. Each sample in the trace stores the injected traffic (in Gbps) in the IP demand at a given time.
IP multicast flows. Each sample in the trace stores the injected traffic (in Gbps) by the source of the multicast flow at a given time.

Monitoring information may be introduced in the tool in four forms:

Creating it synthetically with the ENP tool, using different realistic models. The menu option 'Add synthetic monitoring trace to selected elements' in the different tables can be used for this.
Manually uploading the CSV files with the monitoring samples to the user-defined monitoring samples folder. See the section 'CSV-file storage of the monitoring information at the server' for more information.
Adding monitoring samples manually, one by one, using the menu option 'Add one monitoring traffic sample for selected elements'.
Adding monitoring samples from the current traffic in the network design, using the option 'Create monitoring samples from current traffic'.

Monitoring information can be removed using different right-click options under the 'Remove...' submenu.

Note

The links above point to the right-click options in the IP demands table. Similar options exist in the IP link, injection link, tunnel, and multicast flow tables.

Percentile-filtering of the monitoring traffic¶

Monitoring information in production networks typically consists of one monitoring sample per short sampling period (e.g. every 5 minutes). Each sample averages the traffic in the element over the last sampling period or, in some cases, over a fraction of it (e.g. the last minute).

A five-minute sampling trace of the traffic in an element (e.g. an IP link) gives fine-grained information on how traffic varies throughout the day, which may be of interest for some purposes (e.g. traffic anomaly detection).

However, such fine-grained information is not that important for the long-term capacity planning of the network. Instead, capacity planning processes are based on the so-called busy hour traffic, i.e. the traffic carried by the element during the hour of the day in which the highest traffic is carried. Then, the network should be dimensioned so that all the traffic is carried during the busy hour.

When we want to dimension the network so it is able to cope with the future traffic demands, we should be able to estimate the future traffic in the busy hour.

Percentile traffic filtering is used for this purpose. In ENP, this filtering is available via the right-click option 'Percentile-filtering of monitoring samples'. The filtering is characterized by two user-defined factors:

Time interval between produced samples. This is the time between two consecutive samples produced by the filtering process, typically one sample per day. This interval is longer than the sampling interval of the input samples to be processed (e.g. one sample every 5 minutes).
Percentile. A percentage P between 0% and 100%, typically 95% or 99%. For each time interval, all the input samples in it are taken (e.g. one every 5 minutes), and one output sample is produced with a traffic T such that P% of the input traffic samples are below or equal to T.

For example, when a 99% percentile filtering is applied producing one sample per day, the information on how the traffic varies throughout the day is lost. We keep only a value representative of the busy hour traffic: the traffic in the element exceeds it roughly at most 1% of the time during that day. This information, stored for multiple consecutive days, can help us predict what the busy hour traffic will be in the future (e.g. one year from today), and plan our network upgrades accordingly.
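The filtering described above can be illustrated with a short Python sketch. It is not ENP code: the function name `percentile_filter`, the fixed one-day output interval, and the nearest-rank percentile rule are assumptions made for illustration, and ENP may compute the percentile differently.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def percentile_filter(samples, percentile=99.0):
    """Reduce (time, traffic_gbps) samples to one sample per calendar day.

    For each day, the output traffic T is the smallest input value such that
    at least `percentile` % of that day's samples are <= T (nearest-rank rule).
    """
    by_day = defaultdict(list)
    for when, gbps in samples:
        by_day[when.date()].append(gbps)

    filtered = []
    for day in sorted(by_day):
        values = sorted(by_day[day])
        # Nearest-rank percentile: 1-based rank = ceil(P/100 * N), at least 1.
        rank = max(1, math.ceil(percentile / 100.0 * len(values)))
        filtered.append((day, values[rank - 1]))
    return filtered


# One day of synthetic 5-minute samples: 288 values rising from 1.00 Gbps.
start = datetime(2024, 1, 1)
trace = [(start + timedelta(minutes=5 * i), 1.0 + 0.01 * i) for i in range(288)]
daily = percentile_filter(trace, percentile=99.0)
```

With 288 samples per day and P = 99%, the output sample is the 286th smallest value of the day, so only the two highest samples lie above it.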

CSV-file storage of the monitoring information at the server¶

In ENP, the user can define the so-called monitoring samples folder for a network design. This folder should be located at the ENP server, and is used by ENP to automatically import and export the monitoring information of the different elements from and to CSV files.

CSV files are the persistent form in which the monitoring information is automatically saved and read:

Changes in the monitoring information produced by user actions are automatically written to the CSV files.
Manual changes made to the CSV files are automatically reflected in the ENP tool.

One CSV file exists for each of the following elements:

IP interfaces - transmitted traffic, with file name 'tx_#ID.csv', where #ID is the unique identifier of the IP interface with the monitoring samples.
IP interfaces - received traffic, with file name 'rx_#ID.csv', where #ID is the unique identifier of the IP interface with the monitoring samples.
IP injection links, with file name 'tx_#ID.csv', where #ID is the unique identifier of the element with the monitoring samples.
MPLS-TE tunnels, with file name 'tx_#ID.csv', where #ID is the unique identifier of the element with the monitoring samples.
IP demands, with file name 'tx_#ID.csv', where #ID is the unique identifier of the element with the monitoring samples.
IP multicast flows, with file name 'tx_#ID.csv', where #ID is the unique identifier of the element with the monitoring samples.

The CSV files have a simple, human-readable structure: one row per monitoring sample and two columns, one with the date in the format 'yyyy-MM-dd HH:mm:ss z' and the other with the monitored traffic in Gbps.
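As an illustration of this two-column layout, the following Python sketch parses such rows. Several details are assumptions not stated in this documentation: a comma as the column delimiter, no header row, and a time zone field that is literally 'UTC' or 'GMT' (other zone names accepted by the 'z' pattern are rejected by this sketch). The function name `parse_monitoring_csv` is hypothetical.

```python
import csv
import io
from datetime import datetime, timezone


def parse_monitoring_csv(text):
    """Parse rows of the form 'yyyy-MM-dd HH:mm:ss z,<traffic in Gbps>'.

    Assumptions (not confirmed by the ENP documentation): comma delimiter,
    no header row, and a zone field that is literally 'UTC' or 'GMT'.
    """
    samples = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        stamp, gbps = row[0].strip(), float(row[1])
        # Split the trailing zone token from the date and time part.
        date_part, zone = stamp.rsplit(" ", 1)
        if zone not in ("UTC", "GMT"):
            raise ValueError(f"unsupported zone in this sketch: {zone}")
        when = datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S")
        samples.append((when.replace(tzinfo=timezone.utc), gbps))
    return samples


example = "2024-01-01 00:00:00 UTC,1.25\n2024-01-01 00:05:00 UTC,1.40\n"
parsed = parse_monitoring_csv(example)
```

A script of this kind could be a starting point for checking files before copying them into the monitoring samples folder, once the actual delimiter and zone conventions of a given installation are confirmed.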

The user can set any folder as the monitoring samples folder, via the appropriate right-click options in the tables.

Naturally, the user can also modify or process the CSV files offline in any way. As mentioned above, these changes are automatically reflected in the tool. This opens the door to building customized systems where ENP automatically updates its monitoring information from the Performance Management system of the production network.

The functionalities described in this section are accessible in the menu 'Monit samples & CSV files'.

IP traffic forecasting¶

Traffic forecasting permits the user to generate predictions of the future evolution of the traffic in different network elements, according to the monitored information present in those same elements. This is applicable to IP demands, IP injection links, multicast flows, IP links, and MPLS-TE tunnels.

These options are accessible via the right-click submenu 'Create long-term traffic predictor for selected elements traffic'.

Create a traffic predictor¶

With this option, the user can create a traffic predictor function for each selected element (IP demands, multicast flows, IP links, and MPLS-TE tunnels). Traffic predictors can be computed in several forms:

Automatically estimated from the monitored information present in those elements, using a Machine Learning (ML) process. In this case, the user is asked for the initial and end dates of the monitoring samples to consider as input. Then, an internal ML process estimates the growth pattern and its parameters for each element, and creates the corresponding traffic predictor functions.
Manually created as a linear growth of user-defined per-year traffic increment.
Manually created as an exponential growth, with a user-defined Compound Annual Growth Rate (CAGR).
Manually created as a constant traffic, equal to the current traffic in the network element.
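The three manual options correspond to simple closed-form growth functions. The Python sketch below is only an illustration under assumed names (`linear_predictor`, `exponential_predictor`, `constant_predictor`) and an assumed time unit of years from today; it does not reflect how ENP stores predictors internally.

```python
def linear_predictor(current_gbps, increment_gbps_per_year):
    """Traffic grows by a fixed amount (in Gbps) every year."""
    return lambda years: current_gbps + increment_gbps_per_year * years


def exponential_predictor(current_gbps, cagr):
    """Traffic grows by a fixed fraction every year (cagr=0.25 means 25%)."""
    return lambda years: current_gbps * (1.0 + cagr) ** years


def constant_predictor(current_gbps):
    """Traffic stays equal to the current traffic."""
    return lambda years: current_gbps


# Hypothetical element carrying 10 Gbps today.
linear = linear_predictor(10.0, 2.0)
exponential = exponential_predictor(10.0, 0.25)
constant = constant_predictor(10.0)
```

For instance, with a 25% CAGR the 10 Gbps element is predicted to carry 10 x 1.25 x 1.25 = 15.625 Gbps in two years, while a linear growth of 2 Gbps per year gives 14 Gbps over the same period.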

The user can observe an estimate of the quality of the forecast produced, via the so-called fraction of the variance explained statistic, between 0 and 100%. A variance explained of 100% means a forecast that perfectly fits the data. In general, when enough data is available, any value above 80%-90% can be considered a reasonably good forecast.

For more information on how forecast information can be visualized, see the section 'Visualizing the monitor/forecast information'.
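A common definition of the fraction of variance explained is the coefficient of determination, one minus the ratio between the residual sum of squares and the total sum of squares. The sketch below assumes that definition; this documentation does not state the exact formula ENP uses, and the function name is hypothetical.

```python
def variance_explained(observed, predicted):
    """Return 1 - SS_residual / SS_total for paired observations/predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean = sum(observed) / len(observed)
    ss_total = sum((y - mean) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values are constant; statistic undefined")
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical busy-hour samples (Gbps) and the values a predictor gives
# for the same dates.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]
quality = variance_explained(observed, predicted)
```

In this example the residual sum of squares is 1.0 and the total sum of squares is 20.0, so the forecast explains 95% of the variance.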

Using forecast/monitoring to set the current network traffic¶

These options permit the user to set the current traffic of the IP demands and IP multicast flows to the values coming from:

The monitoring samples on a particular user-defined date.
The forecasted traffic for a particular date.

After the traffic is set, the traffic routing is simulated as usual, producing the traffic in the IP links and tunnels.

The functionalities described in this section are accessible in the menu 'Set current traffic from...'.

End-to-end traffic estimation from IP link counts¶

This option permits the user to generate estimations of end-to-end IP demands and multicast flows' injected traffic, from the monitored information present for IP links, MPLS-TE tunnels, and potentially partial information on IP demands and multicast flows.

This functionality is a customary and necessary step in the capacity planning of IP networks since (i) monitoring information is typically only present for IP links, but (ii) capacity planning requires an estimation of the traffic of each IP demand. Monitoring the traffic of each IP demand is typically not possible, or possible only for some selected flows, since such monitoring is computationally costly for the routers and/or requires specific equipment at high IP rates.

For this reason, several options have been researched and are used in production networks for deriving the IP demands' traffic from the IP link counts. This process is commonly referred to as IP traffic matrix derivation: in the simplest case, when at most one IP demand exists between each node pair, all the information on the IP demands' traffic can be represented by a matrix with as many rows and columns as the number of network nodes.

In ENP, the end-to-end IP traffic derivation options are accessible via the right-click menus under the 'IP end-to-end traffic matrix derivation' submenu. The user can check the information in this menu for specific details. Below are brief conceptual explanations of the two models offered in ENP for end-to-end traffic derivation: the gravity model, and a proprietary full-regression model.

Forecast demands traffic using the gravity model¶

This option permits the user to estimate the traffic in the IP demands, from the information of the traffic in the IP links, using a built-in variation of the so-called gravity model.

Gravity model-based IP demand forecast can be used when:

No IP multicast flows are defined in the network.
There is traffic information for all the IP links outgoing from or incoming to those network nodes that are end nodes of IP demands. That is, it is possible to obtain the total traffic leaving each such node and the total traffic entering it.

Applying the gravity model does not require knowing the routing policies applied in the network (i.e. how the IP traffic was routed).

The gravity model works by assuming that the traffic from node A to node B is directly proportional to the total traffic that A produces (to any destination), and also proportional to the total traffic that B receives (from any origin).
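The proportionality assumption can be made concrete with the plain textbook form of the gravity model, sketched below in Python. This is not the built-in ENP variation, whose details are not described here. The function name, the node names, and the choice of dividing by the total received traffic without any further renormalization are assumptions for illustration.

```python
def gravity_model(out_gbps, in_gbps):
    """Estimate demand (a, b) as out[a] * in[b] / total received traffic.

    out_gbps[a]: total traffic produced by node a (to any destination).
    in_gbps[b]:  total traffic received by node b (from any origin).
    Pairs with a == b are skipped, and no renormalization is applied, so
    per-node totals of the estimate only approximate the inputs.
    """
    total = sum(in_gbps.values())
    if total <= 0:
        raise ValueError("total received traffic must be positive")
    return {
        (a, b): out_gbps[a] * in_gbps[b] / total
        for a in out_gbps
        for b in in_gbps
        if a != b
    }


# Hypothetical three-node network, traffic in Gbps.
produced = {"A": 6.0, "B": 3.0, "C": 1.0}
received = {"A": 2.0, "B": 3.0, "C": 5.0}
demands = gravity_model(produced, received)
```

Here node A produces twice as much traffic as node B, so the estimated demand from A to C is twice the estimated demand from B to C.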

Forecast demands traffic using the ENP full-regression model¶

This option permits the user to estimate the traffic in the IP demands and in the IP multicast flows, making the most of the information available with respect to:

IP demand monitoring information, if any.
IP multicast flow monitoring information, if any.
IP link aggregated traffic information, if any.
MPLS-TE injected traffic information, if any.

Applying the full-regression model does require knowing the routing policies applied in the network (i.e. how the IP traffic was routed).

The method takes advantage of all the information available to increase the estimation accuracy. In general, the more information is available, the better the accuracy. Internally, the full-regression model searches for the most likely end-to-end traffic of the IP demands, that is, the one most consistent with the available monitored data.
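The ENP full-regression model is proprietary and its internals are not described here. The general idea of fitting demand traffic to monitored data through the known routing can still be illustrated with a generic least-squares sketch in Python with NumPy. Everything in it is an assumption for illustration: the three-node chain topology, the measured values, and the use of ordinary least squares.

```python
import numpy as np


def estimate_demands(routing, measured_gbps):
    """Least-squares fit of demand traffic x to measurements y = routing @ x.

    routing[i][j] is the fraction of demand j seen by measurement i
    (known from the routing). Returns the estimate and the matrix rank.
    """
    estimate, _, rank, _ = np.linalg.lstsq(routing, measured_gbps, rcond=None)
    return estimate, rank


# Hypothetical chain A -> B -> C with unknown demands [A->B, B->C, A->C].
routing = np.array([
    [1.0, 0.0, 1.0],  # link A->B carries demands A->B and A->C
    [0.0, 1.0, 1.0],  # link B->C carries demands B->C and A->C
    [0.0, 0.0, 1.0],  # direct monitoring of demand A->C
])
measured_gbps = np.array([4.0, 3.0, 1.0])

demand_estimate, rank = estimate_demands(routing, measured_gbps)
```

With only the two link counts, the three demands cannot be determined uniquely. Adding the third row, a direct measurement of one demand, makes the system full rank, which illustrates why more monitoring information improves the accuracy of the estimate.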

To know more¶

For those interested in knowing more about the theoretical and practical aspects of this topic, some initial reads are:

Paul Tune, Matthew Roughan, "Internet Traffic Matrices: A Primer", April 2013.
A. Nucci, K. Papagiannaki, "Design, measurement, and management of large-scale IP networks. Bridging the gap between theory and practice", Cambridge University Press 2009.

Visualizing the monitor/forecast information¶

The traffic monitoring, estimation, and forecasting information can be visualized in different forms:

Monitoring and forecast information in the tables. This is shown in the columns under the Monitor/forecast view in the different tables (IP demands, IP links, etc.). It includes the number of monitoring samples and their initial and end dates, the forecasted traffic and its relative mismatch against the current design traffic, the type of forecast method, and the variance explained with respect to the monitoring samples.
Capacity upgrade deadline information in the tables. For those IP logical ports for which a traffic prediction function exists, the tool provides information on the time when the estimated average traffic will reach a user-defined utilization limit. For this, see the columns under the Occupation forecast view in the IP logical links table.
Monitoring/forecast panel. The user can visually inspect, zoom, etc. the monitoring traces of the elements, together with the predicted traffic, in the Monitoring/forecast panel shown below the information tables.
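As a worked illustration of the capacity upgrade deadline mentioned above, the following Python sketch computes when an exponentially growing traffic reaches a utilization limit. The function name, the exponential (CAGR) growth assumption, and the example figures are hypothetical, and ENP may derive the deadline differently depending on the predictor type.

```python
import math


def years_until_limit(current_gbps, cagr, capacity_gbps, utilization_limit):
    """Years until traffic growing at `cagr` reaches limit * capacity.

    Solves current * (1 + cagr) ** t = capacity * utilization_limit for t.
    """
    if current_gbps <= 0 or cagr <= 0:
        raise ValueError("this sketch needs positive traffic and growth")
    target_gbps = capacity_gbps * utilization_limit
    if current_gbps >= target_gbps:
        return 0.0  # the limit is already reached
    return math.log(target_gbps / current_gbps) / math.log(1.0 + cagr)


# Hypothetical 100 Gbps link carrying 40 Gbps, growing 25% per year,
# with an 80% utilization limit.
deadline_years = years_until_limit(40.0, 0.25, 100.0, 0.8)
```

In this example the traffic has to double, from 40 to 80 Gbps, which at 25% yearly growth takes a little over three years.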