Is network capacity planning an exact science?
Author: Philippe Bigorgne – b.lo Project Manager
Why is it sometimes impossible to make some mobile voice calls? How can you adapt the bandwidth of your main IP links? How can you measure the impact of your new application on your network infrastructure? And what happens when new employees come on board?
In each case, it’s all about capacity planning. Have sufficient resources been allocated for your traffic, and will they be sufficient tomorrow? Projecting into the future, sizing up networks according to traffic and usage evolution requires good capacity planning. It also needs to be done at the right time and at the right cost.
The stakes are high. If your network infrastructure is undersized, your end-users will have a poor experience and at worst, they won’t be able to access your service. If your infrastructure is oversized, unnecessary investments will have been made.
The same challenge is faced by all companies, be they enterprises managing a network with just a few dozen sites, or world-wide operators with thousands of CPEs.
Which criteria should be used to size a network?
A network is sized so that, even when pushed to its limit, it still meets the end-user experience requirements defined in the Service Level Agreement (SLA).
- The “Busy Hour”. A network is sized to carry traffic when demand is highest. This period is generally called the “Busy Hour”. For example:
  - On an enterprise network, demand on the data network is generally highest during the first few hours of the morning, when employees start work.
  - On a mobile network, the busiest period for voice calls is frequently Tuesdays and Thursdays between 6pm and 8pm.
  - On Video on Demand streaming servers, loads are usually highest on Sundays at 9pm.
  - Internet interfaces towards Google Global Cache typically need to be sized for the load seen at roughly 7pm on Sundays.
- The quality expected for a given application group. Networks carry applications with varying quality constraints. Network sizing therefore depends on the quality of service requirement defined in the SLA. For example:
  - A mobile operator needs to carry 500,000 simultaneous voice calls at peak period, with a call failure rate under 0.5%.
  - On an enterprise network, it is preferable to have 100% access to business applications during the day, whereas load peaks of 95% are acceptable at night, when software is updated; these peaks do not impact users.
  - For voice over IP, jitter above 15 ms can degrade call quality.
  - A connection time of more than 2 seconds to a web page can lead to user frustration.
The “Busy Hour” is therefore the baseline for the traffic measurement period. The SLA sets out the criteria for sizing the link or equipment at this “Busy Hour”. To guarantee the desired level of service, we need to define the necessary resources to deliver traffic during this period.
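As a rough illustration of this baseline, the sketch below finds a Busy Hour from hourly load samples and derives the capacity needed to stay under a utilization ceiling. It assumes one simple definition of the Busy Hour (the hour of day with the highest average load); the function names, the sample figures and the 80% ceiling are illustrative assumptions, not values from any particular tool or SLA.

```python
from collections import defaultdict


def busy_hour(samples):
    """Return (hour_of_day, mean_load) for the hour with the highest mean load.

    samples: iterable of (hour_of_day, load_mbps) pairs, typically one
    measurement per hour over several days.
    """
    loads_by_hour = defaultdict(list)
    for hour, load in samples:
        loads_by_hour[hour].append(load)
    means = {hour: sum(v) / len(v) for hour, v in loads_by_hour.items()}
    peak_hour = max(means, key=means.get)
    return peak_hour, means[peak_hour]


def required_capacity(busy_hour_load, max_utilization):
    """Capacity needed so that the Busy Hour load stays under max_utilization."""
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    return busy_hour_load / max_utilization


# Hypothetical measurements over two days: (hour of day, load in Mbps).
samples = [(8, 400), (9, 620), (10, 580), (8, 440), (9, 660), (10, 560)]
hour, load = busy_hour(samples)
# Assumed SLA-derived rule: the link should not exceed 80% load at Busy Hour.
capacity = required_capacity(load, 0.8)
print(f"Busy Hour: {hour}h, mean load {load:.0f} Mbps, capacity needed {capacity:.0f} Mbps")
```

Other definitions are possible (for example, the single busiest hour of the week, or a high percentile instead of the mean); the choice should follow what the SLA actually commits to.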
Good capacity planning is a compromise between cost and quality: an undersized network leads to user frustration, while too high a standard of quality is expensive.
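For voice traffic, this compromise can be illustrated with the Erlang B formula, a classic teletraffic model that the article itself does not mention and that is used here only as an example. The sketch below computes the blocking probability for a given offered load and searches for the smallest number of channels that meets a target failure rate. The 100-erlang load and the two targets are assumptions chosen for illustration.

```python
def erlang_b(offered_erlangs, channels):
    """Blocking probability (Erlang B) for an offered load and a channel count."""
    blocking = 1.0  # with zero channels, every call is blocked
    for n in range(1, channels + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def channels_needed(offered_erlangs, max_blocking):
    """Smallest number of channels whose blocking is at or below max_blocking."""
    if offered_erlangs <= 0:
        raise ValueError("offered_erlangs must be positive")
    if not 0 < max_blocking < 1:
        raise ValueError("max_blocking must be in (0, 1)")
    channels = 0
    blocking = 1.0
    while blocking > max_blocking:
        channels += 1
        blocking = offered_erlangs * blocking / (channels + offered_erlangs * blocking)
    return channels


# Assumed offered load of 100 erlangs, sized for two different quality targets.
relaxed = channels_needed(100.0, 0.05)   # 5% of calls may fail
strict = channels_needed(100.0, 0.005)   # 0.5% of calls may fail
print(f"5% target: {relaxed} channels, 0.5% target: {strict} channels")
```

The stricter target always requires more channels for the same load, which is the cost side of a higher quality standard.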
First of all, look at what you have
Capacity planning entails knowing the applications and services carried by the network and the way in which they are used. Before looking to the future, the network engineer must develop an in-depth knowledge of the nature of these applications and services and identify the stress points.
- Know your end-to-end architecture: This means sizing every element, leaving no grey areas (points of contention) within your architecture. It would be a pity to discover that data throughput is heavily limited across a 3G/4G mobile coverage zone because, for example, all the mobile base stations were correctly sized but an IP flow concentrator was forgotten. Or, alternatively, that the number of video servers was correctly sized but not the transmission links leading to the core of the network.
- Map your network applications: Define groups of the same nature, for example video streaming, IP voice or file downloads. Fine granularity is sometimes necessary. For example, Microsoft 365 and Facebook are both web applications, but they have a different impact on enterprise productivity.
- Determine when they are used: On an operator network, voice calls decrease after 8pm, but video streaming increases strongly and peaks around 10pm. It is sometimes acceptable to have an undersized network during certain periods because there are no users (for example, during enterprise network backups) or because the applications have a higher “tolerance” of network saturation. Knowing when the main applications are used is therefore important.
- Separate out flows: It is often preferable to associate specific resources with application categories requiring similar SLAs. This makes tracking and forecasting easier. An enterprise network will use Class of Service. On the access network (the connection to the client), the operator often chooses to separate voice and data applications by giving them dedicated elements and specific links.
- Make the SLA quantifiable and measurable: To define customer experience, it is necessary to be able to measure it. Transactional applications, with numerous exchanges between user and server, require rapid response times, whereas voice and high-quality video applications require low jitter.
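A minimal sketch of what "quantifiable and measurable" can look like in practice is given below. It uses a deliberately simplified jitter definition (the mean absolute difference between consecutive packet delays) and counts the share of responses slower than a threshold. The default thresholds of 15 ms and 2 seconds echo the examples given earlier in this article; the function names and the measurement values are hypothetical.

```python
def mean_jitter_ms(delays_ms):
    """Mean absolute variation between consecutive packet delays, in ms.

    This is a simplified definition chosen for illustration.
    """
    if len(delays_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs)


def check_sla(delays_ms, response_times_s, max_jitter_ms=15.0, max_response_s=2.0):
    """Compare measurements with the SLA thresholds and return a small report."""
    jitter = mean_jitter_ms(delays_ms)
    slow = sum(1 for t in response_times_s if t > max_response_s)
    slow_share = slow / len(response_times_s) if response_times_s else 0.0
    return {
        "jitter_ms": jitter,
        "jitter_ok": jitter <= max_jitter_ms,
        "slow_share": slow_share,
    }


# Hypothetical measurements: packet delays (ms) and page response times (s).
report = check_sla([40, 52, 45, 70], [0.8, 1.5, 2.4, 1.1])
print(report)
```

Whatever the exact definitions chosen, the point is that each SLA criterion maps to a number that can be computed from collected data and tracked over time.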
Next step: Forecast the future
A network with no increase in traffic, no new users, no new applications and no upgrades would simply keep running as it is. Obviously, this is not what really happens. The network needs to adapt to new needs or sporadic events: virtualization of office software (Windows 365, Gmail,…), the New Year peak for an operator (SMS or Snapchat traffic 5 times higher than on a “normal” day), the emergence of new applications (Instant Messaging, video streaming services) and the rollout of the Windows 10 Anniversary Update…
- Concentrate on the applications or network elements which have a real impact: Any traffic forecast made several months in advance carries a margin of error. It is therefore not necessary to take into account application groups with low traffic or little expected growth. For example, an operator will size the core network mainly according to its forecast of video streaming flows, which are the main driver of traffic growth. An enterprise will tend to “oversize” its access network towards the Data Center core so that it can absorb sporadic events.
- Define simple and realistic criteria for traffic projection: The network engineer must build up simple references in order to be able to check hypotheses quickly and regularly. For example: the average traffic per user for a given application, or the standard distribution of application groups for a certain type of user.
- Identify known events: List the events already planned that will have an impact on the network, for example new users, new sites or new applications.
- Evaluate trends: In this case, there are no particular events, but a general upward or downward trend in certain uses. Trend figures may come from specialized analyst firms, e.g. Gartner predicts a 60% increase in data traffic on mobile networks over the next 3 years due to mobile video. This is why LivingObjects platforms provide several projection models which help project traffic for an application group, or more generally the load of a link or an equipment CPU, over the next 6, 9 or 12 months.
- Apply this projection to the network elements: Lastly, we need to establish an action list: installations, equipment changes, hardware and software upgrades, licence purchases, and so on. This allocation of resources must be done according to the SLAs. An operator can be less demanding about the quality of 4G data connections over the weekend if, during the week, the SLAs are met for enterprise clients. An enterprise can tolerate a 100% link load during night-time backups, knowing that there are no other users on the site.
- Initiate and follow action plans: Given the uncertainty of the input parameters (global traffic projection), the objective is not to start all these actions at the same time, but to start each one at the right time according to how long it will take to complete. The time between initiation and completion is therefore an important criterion in capacity planning. A long-cycle action, requiring a lead time of several months (infrastructure works, administrative delays), should be started as early as possible, even if this leads to some oversizing of the network. Medium or short-term actions can wait to see how forecasts work out, especially if the costs are high. In these cases, it is important that the QoS tracking tool can easily provide the sizing overview on a very regular basis, if not daily.
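The forecasting steps above can be sketched with a very simple model: a straight-line trend fitted to monthly Busy Hour loads, projected forward until it crosses a utilization ceiling, then compared with the lead time of the upgrade. This is only one possible projection model and is not presented as the one used by any particular platform; the history, the link capacity, the 75% ceiling and the 6-month lead time are all illustrative assumptions.

```python
def linear_trend(values):
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_threshold(values, capacity, max_utilization, horizon=36):
    """Months after the last observation until the projected load first
    exceeds capacity * max_utilization; None if not reached within horizon."""
    slope, intercept = linear_trend(values)
    limit = capacity * max_utilization
    last = len(values) - 1
    for ahead in range(horizon + 1):
        if slope * (last + ahead) + intercept > limit:
            return ahead
    return None


def months_before_start(months_to_limit, lead_time_months):
    """How long an action can wait before it must start (0 means start now)."""
    return max(0, months_to_limit - lead_time_months)


# Hypothetical monthly Busy Hour loads (Mbps) on a 1000 Mbps link.
history = [500, 520, 540, 560, 580, 600]
to_limit = months_until_threshold(history, capacity=1000, max_utilization=0.75)
wait = months_before_start(to_limit, lead_time_months=6)
print(f"Ceiling reached in {to_limit} months; upgrade can wait {wait} months")
```

Known events such as new sites or new users would be added on top of the trend as step changes, and the projection should be re-run regularly as new measurements arrive, since short-lead-time actions can be deferred or advanced accordingly.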
The need for a flexible management platform able to easily deal with analytics
Without the appropriate tools, the whole capacity planning exercise can be painful and time-consuming. Excel is often the network engineer’s tool of choice for capacity planning. Existing management tools rarely have the flexibility needed to define and visualize KPIs, or to calculate the “Busy Hour” or trends. Furthermore, network technologies are frequently managed with many different QoS tracking tools; however, there is one solution that can provide a homogeneous, end-to-end view of the network.
LivingObjects eye.lo and b.lo platforms have been developed with this in mind. They collect network data through different protocols. This data is then made available to engineers in an understandable form that they can manipulate. Thanks to a wide range of analytics tools, engineers are able to build Key Performance Indicators, group services, group equipment, define “Busy Hour” periods on the basis of one or several KPIs, and forecast.
Understanding the current state of your network becomes simple and intuitive. Forecasting the future is more precise and capacity planning can be done on a daily basis.
Capacity planning therefore does involve areas of uncertainty, but methodical work, a good knowledge of the enterprise’s network and an appropriate monitoring platform ensure investment at the right time to meet users’ needs.