Cisco IWAN (Intelligent WAN) is a feature bundle embedded in routers, aimed at improving the end-user experience of applications delivered over the Wide Area Network (WAN). Cisco IWAN reports application performance metrics, enables per-application policies for granular control of application bandwidth use (Application Visibility and Control, AVC), monitors network performance and selects the best path for each Class of Service (Performance Routing, PfR), and optimizes application traffic for faster response times and lower bandwidth consumption (Wide Area Application Services, WAAS).
NetFlow version 9 and IPFIX are the protocols of choice for Cisco IWAN to export information from the routers.
eye.lo is a multi-tenant software platform dedicated to Service Providers. Its distributed architecture, based on redundant nodes, ensures smooth scalability, optimal use of resources, fault tolerance and low-latency responses to concurrent requests over big data.
LivingObjects recommends a centralized collection to ease the onboarding of new customers. However, eye.lo can also be deployed in an isolated environment.
eye.lo is deployed in Service Providers' DCs on a single server or a distributed infrastructure. Above 3,000 routers, LivingObjects recommends a distributed architecture, which enables smooth scalability over time. I/O is optimized for random data access. Data storage needs to be implemented on physical machines with SSDs. The other components can be virtualized, but need a dedicated VM infrastructure so they do not compete with other software for I/O.
LivingObjects recommends the Debian Linux distribution with a kernel version greater than 3.16. For other distributions, contact LivingObjects at firstname.lastname@example.org. To access eye.lo, end-customers can use any recent web browser.
In the past, typical network traffic could easily be identified using well-known port numbers: HTTP, HTTPS, POP3 or IMAP were among the common traffic seen in enterprises. Today, an increasing number of applications, both business and recreational, are delivered over HTTP, and many applications use dynamic ports, such as Exchange, or voice and video delivered over RTP. This makes them impossible to identify by port number alone.
NBAR2 is Cisco's Deep Packet Inspection (DPI) technology, based on application signatures and Field Extraction (FE), used to retrieve fields such as the HTTP URL, SIP domain, mail server, and so on. Application information, such as SharePoint, Netflix or Google Docs, is provided by the NBAR2 signature dictionary, called the protocol pack. The protocol pack is updated several times a year to include new applications. Version 16.0 includes more than 1,500 signatures.
PfR is part of Cisco IWAN. PfR monitors network performance, routes applications based on application performance policies, and load-balances traffic based on link utilization levels to efficiently use all available WAN bandwidth. PfR comprises two major Cisco IOS components: a Master Controller (MC) and a Border Router (BR).
The Master Controller is a policy decision point at which policies are applied to various traffic classes that traverse the Border Router systems.
- Hub Master Controller: the master controller at the hub site, which is either a data center or a headquarters. This is the device where all policies are configured. It also acts as the master controller for that site and makes optimization decisions.
- Branch Master Controller: the master controller at the branch site. There is no policy configuration on this device; it receives its policy from the Hub MC. It acts as the master controller for that site and makes optimization decisions.
Border Routers (BRs) are in the data forwarding path. Border Routers collect data from their Performance Monitor cache and smart probe results, provide a degree of aggregation of this information, and influence the packet forwarding path as directed by the Master Controller to manage user traffic.
NetFlow provides the ability to collect IP network information as it enters or exits an interface. A Flow Record consists of keyed and non-keyed fields. Keyed fields are the field(s) whose combination must be unique for a new Flow Record cache entry to be created in the CPE memory. Non-keyed fields provide information such as metrics (byte count, packet count, latency or jitter). For every record definition, a cache table is created to track and store flow entries. A new cache entry is created when the keyed field(s) of a packet do not match an existing cache entry; otherwise, only the non-keyed fields are updated, e.g. the byte count is incremented.
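The keyed/non-keyed mechanism can be sketched in a few lines of Python (a minimal illustration, not the actual router implementation; field names and sizes are illustrative):

```python
# Minimal sketch of a NetFlow-style flow cache: keyed fields identify the
# cache entry, non-keyed fields (byte/packet counters) are accumulated.
flow_cache = {}

def account_packet(src_ip, dst_ip, src_port, dst_port, protocol, size_bytes):
    key = (src_ip, dst_ip, src_port, dst_port, protocol)  # keyed fields
    # Cache miss on the keyed fields -> new entry; hit -> update counters only.
    entry = flow_cache.setdefault(key, {"bytes": 0, "packets": 0})
    entry["bytes"] += size_bytes   # non-keyed fields
    entry["packets"] += 1

# Two packets of the same conversation end up in a single cache entry.
account_packet("10.0.0.1", "10.0.0.2", 40000, 443, 6, 1500)
account_packet("10.0.0.1", "10.0.0.2", 40000, 443, 6, 500)
```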
The key difference resides in how the information is accessed: SNMP requires collectors to request the information, whereas NetFlow collectors passively receive and process flows from all devices. In the first case (polling), devices need to store the data and make it available on request. With NetFlow, devices send the data once it has been processed. Thus, if devices embed the right processing engines (Deep Packet Inspection, passive probe, …), much more detail on traffic and performance can be obtained using NetFlow.
1. NetFlow v5 (IPv4-specific): NFv5 is the most commonly deployed version. The flows exported by the equipment provide 5-tuple keyed fields (source IP/port, destination IP/port and protocol) to describe the identities of the systems involved in the conversation and the amount of data transferred.
2. Flexible NetFlow (FNF, v9; IPv4/IPv6 compatible): Version 9 brought the FNF capability, which makes NetFlow a highly versatile protocol. Its flexibility makes it particularly relevant for complex reporting and heterogeneous data:
- flexible key-field aggregation
- variable number of data fields
- unidirectional or bidirectional flows
- sampled or unsampled
- multi-vendor (430 standardized fields, thousands of vendor-specific fields)
- exports aggregated and/or synchronized, or not
3. IPFIX ("IP Flow Information eXport"): also referred to as NFv10, IPFIX is the industry-standardized version of NetFlow. It builds on NFv9 for most features and brings additional flexibility (variable-length fields, sub-application extracted fields, options data…).
Note: NetFlow version 9 and IPFIX are the export protocols of choice for AVC, because they accommodate the flexible record formats and multiple records required by the Flexible NetFlow infrastructure. IPFIX is recommended.
5 minutes is the default recommended granularity for all counters.
When received, data is aggregated to minimize storage without deleting history. Available granularities are 5 minutes, 1 hour and 1 day. By default, data retention is 3 weeks at 5-minute granularity and 3 months at 1-hour granularity; daily data are never purged. Specific KPIs, such as the daily maximum 5-minute value, can however be stored for longer periods for capacity-planning needs.
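The roll-up from 5-minute samples to hourly values can be illustrated as follows (a simplified sketch using an average; the actual aggregation functions per KPI are defined by the platform):

```python
# Illustrative roll-up of 5-minute samples into hourly averages, mirroring
# the aggregate-without-deleting-history policy described above.
from statistics import mean

def rollup_hourly(samples):
    """samples: list of (epoch_seconds, value) pairs at 5-minute granularity."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(ts - ts % 3600, []).append(value)  # hour bucket
    return {hour: mean(values) for hour, values in buckets.items()}

# One hour of 5-minute points (12 samples) collapses into a single hourly value.
samples = [(i * 300, 10.0) for i in range(12)]
hourly = rollup_hourly(samples)
```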
CPEs send flows on a scheduled period. Let's take 5 minutes as the export period: for five minutes, the CPEs aggregate traffic; they then export the aggregated flows over the next five minutes (a smoothed export that prevents traffic spikes on the collection link). eye.lo collects the data on the fly and starts processing at the end of the five-minute export period. So when an event occurs, it is displayed between 5 minutes 30 seconds after the event (best case) and 14 minutes after the event (worst case, with heavy load on the eye.lo servers).
eye.lo needs to be provisioned with the same information a multi-tenant SNMP collection tool needs, plus some IWAN-specific fields: IP address, client, WAN interfaces, and additional data for smoother UI browsing, such as site name, customer contract, address or coordinates for mapping, etc. Additional information is auto-discovered by eye.lo. The seed file is provisioned manually as a .csv file or scheduled for automatic import.
See Installation Guide for required fields format.
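A hypothetical seed-file reader is sketched below; the column names are illustrative only, and the authoritative field list and format are given in the Installation Guide:

```python
# Hypothetical seed-file loader. The columns below (ip_address, client,
# wan_interfaces, site_name) are illustrative assumptions, not the
# official eye.lo schema.
import csv
import io

SEED = """ip_address,client,wan_interfaces,site_name
192.0.2.1,AcmeCorp,GigabitEthernet0/0,Paris-HQ
192.0.2.2,AcmeCorp,GigabitEthernet0/1,Lyon-Branch
"""

def load_seed(text):
    """Parse a CSV seed file into a list of per-CPE dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

cpes = load_seed(SEED)
```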
You need to configure data export on the CPEs. Data will be exported using NetFlow/IPFIX towards eye.lo.
- The Exporter defines the source interface (LAN/WAN) and the destination IP address of the remote collector.
- The Monitor attaches records and exporters to the device's interfaces to activate monitoring. It also configures the cache maximum size (32k recommended for small branch routers, up to 200k for DCs) and strategy (synchronized 5 minutes recommended).
You may configure one of the following options (see the LLD for details): static LO Performance Monitors, custom LO EZPM, the EZPM 'application-performance' profile, FNF, or TNF.
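As an illustration, an EZPM-based export could look along these lines (context name, collector address, port and interface are placeholders; the exact commands depend on the IOS-XE version and must be taken from the LLD):

```
! Hypothetical EZPM 'application-performance' configuration (illustrative only)
performance monitor context EYELO profile application-performance
 exporter destination 192.0.2.10 source Loopback0 transport udp port 2055
 traffic-monitor application-response-time
 traffic-monitor application-client-server-stats
!
interface GigabitEthernet0/0
 performance monitor context EYELO
```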
Compared to the relative simplicity of SNMP monitoring metrics, IWAN features come at a cost on the network device in terms of memory (the cache size must be anticipated in the configuration) and CPU (for advanced processing). Cisco introduced EZPM in the latest versions of IWAN to decrease CPU load. Service providers have to check whether the additional router resources needed are compatible with their existing portfolio (i.e. which router for which contracted bandwidth).
To poll data from devices, eye.lo embeds SNMP v2/v3, Telnet, TL1 and SSH.
For flow collection, eye.lo is compliant with Traditional Netflow and IPFIX.
eye.lo embeds the traditional ways to collect information from IP network devices: polling (SNMP v2/v3, SSH, Telnet) and flow collection (traditional NetFlow and IPFIX). Using these communication protocols, eye.lo can collect information on a multi-vendor network.
As a multi-tenant platform, eye.lo does not perform network discovery. eye.lo needs information that cannot be discovered, such as the client name for a specific IP address, and particular IWAN fields to process IWAN information. A seed file is required to enable eye.lo.
The standard SNMP polling mode helps admins quickly add metrics based on OIDs (for example: CPU, traffic, drops, …).
The expert mode leverages a scripting interface to connect to the monitored device and its counterparts (Provider Edge, IP SLA probe) and collect advanced metrics. Admins can mix CLI output, attributes coming from the topology, and router templates to build advanced metrics such as traffic per CoS or IP SLA jitter.
The eye.lo architecture divides into layers: proxy, collection nodes, storage cluster and services. Each component embeds its own scalability strategy for a light-touch implementation. The non-blocking technology and distributed architecture used to collect data allow eye.lo to achieve a high response rate for hundreds of thousands of multi-vendor devices.
This extreme flexibility and scalability yield a very short lead time from order to delivery, resulting in value from day one.
See the Low Level Design Guide for more detail.
The provisioning process is highly parallel: using the SNMP protocol, all task requests are sent asynchronously on the network and smoothed over time. For large seed files, processed equipment responses are batched into groups of 1,000 and returned to the administration UI so the user can track the provisioning progress.
If service providers choose a centralized collection, they have to size the collection link properly. The link sizing recommendation depends on:
- The IWAN features enabled: more features = more data to export.
- The per-site bandwidth distribution: a headquarters with 500 employees will have more traffic variety (= more export) than a small office with 20 employees.
- The traffic distribution over time: the CPEs do not reach their maximum traffic at the same time.
A typical 10,000-CPE enterprise SP IWAN network requires a 200 Mbps collection link.
Collection link max speed = average flow size × predicted max aggregated traffic × flow count at this max traffic
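As a back-of-the-envelope check of the 200 Mbps figure, the calculation below uses illustrative assumptions (average record size and per-CPE record count at peak are not vendor-validated numbers):

```python
# Back-of-the-envelope collection-link sizing under assumed inputs.
AVG_RECORD_SIZE_B = 100      # assumption: average exported flow record size (bytes)
RECORDS_PER_WINDOW = 7_500   # assumption: records per CPE per 5-minute window at peak
WINDOW_S = 300               # export smoothed over the 5-minute window
CPES = 10_000                # fleet size from the example above

per_cpe_bps = AVG_RECORD_SIZE_B * 8 * RECORDS_PER_WINDOW / WINDOW_S
total_mbps = per_cpe_bps * CPES / 1e6   # aggregate collection-link load
```

Under these assumptions the aggregate load comes out at 200 Mbps, consistent with the order of magnitude quoted above.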
Please contact LivingObjects for customized sizing.
eye.lo processes and aggregates the raw flows coming from the routers to store what is needed for a customer/service profile. Storage size varies depending on:
- Customer profiles: for example, an Internet connectivity customer will access only application traffic metrics, while a VPN connectivity customer will access both traffic and performance metrics.
- Data retention: for example, an Internet connectivity customer will access 5-minute metrics for 1 week, while a VPN connectivity customer will access traffic and performance metrics for 3 weeks.
Storage sizing = client profile × data retention rule × predicted average aggregated traffic
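A worked example of this arithmetic, under illustrative assumptions (the per-CPE stored volume is an assumption, not a LivingObjects-validated figure):

```python
# Illustrative storage-sizing arithmetic for the 5-minute granularity tier.
STORED_PER_INTERVAL_KB = 2    # assumption: aggregated data stored per CPE per 5 min
INTERVALS_PER_DAY = 24 * 12   # 288 five-minute intervals per day
RETENTION_DAYS = 21           # 3 weeks at 5-minute granularity (default retention)
CPES = 10_000

per_cpe_mb = STORED_PER_INTERVAL_KB * INTERVALS_PER_DAY * RETENTION_DAYS / 1024
total_gb = per_cpe_mb * CPES / 1024   # fleet-wide storage for this tier
```

These assumed inputs give roughly 12 MB per CPE and about 115 GB fleet-wide for the 5-minute tier alone, before the hourly and daily tiers and replication are added.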
A typical scalable infrastructure for 10,000 CPEs is:
- Proxy: 1 × 4 CPU, 4 GB RAM, 1 × 20 GB SAS (scalable)
- Collect: 2 × 8 CPU, 16 GB RAM, 1 TB SAS (scalable)
- Services: 4 × 8 CPU, 8 GB RAM, 1 × 200 GB SAS
Physical environment for the DB:
- 2 × (20 CPU / 64 GB RAM / 2 × 1 TB SSD disks (RAID 1)) (scalable)
Please contact LivingObjects for customized sizing.
An expert mode is available in eye.lo. It records the raw flows (data + templates) coming from a specified CPE and makes the data available for analysis. It helps admins troubleshoot issues in the router template configuration.
The Flow Record is the atomic building block exported by the device. NFv9 and IPFIX define 4 types of flow records: template, data, options template and options data. Data records describe the actual live traffic (their layout being defined by template records), while the options records carry static mapping information, such as devices, interfaces, applications…
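The structure is easy to see on the wire. The sketch below parses the 20-byte NFv9 packet header and classifies FlowSets by their ID, following RFC 3954 (a minimal illustration, not a full collector):

```python
# Minimal NetFlow v9 header parsing and FlowSet classification (RFC 3954).
import struct

def parse_nfv9_header(packet: bytes):
    """Unpack the 20-byte NetFlow v9 packet header."""
    version, count, uptime, secs, seq, source_id = struct.unpack("!HHIIII", packet[:20])
    if version != 9:
        raise ValueError(f"not a NetFlow v9 packet (version={version})")
    return {"count": count, "sys_uptime_ms": uptime, "unix_secs": secs,
            "sequence": seq, "source_id": source_id}

def flowset_kind(flowset_id: int) -> str:
    """FlowSet ID 0 = template, 1 = options template, >=256 = (options) data."""
    if flowset_id == 0:
        return "template"
    if flowset_id == 1:
        return "options template"
    if flowset_id >= 256:
        return "data"
    return "reserved"
```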
By using NBAR2 in the class-map, routers can identify traffic by NBAR2 application signature. This allows per-application policy control such as QoS, e.g. limiting the traffic rate for Netflix, Pandora and iTunes, or guaranteeing bandwidth for business applications such as WebEx, Office 365 or SharePoint.
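A hedged sketch of such a policy follows (class names, percentages and the exact protocol names are illustrative; available protocol names depend on the installed protocol pack):

```
! Illustrative MQC policy using NBAR2 application matching
class-map match-any SCAVENGER-APPS
 match protocol netflix
 match protocol pandora
class-map match-any BUSINESS-APPS
 match protocol webex-meeting
 match protocol sharepoint
!
policy-map WAN-EDGE
 class BUSINESS-APPS
  bandwidth percent 40
 class SCAVENGER-APPS
  police rate percent 5
!
interface GigabitEthernet0/0
 service-policy output WAN-EDGE
```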
eye.lo stores and reports metrics down to the DSCP level, per application and interface. This helps determine whether a given application may have been misclassified when DSCP-based QoS is in place.
Yes. High availability for eye.lo requires the solution to be installed in two data centers, with one instance declared as a backup of the other.
Note: See Low Level Design guide for more details.
The IWAN monitoring level can be configured through eye.lo provisioning for each CPE. If a CPE exports PfR data flows but is provisioned only for AVC, the PfR flows will be ignored by eye.lo.
Is it possible to define one's own Key Performance Indicators based on the raw counters provided by the collection?
Besides the default KPIs included in the KPI library, admins can define the custom KPIs they need to assess the network and the end-user experience. They use a graphical interface to build formulas mixing raw counters and operators. New KPIs are immediately available for building new graphs or tables.
The eye.lo architecture relies on a proprietary TSDB optimized for time series. Time-series data are stored every 5 minutes in read-only binary files. Map-Reduce algorithms fetch and calculate the values displayed on end-user graphs: this requires random, non-linear access to those files, which explains why SSD drives are mandatory once an eye.lo platform embeds more than a few hundred CPEs.
This optimized end-to-end architecture ensures high responsiveness to end-user requests, even for tens of thousands of CPEs.