Arista LANZ Consumer Monitoring
LANZ, Arista's Latency Analyzer, is a tool designed to track interface congestion and queuing latency with real-time data collection and reporting, helping operators maintain performance for end users. With LANZ application-layer event export, the applications you work with can use historical data to predict impending congestion and latency.
In other words, it's a way to quickly identify a small problem now before it has a chance to become a much bigger one in the future. This also enables the application layer to make traffic and routing decisions with total visibility into the network layer, thus resulting in smarter and more proactive actions across the board.
Overall, LANZ helps network operations teams and administrators alike have near real-time visibility into the health and operation of a network. This makes it possible to detect situations like microbursts as early as possible. LANZ is notable in that it will also continually monitor congestion on a network, allowing for the rapid detection of said events and the automatic sending of application layer messages to those employees who need to see them.
There are a few different types of events that users will experience when LANZ is in notifying mode.
- "Start" occurs when any queue on an interface exceeds the upper threshold set during the initial configuration.
- "Update" events are generated periodically when the congested queue remains above that threshold.
- "End" events are generated when the congested queue finally drops below the lower threshold, either thanks to the intervention of team members or because things have returned to normal on their own.
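The two-threshold behavior described above can be illustrated with a small state machine. This is only a sketch of the idea, not LANZ's implementation: the function name, the sample values, and the `update_every` parameter (how many samples pass between "Update" events) are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def classify_events(
    samples: Iterable[int],
    upper: int,
    lower: int,
    update_every: int = 2,
) -> List[Tuple[int, str]]:
    """Return (sample_index, event) pairs for a series of queue lengths.

    `update_every` is a hypothetical knob for this sketch, not a LANZ setting.
    """
    events: List[Tuple[int, str]] = []
    congested = False
    since_last = 0  # samples seen since the last Start/Update event
    for i, qlen in enumerate(samples):
        if not congested:
            # Congestion begins when the queue exceeds the upper threshold.
            if qlen > upper:
                congested = True
                since_last = 0
                events.append((i, "Start"))
        elif qlen < lower:
            # Congestion ends only once the queue drops below the lower threshold.
            congested = False
            events.append((i, "End"))
        else:
            # Still congested: report periodically while above the upper threshold.
            since_last += 1
            if qlen > upper and since_last >= update_every:
                since_last = 0
                events.append((i, "Update"))
    return events


if __name__ == "__main__":
    queue_lengths = [10, 60, 70, 80, 90, 40, 15, 5]  # made-up samples
    for index, event in classify_events(queue_lengths, upper=50, lower=20):
        print(index, event)
```

Using separate upper and lower thresholds keeps a queue that hovers near a single limit from producing a rapid series of Start and End events.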
A Polling Mode is also available where LANZ will poll the most congested queue in each ASIC and continue to report on it every 800 microseconds, but this is only available on Arad and Jericho switches. LANZ can export data into system log messages as well depending on your preferences.
Why use a Telegraf plugin for Arista LANZ Consumer?
The Arista LANZ Consumer Telegraf Plugin is a consumer for Arista Networks' Latency Analyzer (LANZ) that streams data over TCP from port 50001 on the switch's management IP into InfluxDB. LANZ provides congestion data by continuously monitoring each port's output queue lengths. When the length of an output queue exceeds the upper threshold for that port, LANZ generates an over-threshold event. Collecting these metrics in InfluxDB lets you gain insight into your networks and enables your applications to react to changes in network conditions. You can pair this plugin with other Telegraf plugins to get a view of your entire application stack.
How to monitor your networks with the Arista LANZ Consumer Telegraf Plugin
LANZ is disabled by default, so you need to enable it manually before it can collect data. Once enabled, the switch monitors queue lengths on all front-panel ports and, on selected platforms, on CPU and fabric ports as well. Queue length data is available in several formats depending on your needs: syslog messages, CLI display, a data stream, or CSV-format output.
- Documentation on configuring LANZ
- Enabling streaming LANZ data
In the Telegraf configuration, you need to list the servers from which you want to collect the streamed metrics.
Once streaming has been enabled on the switch, add the plugin section below to your Telegraf configuration, replacing the example hostnames with the ones that match your deployment:
[[inputs.lanz]]
  servers = [
    "tcp://switch1.int.example.com:50001",
    "tcp://switch2.int.example.com:50001",
  ]
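Before starting Telegraf, it can help to sanity-check the entries in `servers`. The sketch below is a standalone helper, not part of Telegraf or the plugin: `parse_server` and `is_reachable` are hypothetical names, and falling back to port 50001 when no port is given is an assumption of this example rather than documented plugin behavior.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit

DEFAULT_LANZ_PORT = 50001  # the streaming port named in the text above


def parse_server(url: str) -> Tuple[str, int]:
    """Split a 'tcp://host:port' entry into (host, port)."""
    parts = urlsplit(url)
    if parts.scheme != "tcp":
        raise ValueError(f"expected a tcp:// URL, got {url!r}")
    if not parts.hostname:
        raise ValueError(f"no hostname in {url!r}")
    # Assumption for this sketch: use 50001 if the entry omits the port.
    return parts.hostname, parts.port or DEFAULT_LANZ_PORT


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    servers = [
        "tcp://switch1.int.example.com:50001",
        "tcp://switch2.int.example.com:50001",
    ]
    for entry in servers:
        host, port = parse_server(entry)
        state = "reachable" if is_reachable(host, port) else "unreachable"
        print(f"{host}:{port} is {state}")
```

A successful connection only shows that the port is open; it does not confirm that LANZ streaming is configured correctly on the switch.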
Key Arista LANZ Consumer metrics to use for monitoring
Some of the important Arista LANZ Consumer metrics that you can use include:
- intf_name
- switch_id
- port_id
- entry_type
- traffic_class
- fabric_peer_intf_name
- source
- port
- queue_size (integer)
- time_of_max_qlen (integer)
- tx_latency (integer)
- q_drop_count (integer)
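As an illustration of how these metrics might be used once collected, the sketch below finds the peak `queue_size` per `intf_name`. The records are made-up sample data held in plain dictionaries, and `peak_queue_by_interface` is a hypothetical helper; how you retrieve records from InfluxDB is outside the scope of this example.

```python
from typing import Dict, Iterable, Mapping, Union

Record = Mapping[str, Union[str, int]]


def peak_queue_by_interface(records: Iterable[Record]) -> Dict[str, int]:
    """Return the largest queue_size observed for each intf_name."""
    peaks: Dict[str, int] = {}
    for rec in records:
        name = str(rec["intf_name"])
        size = int(rec["queue_size"])
        if size > peaks.get(name, -1):
            peaks[name] = size
    return peaks


if __name__ == "__main__":
    # Hypothetical records using the metric names listed above.
    sample = [
        {"intf_name": "Ethernet1", "queue_size": 120, "tx_latency": 30},
        {"intf_name": "Ethernet2", "queue_size": 45, "tx_latency": 12},
        {"intf_name": "Ethernet1", "queue_size": 310, "tx_latency": 80},
    ]
    for intf, peak in sorted(peak_queue_by_interface(sample).items()):
        print(intf, peak)
```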