Kubernetes Monitoring

Real-time visibility into your entire container-based environment to unify all your metrics and events for faster root cause analysis.

Why monitor Kubernetes?

Kubernetes orchestration provides built-in fault tolerance, automating scaling and maintenance to keep the cluster in its desired state. Automation alone, however, does not explain why something went wrong: visibility must come with enough granularity and context to identify the source of trouble quickly. Monitoring and accountability are what make automation reliable.

Why InfluxDB for Kubernetes monitoring?

Most production environments don’t take a single approach to application monitoring. InfluxDB can monitor private and public cloud infrastructure (e.g., PaaS, SaaS, websites), offers deployment options that scale with the environment, and supports custom instrumentation, cross-measurement analytics, and advanced alerting. InfluxDB helps to identify and resolve problems before they affect critical processes and, most importantly, offers ways to implement Kubernetes monitoring that accommodate developers’ need for instrumentation without overloading IT operations.

Get broad insight and act in time

Monitor all Prometheus metrics, custom application metrics, K8s annotations, and logs from one pane. Multiple data collection options and a comprehensive view of infrastructure, container, and application status are fundamental to keeping services running without degradation or escalating issues that could lead to outages.

Optimize to be competitive at lower cost

InfluxDB is a performant store for time series data, both numeric and non-numeric. It avoids the need for expensive hardware and storage, and for installing and maintaining multiple monitoring platforms. InfluxDB’s real-time stream analytics and highly efficient compression and compaction allow data to be ingested and stored cost-effectively.

HA and Scalability

InfluxDB’s purpose-built design for time series allows for very high volume storage of monitoring records while providing horizontal scalability and high availability with clustering. This makes it an ideal solution for long-term storage of Kubernetes monitoring data for historic records or modeling purposes.

InfluxData support for Kubernetes monitoring

InfluxData’s Telegraf is an open source, plugin-based agent (300+ plugins) that collects metrics and events from Kubernetes nodes, the master node, pods, containers, and all Prometheus /metrics endpoints. Telegraf can also monitor itself and its metric pipeline, so you can be alerted if important metrics are not being collected. In Kubernetes cluster environments, Telegraf can be deployed as a DaemonSet on every node, as an application sidecar in pods, or as a central collector.

In addition to Telegraf’s versatility in collecting data from Kubernetes APIs, Prometheus /metrics endpoints, infrastructure (systems, VMs, and containers), and applications, InfluxData has implemented a Kubernetes Operator to facilitate the deployment and management of InfluxDB instances.

Telegraf deployed as DaemonSet agent

Telegraf can be installed as a DaemonSet on every Kubernetes node. From there it can collect monitoring data directly from the nodes, containers, pods, and applications via push or pull mechanisms. This deployment option suits baseline metrics that need to be collected from all nodes.

Telegraf deployed as a sidecar agent

Telegraf can be installed as an application sidecar in Kubernetes pod deployments. As a sidecar, Telegraf is encapsulated in the pod with the application and shares the same network. This deployment is particularly useful for isolating the impact of application instrumentation at the pod level, so that it does not overload monitoring for the entire cluster. Sidecar deployment gives developers the freedom to expose and instrument metrics as they find necessary, without burdening IT Ops with the problem of scaling scraping.

Telegraf deployed as central agent with Kubernetes service discovery

Telegraf supports Kubernetes service discovery by watching Prometheus annotations on pods, thus finding out which applications expose /metrics endpoints. As a single agent, Telegraf can scrape /metrics endpoints exposed in the clusters and send collected data more efficiently to InfluxDB.
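To make annotation-based discovery concrete, here is a minimal Python sketch of the selection logic, not Telegraf’s actual implementation. It assumes the commonly used prometheus.io/scrape, prometheus.io/scheme, prometheus.io/port, and prometheus.io/path annotation keys; the simplified pod dictionaries, the helper names, and the default values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Annotation keys following the common prometheus.io/* convention (assumed here).
SCRAPE = "prometheus.io/scrape"
SCHEME = "prometheus.io/scheme"
PORT = "prometheus.io/port"
PATH = "prometheus.io/path"


def scrape_url(pod: Dict) -> Optional[str]:
    """Return the metrics URL for a pod, or None if the pod has not opted in.

    `pod` is a simplified, hypothetical shape: {"ip": str, "annotations": dict}.
    """
    annotations = pod.get("annotations", {})
    if annotations.get(SCRAPE) != "true":
        return None  # pod did not ask to be scraped
    ip = pod.get("ip")
    if not ip:
        return None  # pod has no address yet (e.g. still pending)
    # Defaults below are illustrative assumptions, not guaranteed agent defaults.
    scheme = annotations.get(SCHEME, "http")
    port = annotations.get(PORT, "9102")
    path = annotations.get(PATH, "/metrics")
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{ip}:{port}{path}"


def discover_targets(pods: List[Dict]) -> List[str]:
    """Collect scrape URLs for every pod that opted in via annotations."""
    urls = (scrape_url(pod) for pod in pods)
    return [url for url in urls if url is not None]
```

A central agent would refresh this target list whenever pods are added, removed, or re-annotated, which is what lets developers opt in by annotating their pods rather than editing the collector’s configuration.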

Native Kubernetes Operators

The InfluxDB Kubernetes Operator allows InfluxDB to be deployed as a Kubernetes object. It is built using the Operator SDK, which is part of the Operator Framework, and manages one or more InfluxDB instances deployed on Kubernetes. A common use case is to facilitate backup/restore operations.

Specific Telegraf components

  • Telegraf Kubernetes input plugin – The Kubernetes Input Plugin talks to the kubelet API to gather metrics about the running pods and containers.
  • Telegraf Kubernetes Inventory plugin – The Kubernetes Inventory plugin collects kube state metrics (nodes, namespaces, deployments, replica sets, pods, etc.).
  • Telegraf plugin for service discovery of Prometheus /metrics – The Prometheus Format Input Plugin for Telegraf discovers and gathers metrics from HTTP servers exposing metrics in Prometheus format.
  • Telegraf self-monitoring of metric pipeline plugins – Telegraf can collect its own internal metrics and agent stats by enabling the inputs.internal plugin, and can check its own availability by enabling the http_response plugin.
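
As a rough illustration of what gathering Prometheus-format metrics into InfluxDB involves, the Python sketch below converts one line of Prometheus text exposition into one line of InfluxDB line protocol. It is a simplified stand-in, not the schema Telegraf itself produces: the function names, the single field called value, and the decision to skip NaN and infinite samples are assumptions made for this example, and escape sequences inside label values are not decoded.

```python
import math
import re

# One Prometheus sample: name, optional {labels}, value, optional ms timestamp.
_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def _escape_tag(text):
    """Escape the characters that are special in line protocol tags."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(line):
    """Convert one exposition line to line protocol, or return None to skip it."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line, or a HELP/TYPE comment
    match = _SAMPLE.match(line)
    if match is None:
        return None  # not a well-formed sample
    try:
        value = float(match.group("value"))
    except ValueError:
        return None
    if not math.isfinite(value):
        return None  # assumption: drop NaN/Inf instead of writing them
    labels = dict(_LABEL.findall(match.group("labels") or ""))
    # Labels become tags, sorted by key; empty tag values are omitted.
    tags = "".join(
        f",{_escape_tag(key)}={_escape_tag(val)}"
        for key, val in sorted(labels.items())
        if val
    )
    out = f"{match.group('name')}{tags} value={value!r}"
    ts = match.group("ts")
    if ts is not None:
        # Prometheus timestamps are milliseconds; line protocol defaults to ns.
        out += f" {int(ts) * 1_000_000}"
    return out


print(to_line_protocol('http_requests_total{method="post",code="200"} 1027 1395066363000'))
# prints: http_requests_total,code=200,method=post value=1027.0 1395066363000000000
```

Mapping labels to tags keeps them indexed and queryable in InfluxDB, while the sample value is stored as a field.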

InfluxDB for Kubernetes monitoring

InfluxData has added Kubernetes-specific capabilities to make it easier for its users to work with Kubernetes:

  • Helm Charts for Faster Node Deployment – kube-influxdb is a collection of Helm Charts for the InfluxData TICK Stack to monitor Kubernetes with InfluxData.
  • Native Kubernetes Operators – A Kubernetes Operator manages InfluxDB instances deployed as a Kubernetes object.
  • High Availability (HA) and scalability of monitored data – Large volumes of Kubernetes metrics and events can be preserved in InfluxDB storage clusters, allowing long-term retention policies together with high data granularity and high series cardinality.
  • Integration with Prometheus Monitoring – Kubernetes native monitoring is based on the Prometheus format. InfluxDB integration with Kubernetes Prometheus monitoring is supported in two ways:
    1. Remote Write API: Prometheus can write samples that it ingests to InfluxDB in a standardized format.
    2. Remote Read API: Prometheus can read (back) sample data from InfluxDB in a standardized format.

“We recently introduced InfluxDB as our first-class time series database system, where we had the opportunity to work directly with InfluxData to ensure we were on a path that is scalable, robust, and in line with the future direction of their platform.”

Mike Bell, Engineer, Wayfair

Learn more about Kubernetes monitoring implementations made available by the InfluxDB community:

WEBINARS

InfluxDB + Telegraf Operator: Easy Kubernetes Monitoring – Shows how to use InfluxDB and the Telegraf Operator to monitor your Kubernetes containers.

How InfluxData Makes Kubernetes an Even Better Master of Its Components Through Monitoring – Shows how to use InfluxData to help Kubernetes orchestrate the scaling out of applications by monitoring all components of the underlying infrastructure.

Kapacitor: Service Discovery, Pull and Kubernetes – Shows how Kapacitor’s service discovery and scraping code allows any service discovery target that works with Prometheus to work with Kapacitor.

BLOG POSTS

Monitoring the Kubernetes Nginx Ingress with the Nginx InfluxDB Module

Kubernetes Cluster Monitoring and Autoscaling With Telegraf and Kapacitor

How to Spin up the TICK Stack in a Kubernetes Instance

Packaged Kubernetes Deployments – Writing a Helm Chart

Scaling Kubernetes Monitoring without Blind Spots or Operations Burden