Google Cloud Stackdriver and Prometheus Integration
Powerful performance with an easy integration, powered by Telegraf, the open source data connector built by InfluxData.
Powerful Performance, Limitless Scale
Collect, organize, and act on massive volumes of high-velocity data with InfluxDB, the #1 time series platform built to scale with Telegraf. Any data is more valuable when you think of it as time series data.
Input and output integration overview
This plugin enables the collection of monitoring data from Google Cloud services through the Stackdriver Monitoring API. It is designed to help users monitor their cloud infrastructure’s performance and health by gathering relevant metrics.
The Prometheus Output Plugin enables Telegraf to expose metrics at an HTTP endpoint for scraping by a Prometheus server. This integration allows users to collect and aggregate metrics from various sources in a format that Prometheus can process efficiently.
Integration details
Google Cloud Stackdriver
The Stackdriver Telegraf plugin queries time series data from Google Cloud Monitoring using the Cloud Monitoring API v3, so users can bring Google Cloud metrics into their existing monitoring stacks. The API exposes performance, uptime, and operational metrics for resources and applications running in Google Cloud. The plugin supports configuration options to filter and refine the data it retrieves, such as metric type prefixes and label filters, so users can tailor collection to their needs. The collected metrics help teams maintain the health and performance of cloud resources and make decisions based on both historical and current performance data.
Prometheus
This plugin facilitates integration with Prometheus, a well-known open-source monitoring and alerting toolkit designed for reliability and efficiency in large-scale environments. Working as a Prometheus client, it exposes a defined set of metrics through an HTTP server that Prometheus can scrape at specified intervals. This lets diverse systems publish performance metrics in a standardized format, giving visibility into system health and behavior. Key features include a configurable listen address and metrics path, TLS for secure communication, and HTTP basic authentication. The plugin also honors the global Telegraf configuration settings. Because it uses the Prometheus metric format, it supports options such as metric expiration and control over which collectors are enabled.
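The security and expiration features described above can be sketched as a single configuration. This is an illustrative fragment rather than a recommended setup: every option name comes from the sample configuration on this page, while the credentials, certificate paths, CIDR range, and the 120s expiration value are placeholder assumptions.

```toml
# Sketch only: credentials, paths, and the CIDR range are placeholders.
[[outputs.prometheus_client]]
  ## Same listen address and path as the sample configuration.
  listen = ":9273"
  path = "/metrics"

  ## Require HTTP basic authentication (placeholder credentials).
  basic_username = "scraper"
  basic_password = "change-me"

  ## Only accept requests from this range (placeholder CIDR).
  ip_range = ["192.168.0.0/24"]

  ## Serve the endpoint over TLS (placeholder certificate paths).
  tls_cert = "/etc/ssl/telegraf.crt"
  tls_key = "/etc/ssl/telegraf.key"

  ## Expiration interval for each metric (assumed value; 0 == no expiration).
  expiration_interval = "120s"

  ## Leave out the "gocollector" and "process" collectors.
  collectors_exclude = ["gocollector", "process"]
```

With TLS and basic authentication enabled, the Prometheus server scraping this endpoint would need matching scheme and credential settings on its side.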
Configuration
Google Cloud Stackdriver
[[inputs.stackdriver]]
## GCP Project
project = "erudite-bloom-151019"
## Include timeseries that start with the given metric type.
metric_type_prefix_include = [
"compute.googleapis.com/",
]
## Exclude timeseries that start with the given metric type.
# metric_type_prefix_exclude = []
## Most metrics are updated no more than once per minute; it is recommended
## to override the agent level interval with a value of 1m or greater.
interval = "1m"
## Maximum number of API calls to make per second. The quota for accounts
## varies, it can be viewed on the API dashboard:
## https://cloud.google.com/monitoring/quotas#quotas_and_limits
# rate_limit = 14
## The delay and window options control the number of points selected on
## each gather. When set, metrics are gathered between:
## start: now() - delay - window
## end: now() - delay
#
## Collection delay; if set too low metrics may not yet be available.
# delay = "5m"
#
## If unset, the window will start at 1m and be updated dynamically to span
## the time between calls (approximately the length of the plugin interval).
# window = "1m"
## TTL for cached list of metric types. This is the maximum amount of time
## it may take to discover new metrics.
# cache_ttl = "1h"
## If true, raw bucket counts are collected for distribution value types.
## For a more lightweight collection, you may wish to disable and use
## distribution_aggregation_aligners instead.
# gather_raw_distribution_buckets = true
## Aggregate functions to be used for metrics whose value type is
## distribution. These aggregate values are recorded in addition to raw
## bucket counts, if they are enabled.
##
## For a list of aligner strings see:
## https://cloud.google.com/monitoring/api/ref_v3/rpc/google.monitoring.v3#aligner
# distribution_aggregation_aligners = [
# "ALIGN_PERCENTILE_99",
# "ALIGN_PERCENTILE_95",
# "ALIGN_PERCENTILE_50",
# ]
## Filters can be added to reduce the number of time series matched. All
## functions are supported: starts_with, ends_with, has_substring, and
## one_of. Only the '=' operator is supported.
##
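## The logical operators when combining filters are defined statically using
## the following values:
## filter ::= <resource_labels> {AND <metric_labels> AND <user_labels> AND <system_labels>}
## resource_labels ::= <resource_labels> {OR <resource_label>}
## metric_labels ::= <metric_labels> {OR <metric_label>}
## user_labels ::= <user_labels> {OR <user_label>}
## system_labels ::= <system_labels> {OR <system_label>}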
##
## For more details, see https://cloud.google.com/monitoring/api/v3/filters
#
## Resource labels refine the time series selection with the following expression:
## resource.labels.<key> = <value>
# [[inputs.stackdriver.filter.resource_labels]]
# key = "instance_name"
# value = 'starts_with("localhost")'
#
## Metric labels refine the time series selection with the following expression:
## metric.labels.<key> = <value>
# [[inputs.stackdriver.filter.metric_labels]]
# key = "device_name"
# value = 'one_of("sda", "sdb")'
#
## User labels refine the time series selection with the following expression:
## metadata.user_labels."<key>" = <value>
# [[inputs.stackdriver.filter.user_labels]]
# key = "environment"
# value = 'one_of("prod", "staging")'
#
## System labels refine the time series selection with the following expression:
## metadata.system_labels."<key>" = <value>
# [[inputs.stackdriver.filter.system_labels]]
# key = "machine_type"
# value = 'starts_with("e2-")'
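As a rough illustration of the delay, window, and filter options in the sample above, the fragment below narrows collection to a few time series. It is a sketch under assumptions: the project ID is a placeholder, and the label keys (`zone`, `instance_name`) and their values are examples that may not match the labels on your metrics. The AND/OR behavior noted in the comments follows the filter grammar given in the configuration comments.

```toml
# Sketch only: project ID, label keys, and label values are assumptions.
[[inputs.stackdriver]]
  project = "my-gcp-project"
  metric_type_prefix_include = ["compute.googleapis.com/"]
  interval = "1m"

  ## With delay = "5m" and window = "1m", a gather at 12:00:00 requests
  ## points between:
  ##   start: 11:54:00 (now - delay - window)
  ##   end:   11:55:00 (now - delay)
  delay = "5m"
  window = "1m"

  ## Entries within the same label group are combined with OR.
  [[inputs.stackdriver.filter.resource_labels]]
    key = "zone"
    value = 'starts_with("us-central1")'

  [[inputs.stackdriver.filter.resource_labels]]
    key = "zone"
    value = 'starts_with("europe-west1")'

  ## Different label groups are combined with AND.
  [[inputs.stackdriver.filter.metric_labels]]
    key = "instance_name"
    value = 'has_substring("web")'
```

Narrowing the selection this way reduces the number of time series matched per gather, which may also help stay within the API quota mentioned next to `rate_limit`.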
Prometheus
[[outputs.prometheus_client]]
## Address to listen on.
## ex:
## listen = ":9273"
## listen = "vsock://:9273"
listen = ":9273"
## Maximum duration before timing out read of the request
# read_timeout = "10s"
## Maximum duration before timing out write of the response
# write_timeout = "10s"
## Metric version controls the mapping from Prometheus metrics into Telegraf metrics.
## See "Metric Format Configuration" in plugins/inputs/prometheus/README.md for details.
## Valid options: 1, 2
# metric_version = 1
## Use HTTP Basic Authentication.
# basic_username = "Foo"
# basic_password = "Bar"
## If set, the IP Ranges which are allowed to access metrics.
## ex: ip_range = ["192.168.0.0/24", "192.168.1.0/30"]
# ip_range = []
## Path to publish the metrics on.
# path = "/metrics"
## Expiration interval for each metric. 0 == no expiration
# expiration_interval = "60s"
## Collectors to exclude; valid entries are "gocollector" and "process".
## If unset, both collectors are enabled.
# collectors_exclude = ["gocollector", "process"]
## Send string metrics as Prometheus labels.
## Unless set to false all string metrics will be sent as labels.
# string_as_label = true
## If set, enable TLS with the given certificate.
# tls_cert = "/etc/ssl/telegraf.crt"
# tls_key = "/etc/ssl/telegraf.key"
## Set one or more allowed client CA certificate file names to
## enable mutually authenticated TLS connections
# tls_allowed_cacerts = ["/etc/telegraf/clientca.pem"]
## Export metric collection time.
# export_timestamp = false
## Specify the metric type explicitly.
## This overrides the metric-type of the Telegraf metric. Globbing is allowed.
# [outputs.prometheus_client.metric_types]
# counter = []
# gauge = []
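Putting the two plugins together, a minimal end-to-end configuration might look like the sketch below. It only combines options already shown in the two samples; the project ID is a placeholder and the `[agent]` interval is an assumed value chosen to match the 1m recommendation in the Stackdriver comments.

```toml
# Sketch only: a minimal pipeline from Cloud Monitoring to a Prometheus endpoint.
[agent]
  ## Assumed agent-level interval; the input below also sets its own.
  interval = "1m"

[[inputs.stackdriver]]
  ## Placeholder GCP project ID.
  project = "my-gcp-project"
  ## Collect Compute Engine metric types only.
  metric_type_prefix_include = ["compute.googleapis.com/"]
  ## Most metrics update no more than once per minute.
  interval = "1m"

[[outputs.prometheus_client]]
  ## Prometheus scrapes this address and path.
  listen = ":9273"
  path = "/metrics"
```

Assuming the file is saved as `telegraf.conf`, Telegraf could be started with `telegraf --config telegraf.conf`, after which a Prometheus server (or a manual request to `http://localhost:9273/metrics`) should be able to read the exposed metrics. Google Cloud credentials must already be available to the Telegraf process; how they are supplied is outside the scope of this sketch.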
Input and output integration examples
Google Cloud Stackdriver
- Integrating Cloud Metrics into Custom Dashboards: With this plugin, teams can funnel metrics from Google Cloud into personalized dashboards, allowing for real-time monitoring of application performance and resource utilization. By customizing the visual representation of cloud metrics, operations teams can easily identify trends and anomalies, enabling proactive management before issues escalate.
- Automated Alerts and Analysis: Users can set up automated alerting mechanisms leveraging the plugin’s metrics to track resource thresholds. This capability allows teams to act swiftly in response to performance degradation or outages by providing immediate notifications, thus reducing the mean time to recovery and ensuring continued operational efficiency.
- Cross-Platform Resource Comparison: The plugin can be used to draw metrics from various Google Cloud services and compare them with on-premise resources. This cross-platform visibility helps organizations make informed decisions about resource allocation and scaling strategies, as well as optimize cloud spending versus on-premise infrastructure.
- Historical Data Analysis for Capacity Planning: By collecting historical metrics over time, the plugin empowers teams to conduct thorough capacity planning. Understanding past performance trends facilitates accurate forecasting for resource needs, leading to better budgeting and investment strategies.
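For long-running collection such as the capacity planning case above, the sample configuration suggests a lighter alternative to raw distribution buckets. The fragment below is a sketch of that idea: the project ID is a placeholder, the 5m interval is an assumed value, and the `loadbalancing.googleapis.com/` prefix is an assumption chosen because latency metrics are often distribution-valued. The aligner names are the ones listed in the sample configuration.

```toml
# Sketch only: project ID, interval, and metric prefix are assumptions.
[[inputs.stackdriver]]
  project = "my-gcp-project"
  metric_type_prefix_include = ["loadbalancing.googleapis.com/"]
  interval = "5m"

  ## Skip raw bucket counts for distribution value types ...
  gather_raw_distribution_buckets = false

  ## ... and record percentile aggregates instead.
  distribution_aggregation_aligners = [
    "ALIGN_PERCENTILE_99",
    "ALIGN_PERCENTILE_95",
    "ALIGN_PERCENTILE_50",
  ]
```

Storing a few percentiles per series instead of every bucket keeps less data per gather, at the cost of losing the full distribution shape.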
Prometheus
- Monitoring Multi-cloud Deployments: Utilize the Prometheus plugin to collect metrics from applications running across multiple cloud providers. This scenario allows teams to centralize monitoring through a single Prometheus instance that scrapes metrics from different environments, providing a unified view of performance metrics across hybrid infrastructures. It streamlines reporting and alerting, enhancing operational efficiency without needing complex integrations.
- Enhancing Microservices Visibility: Implement the plugin to expose metrics from various microservices within a Kubernetes cluster. Using Prometheus, teams can visualize service metrics in real time, identify bottlenecks, and maintain system health checks. This setup supports adaptive scaling and resource utilization optimization based on insights generated from the collected metrics. It enhances the ability to troubleshoot service interactions, significantly improving the resilience of the microservice architecture.
- Real-time Anomaly Detection in E-commerce: By leveraging this plugin alongside Prometheus, an e-commerce platform can monitor key performance indicators such as response times and error rates. Integrating anomaly detection algorithms with scraped metrics allows the identification of unexpected patterns indicating potential issues, such as sudden traffic spikes or backend service failure. This proactive monitoring empowers business continuity and operational efficiency, minimizing potential downtimes while ensuring service reliability.
- Performance Metrics Reporting for APIs: Utilize the Prometheus Output Plugin to gather and report API performance metrics, which can then be visualized in Grafana dashboards. This use case enables detailed analysis of API response times, throughput, and error rates, promoting continuous improvement of API services. By closely monitoring these metrics, teams can quickly react to degradation, ensuring optimal API performance and maintaining a high level of service availability.