Kinesis and MongoDB Integration
Powerful performance with an easy integration, powered by Telegraf, the open source data connector built by InfluxData.
- 5B+ Telegraf downloads
- #1 time series database (Source: DB Engines)
- 1B+ downloads of InfluxDB
- 2,800+ contributors
Powerful Performance, Limitless Scale
Collect, organize, and act on massive volumes of high-velocity data with InfluxDB, the #1 time series platform built to scale with Telegraf. Any data is more valuable when you think of it as time series data.
Input and output integration overview
The Kinesis plugin enables you to read from Kinesis data streams, supporting various data formats and configurations.
The MongoDB plugin enables you to send metrics to a MongoDB instance.
Integration details
Kinesis
This plugin reads from a Kinesis data stream and creates metrics using supported input data formats. It supports various configuration options for AWS Kinesis and DynamoDB checkpointing.
MongoDB
This plugin sends metrics to MongoDB, automatically creating time series collections where they don’t already exist. Time series collections require MongoDB 5.0+.
Configuration
Kinesis
# Configuration for the AWS Kinesis input.
[[inputs.kinesis_consumer]]
## Amazon REGION of kinesis endpoint.
region = "ap-southeast-2"
## Amazon Credentials
## Credentials are loaded in the following order
## 1) Web identity provider credentials via STS if role_arn and web_identity_token_file are specified
## 2) Assumed credentials via STS if role_arn is specified
## 3) explicit credentials from 'access_key' and 'secret_key'
## 4) shared profile from 'profile'
## 5) environment variables
## 6) shared credentials file
## 7) EC2 Instance Profile
# access_key = ""
# secret_key = ""
# token = ""
# role_arn = ""
# web_identity_token_file = ""
# role_session_name = ""
# profile = ""
# shared_credential_file = ""
## Endpoint to make request against, the correct endpoint is automatically
## determined and this option should only be set if you wish to override the
## default.
## ex: endpoint_url = "http://localhost:8000"
# endpoint_url = ""
## Kinesis StreamName must exist prior to starting telegraf.
streamname = "StreamName"
## Shard iterator type (only 'TRIM_HORIZON' and 'LATEST' currently supported)
# shard_iterator_type = "TRIM_HORIZON"
## Max undelivered messages
## This plugin uses tracking metrics, which ensure messages are read to
## outputs before acknowledging them to the original broker to ensure data
## is not lost. This option sets the maximum messages to read from the
## broker that have not been written by an output.
##
## This value needs to be picked with awareness of the agent's
## metric_batch_size value as well. Setting max undelivered messages too high
## can result in a constant stream of data batches to the output, while
## setting it too low may never flush the broker's messages.
# max_undelivered_messages = 1000
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
##
## The content encoding of the data from Kinesis.
## If you are processing a CloudWatch Logs Kinesis stream, set this to "gzip",
## because AWS compresses CloudWatch log data before it is sent to Kinesis.
## AWS also base64-encodes the compressed bytes before pushing to the stream;
## the Go SDK decodes the base64 automatically as data is read from Kinesis.
##
# content_encoding = "identity"
## Optional
## Configuration for a dynamodb checkpoint
[inputs.kinesis_consumer.checkpoint_dynamodb]
## unique name for this consumer
app_name = "default"
table_name = "default"
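The comments on max_undelivered_messages tie it to the agent's metric_batch_size. The following is a minimal sketch of setting the two side by side. The stream name "my-stream" is hypothetical, and the values are illustrative rather than tuning recommendations; the right numbers depend on how many metrics each Kinesis record carries.

```toml
[agent]
  ## Number of metrics sent to each output per write.
  ## 1000 is Telegraf's default batch size.
  metric_batch_size = 1000

[[inputs.kinesis_consumer]]
  region = "ap-southeast-2"
  ## Hypothetical stream name; the stream must already exist.
  streamname = "my-stream"
  data_format = "influx"
  ## Illustrative value: allow at most this many messages to be read
  ## from Kinesis before an output has written them.
  max_undelivered_messages = 1000
```

If each record holds roughly one metric, keeping the two values close together lets a full batch accumulate before the consumer pauses. Records that carry many metrics each may call for a lower limit.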
MongoDB
[[outputs.mongodb]]
# connection string examples for mongodb
dsn = "mongodb://localhost:27017"
# dsn = "mongodb://mongod1:27017,mongod2:27017,mongod3:27017/admin?replicaSet=myReplSet&w=1"
# overrides serverSelectionTimeoutMS in dsn if set
# timeout = "30s"
# default authentication, optional
# authentication = "NONE"
# for SCRAM-SHA-256 authentication
# authentication = "SCRAM"
# username = "root"
# password = "***"
# for x509 certificate authentication
# authentication = "X509"
# tls_ca = "ca.pem"
# tls_key = "client.pem"
# # tls_key_pwd = "changeme" # required for encrypted tls_key
# insecure_skip_verify = false
# database to store measurements and time series collections
# database = "telegraf"
# granularity can be seconds, minutes, or hours.
# configuring this value will be based on your input collection frequency.
# see https://docs.mongodb.com/manual/core/timeseries-collections/#create-a-time-series-collection
# granularity = "seconds"
# optionally set a TTL to automatically expire documents from the measurement collections.
# ttl = "360h"
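As a sketch of how the commented options above might be combined, the following enables SCRAM authentication and sets a granularity and TTL. The username "telegraf_writer" is hypothetical, and the "minutes" granularity assumes metrics arrive about once per minute; adjust both to your deployment.

```toml
[[outputs.mongodb]]
  dsn = "mongodb://localhost:27017"
  ## Overrides serverSelectionTimeoutMS in the dsn.
  timeout = "30s"

  ## SCRAM authentication; the username is a hypothetical example.
  authentication = "SCRAM"
  username = "telegraf_writer"
  password = "***"

  ## Database that holds the time series collections.
  database = "telegraf"

  ## Assumes roughly one-minute collection intervals.
  granularity = "minutes"

  ## Expire documents after 360 hours (15 days).
  ttl = "360h"
```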
Input and output integration examples
Kinesis
- Basic Configuration: Set up the Kinesis Consumer to read from a specific stream in a specified AWS region.
- Checkpointing: Use DynamoDB to checkpoint processed records to ensure data is not lost during stream consumption.
- Data Format Management: Configure the plugin to handle different data formats, allowing for flexibility in how data is ingested.
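The three Kinesis scenarios above can be sketched in one configuration. This is an illustrative example, not a recommended setup: the stream name, app name, and table name are hypothetical, and "json" is just one of the input data formats Telegraf supports.

```toml
[[inputs.kinesis_consumer]]
  ## Basic configuration: region and stream (hypothetical stream name;
  ## the stream must exist before Telegraf starts).
  region = "ap-southeast-2"
  streamname = "app-metrics"

  ## Start from the newest records rather than the oldest retained ones.
  shard_iterator_type = "LATEST"

  ## Data format management: parse each record as JSON.
  data_format = "json"

  ## Checkpointing: track consumed records in DynamoDB
  ## (hypothetical app and table names).
  [inputs.kinesis_consumer.checkpoint_dynamodb]
    app_name = "telegraf-app-metrics"
    table_name = "telegraf-kinesis-checkpoints"
```

With TRIM_HORIZON instead of LATEST, the consumer would begin at the oldest record still retained in the stream.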
MongoDB
- Log Management: Integrate this plugin to send application logs directly to MongoDB for structured storage and flexible querying. You can analyze logs as time series data, aggregating them by hour, day, or month.
- Metric Capture: Use the plugin to capture system metrics (CPU, memory usage) in real time and store them in MongoDB. Time series collections allow for efficient queries over time ranges.
- Monitoring Solutions: Combine this output plugin with inputs from various sources, such as disk usage metrics, network statistics, or application performance data. This allows for consolidated monitoring dashboards with historical trends saved in MongoDB.
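A minimal end-to-end sketch of these scenarios might pair the Kinesis consumer and Telegraf's standard cpu and mem inputs with the MongoDB output. The stream name is hypothetical, and the 10-second interval and "seconds" granularity are assumptions chosen to match each other, not required values.

```toml
[agent]
  ## Assumed collection interval for polling inputs such as cpu and mem.
  interval = "10s"

## System metrics (Metric Capture scenario).
[[inputs.cpu]]
[[inputs.mem]]

## Metrics arriving on a Kinesis stream (hypothetical stream name).
[[inputs.kinesis_consumer]]
  region = "ap-southeast-2"
  streamname = "app-metrics"
  data_format = "influx"

## All collected metrics are written to MongoDB time series
## collections, which require MongoDB 5.0+.
[[outputs.mongodb]]
  dsn = "mongodb://localhost:27017"
  database = "telegraf"
  granularity = "seconds"
```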
Feedback
Thank you for being part of our community! If you have any general feedback or found any bugs on these pages, we welcome and encourage your input. Please submit your feedback in the InfluxDB community Slack.
Related Integrations
HTTP and InfluxDB Integration
The HTTP plugin collects metrics from one or more HTTP(S) endpoints. It supports various authentication methods and configuration options for data formats.
Kafka and InfluxDB Integration
This plugin reads messages from Kafka and allows the creation of metrics based on those messages. It supports various configurations including different Kafka settings and message processing options.
MQTT and InfluxDB Integration
The MQTT plugin is a service input for reading metrics from specified MQTT topics. It supports various data formats and configuration options for reliable message consumption.