Monitoring TLS Certificates with Telegraf
By David Flanagan / Use Cases, Developer / Apr 25, 2019
We’ve all been there. You’re sitting eating your lunch in the office canteen and you notice a flurry of people walking briskly and asking each other to check the website on their phone. Is it just one phone? Oh, it’s your phone too. Maybe it’s the WiFi … they check on 4G …
The faces slowly turn in your direction, eyes catching awkwardly. You feel your phone vibrate in your pocket … not just once. Production can’t be down, you think. My pager hasn’t gone off … everything must be fine, they’re confused, right?
Oh dear. The x509 / TLS certificate expired and nobody in the world can browse our high-profile, 24x7, worldwide, super amazing website.
While it’s common for operators and developers to monitor their systems using the metrics we treasure so dearly (RED, USE, the Four Golden Signals), something simple is often overlooked: the x509 certificates we use to serve our websites, to authenticate microservices, or to authenticate against the Kubernetes API.
Fortunately, we’ve got you covered! Telegraf has had an x509_cert plugin for many years now, and it couldn’t be easier to set up.
Configuring the plugin
The x509_cert input plugin supports local and remote x509 endpoints. So whether you’re running Telegraf as a daemonset on your Kubernetes cluster, monitoring your local cert directory, or running a single instance to monitor your certificates from a user’s perspective, we’ve got you covered.
[[inputs.x509_cert]]
  sources = ["https://www.example.org:443", "/etc/tls/certs/www.example.org"]
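To get a feel for what a remote check like this measures, here is a minimal Python sketch that connects to an endpoint, reads the certificate's notAfter date, and works out how many seconds are left. This is not how Telegraf implements the plugin; the function names are hypothetical and only the standard library is used. One caveat of this approach: an already-expired certificate fails the TLS handshake, so the sketch only measures certificates that are still valid.

```python
import socket
import ssl
import time


def seconds_until_expiry(not_after: str, now: float) -> int:
    """Seconds from `now` (Unix time) until the certificate's notAfter date.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    return int(ssl.cert_time_to_seconds(not_after) - now)


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and return the leaf certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def remote_expiry_seconds(host: str, port: int = 443) -> int:
    """Hypothetical helper: seconds until the certificate served by host expires."""
    return seconds_until_expiry(fetch_not_after(host, port), time.time())
```

Calling remote_expiry_seconds("www.example.org") would return the seconds remaining for that endpoint, which is the same kind of value the query in the next section converts to days.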
Available metrics
Now that we’ve got Telegraf collecting and sending our x509 metrics to InfluxDB, we can begin to build a query to alert on certificate expiration. Fortunately, this is almost as simple as configuring the plugin.
SELECT (expiry / 60 / 60 / 24) as "expiry" FROM "telegraf"."autogen"."x509_cert"
The expiry field is reported in seconds, so dividing it by 60, 60, and 24 returns the number of days until each certificate expires.
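The arithmetic in the query above can be sketched in a few lines of Python, along with a simple threshold check of the kind an alert might apply. The sample readings, the function names, and the 30-day threshold are all assumptions made for illustration; pick whatever lead time suits your renewal process.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # same divisor as the query: 60 / 60 / 24


def expiry_days(expiry_seconds):
    """Convert an expiry value in seconds to days."""
    return expiry_seconds / SECONDS_PER_DAY


def expiring_soon(readings, threshold_days=30):
    """Return the sources whose certificates expire in fewer than threshold_days."""
    return sorted(
        source
        for source, seconds in readings.items()
        if expiry_days(seconds) < threshold_days
    )


# Hypothetical sample readings, keyed by certificate source.
readings = {
    "https://www.example.org:443": 7 * SECONDS_PER_DAY,
    "/etc/tls/certs/www.example.org": 90 * SECONDS_PER_DAY,
}

print(expiring_soon(readings))  # ['https://www.example.org:443']
```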
Telegraf provides the following tags to filter or add to your alerts:
- common_name
- country
- locality
- organization
- organizational_unit
- province
- source
If you’ve configured Telegraf to add the hostname to each measurement, that tag will also be available. Be sure to use it only when running Telegraf as a daemonset or on bare metal.
Telegraf makes it incredibly simple to monitor these certificates, so nobody should ever be caught off guard again.