Scaling Kubernetes Deployments with InfluxDB & Flux
By Community / Product, Use Cases, Developer / Oct 28, 2020
This article was written by InfluxDB Community member and InfluxAce David Flanagan.
Eighteen hours ago, I was meeting with some colleagues to discuss our Kubernetes initiatives and grand plan for improving the integrations and support for InfluxDB running on Kubernetes. During this meeting, I laid out what I felt was missing for InfluxDB to really shine on Kubernetes. I won’t bore you with the details, but one of the things I insisted we needed was a metrics server integration to provide horizontal pod autoscaling (HPA) based on data within InfluxDB. As I proposed the options we could take to bootstrap this quickly, my wonderful colleague Giacomo chimed in:
“That already exists.”
TL;DR
- You can deploy kube-metrics-adapter to your cluster, which supports annotating your HPA resources with a Flux query to control the scaling of your deployment resources.
- InfluxData has a Helm Charts repository that includes a chart for InfluxDB 2.
- Telegraf can be used as a sidecar for local metric collection.
- InfluxDB 2 has a component called pkger that allows for a declarative interface, through manifests (like Kubernetes), for the creation and management of InfluxDB resources.
Scaling your deployments with Flux
Giacomo continued with a great explanation of what was built, but I’m going to keep this brief. It turns out that a former colleague of ours, Lorenzo Affetti, submitted some PRs to Zalando’s kube-metrics-adapter project at the beginning of the year. Those pull requests have since been merged, and we can use the project to scale our deployments by annotating our HPA resources with a Flux query.
How does it work? It’s rather simple. Let me show you.
Deploy InfluxDB
This article assumes you already have InfluxDB 2 running within your cluster. If you don’t, you can use our Helm Chart to deploy InfluxDB in 30s. I’ll start the clock now …
If you’re feeling brave, you can drop this into a terminal and hope for the best.
kubectl create namespace monitoring
helm repo add influxdata https://helm.influxdata.com/
helm upgrade --install influxdb --namespace=monitoring influxdata/influxdb2
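Give the release a few seconds to come up, then confirm the InfluxDB pod is Running before moving on (the pod name will vary with your release name):
kubectl get pods --namespace monitoring --watch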
Deploy Metrics Adapter
The first thing we need to do is deploy the metrics adapter to our Kubernetes cluster. Zalando doesn’t provide a Helm chart for this, but Banzai Cloud does. Unfortunately, the Banzai Cloud chart needs a few tweaks to support the InfluxDB collector; so for today, we’re going to deploy this with custom manifests. I know it’s not great, but you only need to do it once.
The Manifests
A word of caution before you blindly copy and paste this to your cluster: there are three hard-coded variables in the args section of the Deployment resource. If you plan to roll this out to production, please use Secrets and mount them as files or environment variables, rather than taking the haphazard approach I use in this demo.
The three hard-coded variables are:
- InfluxDB URL
- Organization Name
- Token
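If you do go the Secrets route, here’s a minimal sketch of what creating that Secret might look like — the name influxdb-hpa and its keys are placeholders of my own, and you’d still need to wire the values into the Deployment as environment variables or mounted files instead of literal args. Run it once the custom-metrics-server namespace from the manifests below exists:
kubectl --namespace custom-metrics-server create secret generic influxdb-hpa \
  --from-literal=url=http://influxdb.monitoring.svc:9999 \
  --from-literal=org=InfluxData \
  --from-literal=token=secret-token
For this demo, though, the manifests below keep the hard-coded values inline.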
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-server-resources
rules:
- apiGroups:
  - custom.metrics.k8s.io
  resources:
  - "*"
  verbs:
  - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-metrics-server-resources
rules:
- apiGroups:
  - external.metrics.k8s.io
  resources:
  - "*"
  verbs:
  - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  - services
  verbs:
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-collector
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
- apiGroups:
  - apps
  resources:
  - deployments
  - statefulsets
  verbs:
  - get
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-server-resources
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-external-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-metrics-server-resources
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-collector
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: v1
kind: Service
metadata:
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  ports:
  - port: 443
    targetPort: 443
  selector:
    app: kube-metrics-adapter
---
apiVersion: v1
kind: Namespace
metadata:
  name: custom-metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kube-metrics-adapter
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-metrics-adapter
  template:
    metadata:
      labels:
        app: kube-metrics-adapter
    spec:
      containers:
      - args:
        - --influxdb-address=http://influxdb.monitoring.svc:9999
        - --influxdb-token=secret-token
        - --influxdb-org=InfluxData
        image: registry.opensource.zalan.do/teapot/kube-metrics-adapter:v0.1.5
        name: kube-metrics-adapter
      serviceAccountName: custom-metrics-apiserver
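Save the manifests to a file — I’m calling it kube-metrics-adapter.yaml, but the name is up to you — then apply them and check that both metrics APIs have registered:
kubectl apply -f kube-metrics-adapter.yaml
kubectl get apiservices v1beta1.custom.metrics.k8s.io v1beta1.external.metrics.k8s.io
Both APIServices should report Available as True once the adapter pod is up.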
The big demo
Now that we have InfluxDB and Metrics Adapter running in our cluster, let’s scale some pods!
In the interest of keeping this demo fairly complete, I’m going to cover using Telegraf as a sidecar to scrape the metrics from nginx, and using pkger to create the bucket for our metrics via a Kubernetes concept called initContainers. To accomplish both of these steps, we need to inject a ConfigMap that provides a Telegraf configuration file and a pkger manifest. Our nginx configuration is also included, which enables the status page.
You SHOULD read the comments above each file key within the YAML.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-hpa
data:
  # This is our nginx configuration. It enables the status (/nginx_status) page to be scraped from Telegraf over the shared interface within the pod.
  default.conf: |
    server {
        listen 80;
        listen [::]:80;
        server_name localhost;

        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
        }

        location /nginx_status {
            stub_status;
            allow 127.0.0.1; # only allow requests from localhost
            deny all; # deny all other hosts
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/share/nginx/html;
        }
    }

  # This is our Telegraf configuration. It has the same hard-coded values we mentioned earlier. You'll want to move them to secrets for a production deployment,
  # but I'm keeping that out of scope for this demo. We configure Telegraf to pull metrics from nginx and write to our local InfluxDB 2 instance.
  telegraf.conf: |
    [agent]
      interval = "2s"
      flush_interval = "2s"

    [[inputs.nginx]]
      urls = ["http://localhost/nginx_status"]
      response_timeout = "1s"

    [[outputs.influxdb_v2]]
      urls = ["http://influxdb.monitoring.svc:9999"]
      bucket = "nginx-hpa"
      organization = "InfluxData"
      token = "secret-token"

  # Finally, we need a bucket to store our metrics. You don't need a long retention, as it's only used for HPA.
  buckets.yaml: |
    apiVersion: influxdata.com/v2alpha1
    kind: Bucket
    metadata:
      name: nginx-hpa
    spec:
      description: Nginx HPA Example Bucket
      retentionRules:
      - type: expire
        everySeconds: 900
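Apply the ConfigMap first, since the Deployment in the next step mounts it (the filename is just what I happened to save it as):
kubectl apply -f nginx-hpa-configmap.yaml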
Now I’m going to deploy nginx to the cluster. I’ve chosen nginx because it’s very easy to cause a scaling event with the vast array of HTTP load-testing tools available; I’m going to use baton (the exact command comes after the HPA manifest).
Our nginx manifest looks like so. Again, please remember to extract the hard-coded values and use Secrets!
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  selector:
    matchLabels:
      app: nginx-hpa
  template:
    metadata:
      labels:
        app: nginx-hpa
    spec:
      volumes:
      - name: influxdb-config
        configMap:
          name: nginx-hpa
      initContainers:
      - name: influxdb
        image: quay.io/influxdb/influxdb:2.0.0-beta
        volumeMounts:
        - mountPath: /etc/influxdb
          name: influxdb-config
        command:
        - influx
        args:
        - --host
        - http://influxdb.monitoring.svc:9999
        - --token
        - secret-token
        - pkg
        - --file
        - /etc/influxdb/buckets.yaml
        - -o
        - InfluxData
        - --force
        - "true"
      containers:
      - name: nginx
        image: nginx:latest
        volumeMounts:
        - mountPath: /etc/nginx/conf.d/default.conf
          name: influxdb-config
          subPath: default.conf
        ports:
        - containerPort: 80
      - name: telegraf
        image: telegraf:1.16
        volumeMounts:
        - mountPath: /etc/telegraf/telegraf.conf
          name: influxdb-config
          subPath: telegraf.conf
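Apply it, then check that the pkger init container completed and that both the nginx and telegraf containers are running — you should see 2/2 ready for the nginx-hpa pod (filename assumed again):
kubectl apply -f nginx-hpa-deployment.yaml
kubectl get pods --selector app=nginx-hpa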
Finally, let’s take a look at the HorizontalPodAutoscaler manifest that completes our demonstration.
We’ve added an annotation, metric-config.external.flux-query.influxdb/http_requests, that lets us specify the Flux query to execute in order to fetch the metric that determines whether this deployment should be scaled up. Our Flux query fetches the waiting field from our nginx measurement; a value greater than zero is a strong indicator that we need to scale horizontally to handle the current flow of traffic.
Our goal is to keep that waiting number as close to 0 or 1 as possible. We can also use another annotation, metric-config.external.flux-query.influxdb/interval, to define how frequently we want to check for traffic and scaling events. We’re going to use 5s intervals.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  annotations:
    metric-config.external.flux-query.influxdb/interval: "5s"
    metric-config.external.flux-query.influxdb/http_requests: |
      from(bucket: "nginx-hpa")
        |> range(start: -30s)
        |> filter(fn: (r) => r._measurement == "nginx")
        |> filter(fn: (r) => r._field == "waiting")
        |> group()
        |> max()
        // Rename "_value" to "metricvalue" so the metrics server can properly unmarshal the result.
        |> rename(columns: {_value: "metricvalue"})
        |> keep(columns: ["metricvalue"])
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metric:
        name: flux-query
        selector:
          matchLabels:
            query-name: http_requests
      target:
        type: Value
        value: "1"
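Apply the HPA, then throw some traffic at nginx and watch the replica count climb. Here’s roughly how I drive it with a port-forward and baton — treat the baton flags (URL, concurrency, request count) as a sketch and tune them to taste:
kubectl apply -f nginx-hpa-autoscaler.yaml
kubectl port-forward deployment/nginx-hpa 8080:80 &
baton -u http://localhost:8080 -c 10 -r 100000
kubectl get hpa nginx-hpa --watch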
That’s it! Easy when you know how, right?
If you want to explore this in more detail, or want to know more about monitoring Kubernetes with InfluxDB, please check out my examples repository, which has many more goodies for you to peruse.