How InfluxData Helps Docker Autoscale through Monitoring
Docker Swarm is a clustering and scheduling tool for Docker containers, embedded in the Docker Engine. With Swarm, IT administrators and developers can establish and manage a cluster of Docker nodes as a single virtual system. Gianluca will showcase how the TICK Stack can be a powerful solution for managing and monitoring your containerized environment, through Orbiter, a new open source project that uses InfluxData to collect metrics from Docker and to build and trigger autoscaling policies.
Watch the Webinar
Watch the webinar "How InfluxData helps Docker auto-scale through monitoring" by clicking on the download button on the right. This will open the recording.[et_pb_toggle _builder_version="3.17.6" title="Transcript" title_font_size="26" border_width_all="0px" border_width_bottom="1px" module_class="transcript-toggle" closed_toggle_background_color="rgba(255,255,255,0)"]
Here is an unedited transcript of the webinar "How InfluxData helps Docker auto-scale through monitoring." This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.

Speakers:
Chris Churilo: Director Product Marketing, InfluxData
Jack Zampolin: Developer Evangelist, InfluxData
Gianluca Arbezzano: Site Reliability Engineer, InfluxData

Chris Churilo 00:08.427 All right. Good morning, everybody. We will go ahead and get started on our webinar today. Today we are going over the Orbiter project, which is going to help you autoscale your applications in Docker containers. We've got two fabulous speakers today: Gianluca Arbezzano and Jack Zampolin. I'm going to go ahead and hand the ball over to Jack. Before I do, I just want to remind everybody that we are recording this session, and as soon as I can download the recording, I will post it so you can take another listen at your leisure. So without further ado, I will pass this over to Jack, and we'll get started.

Jack Zampolin 00:52.661 All right. Good morning, everyone. Today we're going to be talking about autoscaling in Docker Swarm with the TICK Stack. I'm going to give a brief little intro here, then turn it over to Gianluca to describe the Orbiter project, and then we've got a little more slideware and a demo for you as well. In the webinar today, we're going to discuss project motivations; explain Orbiter's options, configuration, and operation; talk through the sample application; do a quick demo; and then take questions. Also, while we're giving this presentation, if you have any questions, drop them in the Q&A box. We're going to try to answer those as we go along, and we'll get to any others at the end. So right here, I'm going to go ahead and turn it over to Gianluca to talk about Docker Swarm.

Gianluca Arbezzano 02:00.119 Hey, everybody. Okay. Let's start. I can't share my screen.

Jack Zampolin 02:07.405 Oh, okay. Let me toss the ball over to you, Gianluca.

Gianluca Arbezzano 02:10.652 Okay. Let's have a really quick introduction to Docker, and not Docker the engine, but Docker the orchestrator. You probably know that since Docker 1.12, we have had an orchestration framework embedded in the engine itself. Docker is growing; it's not just a way to spin up containers anymore. Now we can seriously think about using Docker in production, in an environment with more than one node, because SwarmKit, the library that Docker uses to provide this capability, is a completely separate open source project, and Docker simply embeds this library and exposes its API over HTTP, the CLI, and gRPC. Beyond that, they also added other features, like secrets management. Before, we had no good solution for managing secrets, which meant that injecting our API credentials, AWS credentials, or other tokens was complicated. We were probably using environment variables, which are not that secure, or other solutions like [inaudible]. Now we have this capability in Docker itself. They also added the stack and bundle features, which, if you know how Docker Compose works, are something similar but designed to work with Docker Swarm. The syntax is compatible: we can use a Docker Compose file to deploy services in the swarm, but they call it a stack.
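As a minimal sketch of the stack workflow described here, using standard Docker commands (the file and stack names are just examples):

```bash
# Turn the current engine into a single-node swarm (this node becomes a manager)
docker swarm init

# Deploy a Compose-format file as a Swarm "stack"; each service becomes
# <stack-name>_<service-name>, e.g. wikidiff_worker
docker stack deploy -c docker-compose.yml wikidiff

# List the services the stack created
docker stack services wikidiff
```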
Gianluca Arbezzano 04:18.161 What we are going to do today is give a presentation about Orbiter, which is an open source project to manage autoscaling, because in Docker, at the moment, there is no easy way to scale containers based on metrics or load; Docker doesn't have this kind of capability yet. It does expose some metrics, and we can get metrics from the containers, but in order to use them, we need to write something ourselves. What we are trying to do is make the work of operators less expensive, because we now work in really scalable, highly available environments, and this kind of procedure shouldn't be done by a human. From AWS autoscaling groups, we already know what autoscaling means, and we are just trying to have the same capability in Docker Swarm. Orbiter is an open source project. It's written in Go, and it exposes an HTTP API that you can use to apply scaling actions to Docker Swarm or other providers. It's designed to work with different providers: it's not just a Docker Swarm autoscaler, but has an abstract interface that can be used to implement other providers, like DigitalOcean, which is already implemented, or AWS, or other providers that we will probably add or are working on. It's designed to be deployable in highly scalable environments, because what it's trying to manage is applications that are growing. And it's really easy to start and run, in a containerized environment or not. We are using Docker in this example to start Orbiter. It needs to be started on a Docker Swarm manager. Docker Swarm has two kinds of nodes; you can think of a node as a machine, and there are managers and workers. The manager is the brain of the cluster: every read and update goes through the manager, and the workers just split the workload across the cluster. And all of them are redundant, meaning you can have more than one manager and more than one worker, and if some of them have trouble or problems, Docker Swarm automatically rebalances the number of containers to reconcile what you asked for with what you have.

Gianluca Arbezzano 07:50.859 In the example, I'm sharing the Docker socket, because Orbiter needs to communicate with the Docker engine, and I'm using an environment variable to tell Orbiter where Docker is; in this case, I'm using the Unix socket. Inside Orbiter itself, I'm using the official Docker SDK, so I have the ability to configure it with environment variables. If you know how Docker Machine works, you have probably already used something similar: when you run docker-machine env with the name of your server, you get a list of environment variables. It's the same in this case; you just need to set the right environment variables. Orbiter works in two different modes at the moment. One is auto-detection, which is implemented only for Docker Swarm, and it allows us to deploy Orbiter in a really fast and easy way. As you can see from the logs, I'm starting the daemon in debug mode, which means I get some extra information: you can see that Orbiter started in debug mode and also in auto-detection mode, and you can see that it successfully connected to the Docker daemon, which means our configuration is okay and it's running.
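Putting that together, starting Orbiter might look roughly like the sketch below. The docker run flags are standard; the image name gianarb/orbiter and the exact daemon flags are assumptions based on what is shown in the talk.

```bash
# Run Orbiter on a Swarm manager, sharing the host's Docker socket so it can
# talk to the engine; DOCKER_HOST tells the Docker SDK where to connect.
docker run -d \
  -p 8000:8000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e DOCKER_HOST=unix:///var/run/docker.sock \
  gianarb/orbiter daemon -debug   # image name and flags assumed from the demo
```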
Auto-detection also found a service called wikidiff_worker, and it has up 1 and down 1. Up and down are the number of containers to spin up when we are growing, or to remove when we need fewer containers. And the API is listening on port 8000.

Gianluca Arbezzano 10:00.021 We talked about up and down, and you can modify these values, because it depends on your kind of traffic and what kind of application you have, but usually you need to do some tuning around these two values. And you can do that with labels; you can do everything with labels. If your service in Docker Swarm carries the label orbiter=true, Orbiter is going to auto-detect the service itself, which means Orbiter takes care of that service [inaudible]. And if you override orbiter.up and orbiter.down, you change the values themselves. That's an example. I don't know how many of you know something about Docker Swarm. We already talked about nodes; there is another concept, called a service, and you can think of a service as a collection of containers. In Docker Swarm, containers are not called containers but tasks, because the idea behind SwarmKit is that Docker is just one of the providers for this orchestration framework; probably in the future, we're going to see other projects using SwarmKit. So services are just collections of containers that come from the same image, and in my case, it's the nginx image. What we really do when we ask Orbiter to scale up and down is just change the number of tasks in the service itself.

Gianluca Arbezzano 11:54.175 There is another way to use Orbiter: with a configuration file. As I said, at the moment auto-detection is only supported on Docker Swarm, because, for example, for DigitalOcean I haven't figured out how to do auto-detection yet. That's why we have a YAML configuration. You can see an example here: you can have more than one autoscaler. Mine is called events. Every autoscaler has one provider, in this case swarm, and a bunch of parameters. In this case, I don't have parameters because I don't need to specify configuration, but for DigitalOcean you would use these parameters, for example for the access key. And then there are the policies; you can have more than one, and they contain what to actually do with a bunch of servers, or, in this case, a bunch of tasks: up and down. As for how it works: Orbiter exposes an API, and you can just use curl to scale up and down. There are two modalities. One is a POST, and you can see that there is a route, handle, with autoscaler and service. The autoscaler is the autoscaler name, and the service is the service name. If we come back to the configuration, events is the name of the autoscaler and worker is the name of the service. It means that, in this example, if we fill in the autoscaler and service and send {"direction": true}, we are asking Orbiter to create new tasks for that autoscaler and service combination. Or we can add another parameter at the end of the path and use up or down; that way, we don't need to specify a body for our request. It depends on the use case that you have. For example, with the previous configuration, in that last command, the action is for events and worker: what I'm doing is scaling up the worker service.
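To make the two modes concrete, here are two hedged sketches. First, auto-detection: a service opts in with the three labels described above (the service and image names are just examples):

```bash
# orbiter=true opts the service in; orbiter.up / orbiter.down override how
# many tasks Orbiter adds or removes per scaling event
docker service create \
  --name wikidiff_worker \
  --label orbiter=true \
  --label orbiter.up=3 \
  --label orbiter.down=2 \
  nginx
```

Second, the configuration file and the curl calls. The field layout and the route are reconstructed from the description above, so treat them as assumptions rather than the project's exact schema:

```yaml
# orbiter.yml (sketch): one autoscaler named "events" on the swarm provider
autoscalers:
  events:
    provider: swarm
    parameters: {}      # e.g. access keys, for providers like DigitalOcean
    policies:
      worker:           # the service this policy controls
        up: 2           # tasks to add on a scale-up request
        down: 1         # tasks to remove on a scale-down request
```

```bash
# Scale by POSTing a direction in the body (route reconstructed from the talk)...
curl -X POST -d '{"direction": true}' http://localhost:8000/handle/events/worker

# ...or append up/down to the path and skip the body entirely
curl -X POST http://localhost:8000/handle/events/worker/up
```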
Gianluca Arbezzano 14:31.501 Okay, I'm going to show you something here; let me make this a bit bigger. I have my Docker Swarm cluster up and running, five nodes, and one is the manager, the leader; you can see the star there. And what I'm doing now... I can show you that I have a bunch of services up and running. What is important to see now is wikidiff_orbiter, which is Orbiter itself. We can... [silence]

Gianluca Arbezzano 15:38.945 We can see some logs here. Orbiter started in daemon mode with auto-detection, and it detected wikidiff_worker. Why does it detect wikidiff_worker? Because wikidiff_worker is a service here, and if we have a look at this service, we can see where it's enabled: these three labels, orbiter=true, orbiter.down=2, and orbiter.up=3. Those are the three labels Orbiter is looking for, and that's why it attached to the new service. Now I'm going to split my monitor. In this part, I'm going to tail Orbiter's logs to see what happens, and from here, I'm going to make a call. First I try to scale down, but Orbiter tells me that it's not going to apply this action, because the minimum number of tasks is already there. We can try to scale it up, and we can see that we have no [inaudible]. In the other part of my terminal, we have a log telling us that the wikidiff_worker service scaled up, from one to four, because, if you remember, the orbiter.up label was three. That's why we go from one to four, and we can exit here. If we look again, we can now see that our wikidiff_worker has four replicas. If I make another call, this time down, we can see that it drops by two, because orbiter.down was two. That's just an easy introduction to how it works, but in this modality, we still need somebody in front of the laptop doing everything by hand. We want to avoid that. I'm going to toss the microphone to Jack, because we have something for this. Okay.

Jack Zampolin 19:09.987 Nice. Thank you, Gianluca. So, as Gianluca was saying, you're not going to want to be creating curl commands by hand. What you would like is some sort of automated service to kick off this autoscaling. As we said earlier: how am I going to scale my services, at either the container or the infrastructure level, based on metrics I care about, like requests per second or the amount of memory in my cache? In order to do that, you need to hook your monitoring into your infrastructure, and the TICK Stack that InfluxData produces offers a number of excellent tools to do this. We're going to be using all of the elements of the TICK Stack in this demo, and I'm just going to go through them quickly. The first one is Telegraf. It does metrics and log collection for the TICK Stack. Here it's going to be gathering host-level metrics and polling the Docker daemon for Docker-specific metrics: CPU and memory utilization for each container, network ingress and egress, that type of thing. And then it's also monitoring RabbitMQ, so stuff like queue depth and the number of messages that that Rabbit instance is handling, and we're going to be triggering alerts and scaling events based on that. InfluxDB is obviously the data storage layer for this; it's going to be catching all that Telegraf metric data. For this demo, we're using an instance hosted in InfluxDB Cloud, so we're not actually running that infrastructure. Kapacitor is the alerting and data processing engine for the stack. It's responsible for sending the autoscaling message in the demo, and it's also hosted on InfluxDB Cloud.
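Sketched as a Telegraf configuration, the collection side Jack describes might look like this; inputs.docker and inputs.rabbitmq are real Telegraf plugins, while the URLs and credentials below are placeholder assumptions:

```toml
# Host-level metrics (cpu, mem, disk, ...) come from Telegraf's default system plugins.

[[outputs.influxdb]]
  urls = ["https://myinstance.influxcloud.net:8086"]  # hypothetical InfluxDB Cloud endpoint
  database = "telegraf"

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"  # per-container CPU, memory, network I/O

[[inputs.rabbitmq]]
  url = "http://localhost:15672"            # RabbitMQ management API
  username = "guest"
  password = "guest"
```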
And then along with this demo, we're also going to be showing Chronograf, the new Chronograf. It has some key dashboards for various Telegraf plugins, as well as a Kapacitor user interface. And then, as Gianluca went over earlier, the sample application we have has three components: the wikidiff_handler, the RabbitMQ instance, and the wikidiff_worker. The wikidiff_handler simply subscribes to the Wikipedia recent-changes feed. This is just an event stream, sent over Socket.IO, of any changes that happen on any of the Wikipedia pages. So whenever edits are made, whenever new pages are created, when people are having moderator wars, you get all of those events. The handler simply handles the incoming message and sends the URL of the page that was edited to RabbitMQ. RabbitMQ here is a high-performance queuing system. It also has a native Telegraf integration, as do a lot of other queuing systems, so Telegraf can watch it on both the consumer and the producer side. And then the final piece of infrastructure here is the wikidiff_worker. This worker is meant to simulate high-CPU-load tasks. It pulls the URL that was sent by the handler off the queue, navigates to the page, screenshots it, and then stores the resulting image in S3. And the final element of the demo, which Gianluca went over extensively earlier, is Orbiter itself.

Jack Zampolin 22:47.898 So let's go ahead and look at a walkthrough of this demo, just to visualize it. For me personally, having the diagram does help me visualize things. At the base, we have our hosts, and then Docker Swarm running on top of them; Gianluca just showed you those. Each of those hosts is running a Telegraf instance. That Telegraf instance is polling the Docker daemon, a number of metrics from the host, and also RabbitMQ. We have our two cloud instances, InfluxDB and Kapacitor. And then there's the handler, RabbitMQ, and worker flow, with the Wikipedia changes feed coming in. The final piece is Orbiter. Telegraf will be continually sending data to InfluxDB, and any data that Influx receives is forked over to Kapacitor as well. When the RabbitMQ feed is growing, Kapacitor will detect that change, and once it trips the threshold, it will send a POST request to Orbiter: a POST with whatever application we're looking to scale, and then up. And that will scale the number of workers handling those messages on RabbitMQ to handle the load.

Jack Zampolin 24:24.279 So let's go ahead and do the demo. I'm going to walk through the application in Chronograf very quickly and then hand it over to Gianluca. This is the home screen of Chronograf here. It shows all of our different hosts, and then system-level statistics for each of those. You can see very simple stuff: CPU usage, memory, disk used. And then we also have RabbitMQ statistics: the number of consumers, exchanges, and queues, how many messages are getting published and delivered every second, and any unacked messages. There's also a data explorer to add your own queries, and we can go into that later if anyone's interested. And then as far as alerting, there is a connected Kapacitor instance, and you can create new TICK scripts here. Databases and users, you're able to change those as well, and data sources. So let's go ahead and see the whole thing working. I'm going to go ahead and pass it over to Gianluca.
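For anyone following along at home, the cluster state shown in the demo can be inspected with standard Docker CLI commands; the service name wikidiff_worker matches the demo, but is otherwise just an example:

```bash
docker node ls                                   # managers and workers; '*' marks the current node
docker service ls                                # services and their current replica counts
docker service ps wikidiff_worker                # the tasks (containers) behind one service
docker service inspect --pretty wikidiff_worker  # shows labels, including the orbiter.* ones
```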
Gianluca Arbezzano 25:50.950 Yes, thank you. Yeah, what we are trying to do: when you are looking for scalability, when you are trying to scale your application, you first really need to understand how it behaves. Containers are really good tools for that, because cgroups help you describe how many resources you need; you can just ask for what you need, and you can deploy more than one container on the same server easily and in a safe way. What I'm going to show you now is the TICK stack that I spoke about before. I call out the docker-compose file just to show you that it's really compatible with docker-compose. We just need to specify a new version, version 3.0. I'm using 3.1, which supports new capabilities, but from 3.0 onward you are able to use a docker-compose file to deploy services in Docker Swarm. And if you have experience with docker-compose, it's going to be really, really simple for you to understand what I'm doing. I'm just deploying my services. The first one is orbiter: it exposes a port, it shares the Docker socket as I showed before, it uses an environment variable, and it starts with the command daemon -debug. I'm also saying that it depends on the worker, because it needs to be deployed after the worker in order to detect the new service. There is a new section here, too, called deploy; that is the one Docker Swarm uses to understand what to do with your container or your service. For placement, you can use different kinds of options. In this case, I'm using node.role, which means that what I'm saying is that Orbiter needs to be deployed on a manager. If there is more than one manager, it's going to pick one of them, but this makes sure that my Orbiter is not going to be deployed on a worker. I need it on a manager because I'm going to apply actions, write actions, to Docker Swarm: I'm changing the number of tasks, and I cannot do that from a worker. And replicas is the number of tasks to deploy, in this case one. I'm doing the same for RabbitMQ, but as you can see, the placement is different. I'm not using information from the node role; I'm using information from the engine itself. In this case, I'm using labels. As you probably know, you can configure the Docker daemon with different labels, and in Docker Swarm they are really important, because you can use them to orchestrate your containers. Let's say, for example, that you have machines with different distributions, and you have a continuous integration server and need to run your container on all the distributions to test your code in different environments. You can do that with labels. In this case, I'm just using the label rabbit, because I need to have my container always on the same server: I don't have distributed storage, so if my Rabbit gets orchestrated onto different nodes, I'm going to lose my data. That's why I'm using this label here. The handler is pretty much the same; it depends on RabbitMQ, because I need to have Rabbit up before the handler starts. And there are other deployment options specific to Swarm.
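A condensed sketch of the stack file being described: the structure (deploy, placement, resources, secrets) is standard Compose v3, while the image names, the rabbit label, and the numbers are assumptions based on the talk. The resources and secrets keys are explained next.

```yaml
version: "3.1"
services:
  orbiter:
    image: gianarb/orbiter          # assumed image name
    command: daemon -debug
    ports:
      - "8000:8000"
    environment:
      - DOCKER_HOST=unix:///var/run/docker.sock
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - worker
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager    # must run where it can write to Swarm
  rabbitmq:
    image: rabbitmq:3-management
    deploy:
      placement:
        constraints:
          - engine.labels.rabbit == true   # pin to the node that holds the data
  worker:
    image: wikidiff/worker          # hypothetical image
    secrets:
      - aws_credentials             # e.g. AWS credentials for S3
    deploy:
      resources:
        reservations:
          memory: 50M               # soft: used when scheduling the task
        limits:
          memory: 100M              # hard: exceeding this kills the container
secrets:
  aws_credentials:
    external: true
```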
As I said before, you need to understand how your application behaves, because you need some data in order to understand how quickly you need to scale up and how quickly you need to scale down. We said that there are both up and down values because they can be different. You can, for example, scale up really, really fast, because your traffic is growing and you need more capacity, and scale down slowly, just to be sure that everything keeps working fine; that's why you usually have an up value greater than the down value. And you can specify limits and reservations. The difference between limits and reservations is that Swarm uses the reservation to place your container on a node that has those resources available, and it's a soft limit: you can go above it, up to the hard limit. But when you go over the 100 megabytes of RAM, for example, Swarm is going to kill your container and restart a new container again. Limits are hard, and reservations are just soft limits. And that's pretty much all we need to know from here. Oh, we also have secrets, because our worker is pushing images to S3, and that's why I added a new secret in Docker Swarm that contains my configuration, my credentials file, the AWS credentials. And that's pretty much it from this side.

Gianluca Arbezzano 32:41.830 I'm going back to Chronograf itself, and I'm taking some graphs from here. You can see that we have the number of containers and the number of images deployed, the state of my containers, how many of them are running on a specific node, how many are [inaudible], for example, and we have block [inaudible], a really good amount of information. What I'm looking for is RabbitMQ, because what I'm trying to do in this example, with this application, is spin up more workers when we have more messages. That's something you usually want to do when you have a queue: when you are receiving more information to process, in my case, when a lot of people are changing stuff on Wikipedia, I'm interested in keeping my application up to date, and if they are changing more, I need more workers if I can have them. For example, a good metric can be the number of published messages per second and the number of messages delivered per second. Published is when my handler gets new messages; delivered is when my worker processes one message. What I'm really interested in, the behavior I want, is for this line to stay pretty much stable. Let's try to do that with Kapacitor, because what I'm trying to do is have something that follows my metrics and alerts me when something is wrong. And it's not going to be an operator that catches these metrics, it's not going to be a PagerDuty alert; it's going to be a POST alert that triggers the same action that we saw before with curl, but from Kapacitor, automatically, in order to allow Orbiter to scale up and down. Because in this case, Kapacitor knows what I want to have, and it's just going to ask Orbiter to do it for me. I can use the data explorer, and this is the query that I'm using for alerting: the message publish rate. When the message publish rate grows too much, I'm going to trigger an alert and ask Orbiter to scale up. When my rate is going down, I'm going to ask Orbiter to remove containers, because I don't need all of them. We can see that there are a lot of really different spikes here. And I can show you my Kapacitor here, my TICKs. To do that, I'm going to run kapacitor list tasks. I have the two of them.
The first one is the TICK that alerts Orbiter to scale up, and rabbit-down is the other one, the opposite one. The other tasks are all disabled, because I need a stable environment. I can show you kapacitor show; that's the TICK. I don't know how many of you have seen one of these before, but we can just look at both of them to understand; they are really similar. You need to specify a stream, in my case. It's a stream because I'm following all the metrics Kapacitor is receiving from Telegraf. And you need to write your query. As you can see, I have a bunch of parameters here: the DB that I'm using; the measurement, I'm using the RabbitMQ measurement; and the period, I'm asking Kapacitor to look at my series for 10 seconds every minute. And the alert fires if the message publish rate is more than the number I'm waiting for, which is three in my case. Because, like I said, you need to know how your application behaves before trying to autoscale it; if you don't know how it behaves, you cannot autoscale it really well. And what I can see from this graph is that usually I have some spikes at a publish rate of two or four, but the rest of the time it's pretty much quiet. In that situation, I'm not really interested in starting a new worker, because I'm happy enough. What I'm looking to do is start something when these metrics are not like that anymore, because I'm receiving more changes.
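Reconstructed as a sketch, the scale-up task might look like the TICKscript below. The stream/window/alert pipeline and the .post() node are standard TICKscript; the measurement and field names (rabbitmq_overview, messages_published_rate), the threshold, and the Orbiter URL are assumptions pieced together from the talk.

```
// rabbit-up.tick (sketch): watch the publish rate, ask Orbiter to scale up
stream
    |from()
        .database('telegraf')
        .measurement('rabbitmq_overview')        // assumed measurement name
    |window()
        .period(10s)                             // look at 10 seconds of data...
        .every(1m)                               // ...once per minute
    |mean('messages_published_rate')             // assumed field name
        .as('publish_rate')
    |alert()
        .crit(lambda: "publish_rate" > 3.0)
        // same effect as the manual curl call, but automated (route assumed)
        .post('http://orbiter:8000/handle/events/worker/up')
```

The scale-down task described a bit later is the mirror image: a threshold of 2, an .every(4m) interval so scale-down reacts more slowly than scale-up, and a POST to the down endpoint. Tasks are then managed with the kapacitor CLI:

```bash
kapacitor define rabbit-up -type stream -tick rabbit-up.tick -dbrp telegraf.autogen
kapacitor enable rabbit-up
kapacitor show rabbit-up    # points processed, alerts fired
```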
Gianluca Arbezzano 38:37.473 Now I'm just going to tail Orbiter's logs; this way, we can watch whether something is happening. I'm also going to enable my TICK. From now on, my TICK is going to wait for metrics. Now, I don't know why it's pending here. Oh, I can see that it's probably just a networking blip, because I already received a scale down. kapacitor list tasks. [silence]

Gianluca Arbezzano 40:02.837 Okay, probably the connection is just having some trouble. Okay, now it's enabled, and the real problem now is that I cannot ask all of you to go to Wikipedia and change something, which means I need to figure out a way to change the behavior myself. So what I'm doing is scaling, not the worker, but the handler. In this way I am, in practice, simulating that I have more events to process. [silence]

Gianluca Arbezzano 41:03.368 What I'm waiting for is for Telegraf to send me new data, and we'll see. My expectation is that the number of messages in my queue will grow, because I have more handlers taking messages from the socket, and when... okay, we have a spike here, that's good. My expectation is that when Kapacitor has processed a good amount of metrics, it's going to ask Orbiter to scale. And you can see that the number of workers is now two, and they are both working. With the command kapacitor show, you can try to figure out what Kapacitor is doing. You can see that it has processed 44 points at the moment and hasn't triggered an alert yet, probably because it didn't catch the spike yet. And in one more minute at most... oh, that's it. Orbiter just received a new alert from Kapacitor, and you can see that it received a new request to scale and is scaling the worker from two to five. And if we have a look here, we can see that my TICK triggered one alert, because my spike is still a good spike. It means people are really working on Wikipedia right now.

I can put the handler's replicas to zero; it means that now, in effect, nobody is changing anything on Wikipedia anymore. That's not real, but you know, I'm not getting messages anymore. In the meantime, I received another TICK alert, because the spike is still there. Now it's going down, but it probably detected again that I need more workers, which means it spun up another three, from five to eight. And if we have a look at the number of services, we can see that we have eight workers up and running, processing messages. This graph is really quiet now, and my expectation is that we are going to receive an alert to scale down containers, because we don't need them anymore. Let's see what the other TICK is doing. Oh, the other TICK is disabled; that's why I could wait forever. Well, I can just show you the other one, just to give an idea of the behavior of the scale down: show rabbit-down. We can close this one, because we saw that we can receive the alert. It's pretty much the same; I'm just using different criteria. It's not three, it's two, and the period is also different. I'm asking Kapacitor to evaluate my metrics for 10 seconds, but every 4 minutes, because, as I said, if I'm getting a lot of events, I'm still happy to have some workers around to work through the whole queue, and I only need fewer workers later, when I don't have all those messages anymore. And it's going to do the same. The TICK works the same, but the threshold is obviously the opposite, and the POST that it sends to Orbiter is not up, it's down.

Gianluca Arbezzano 46:06.285 And that's really it for what I wanted to share about autoscaling; I hope that you enjoyed this part. If you have any questions, or if you want to contribute, the project is open source. I'm just going back to my slides for the final one. We put together a list of links. The first three are really something you can use to understand how InfluxDB and Docker work. The first one contains all the key concepts of Docker Swarm. The second one is an article that's more like, okay, let's play with Docker Swarm. And the third one is an overview of the whole ecosystem that Influx has. The other three links: when I was working on this project, when we were trying to put together this demo, I found that a lot of stuff was in progress, things I would have liked to work on. I saved all these links, and I'm happy to share them with you. They are really interesting. They are three issues, and if someone is interested in converting them into [inaudible], it's going to be fun if you're happy to do that. And that's it from my side.

Jack Zampolin 47:57.557 Awesome, Gianluca. That looked great. Very, very cool demo. So we'll be on here for another few minutes, happy to answer questions. If anyone has any questions for Gianluca or me, we're happy to answer.

Chris Churilo 48:13.257 All right. We've got a question from Christopher Well. Okay. He says: thank you for the demo. Are there any step-by-step instructions that would allow us to build the demo and play with the process?

Gianluca Arbezzano 48:27.956 Yes. Everything is open source, and I can send links in the chat. Do I have chat here? Yeah. A few links: one is the wikidiff application. It contains the stack that I showed you and a bunch of Dockerfiles to create your containers, and it also contains the worker and the handler. You can have a look at the code here.
And well, obviously the other one is Orbiter itself. There is already a public Docker image published, so you can use that or build your own.

Jack Zampolin 49:18.050 Thanks. Thank you, Gianluca. Also, one more question. Sotorrios, I'm butchering that, I'm terribly sorry. Nice work. Do you have any recommendation for a TICK script editor or debugger? Chronograf does come with a built-in TICK script editor, and that'll do a pretty excellent job of getting you most of the way. You'll be able to say: I want this data, and I want it in this way. And from there, those TICK scripts are pretty easy to edit manually. The ones that Gianluca has been showing throughout this presentation were generated that way. There's also TICKscript syntax highlighting for a couple of common IDEs, which generally helps people. We do have a number of examples in the Kapacitor repo itself, as well as on the blog. And we're always happy to answer questions over at community.influxdata.com if you have any issues. Does that answer your question? Or are you looking for something else? Awesome.