Meet the Experts: InfluxDB Product Update
Session date: Dec 17, 2020 08:00am (Pacific Time)
Learn more about InfluxData’s time series platform. InfluxDB 2.0 OSS is generally available, and since launch, we have made updates to the product.
Join Tim Hall, VP of Products, as he demonstrates the latest features in InfluxDB 2.0 Open Source.
[et_pb_toggle _builder_version=”3.17.6” title=”Transcript” title_font_size=”26” border_width_all=”0px” border_width_bottom=”1px” module_class=”transcript-toggle” closed_toggle_background_color=”rgba(255,255,255,0)”]
Here is an unedited transcript of the webinar “Meet the Experts: InfluxDB Product Update”. This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
Speakers:
- Caitlin Croft: Customer Marketing Manager, InfluxData
- Tim Hall: VP of Products, InfluxData
Caitlin Croft: 00:00:04.456 Hello everyone. Once again, welcome to today’s webinar. My name is Caitlin. I’m very excited to have Tim Hall, our VP of Products here to provide an update on InfluxDB. Please feel free to post any questions you may have in the Q&A box, and we will answer them at the end of the webinar. Without further ado, I’m going to hand things off to Tim.
Tim Hall: 00:00:28.495 All right, thanks, Caitlin. Welcome, everybody. Happy Holidays. I know we’re getting close to the end of the year. Hopefully, that means we’ll get more folks joining as things quiet down for many of you. So today, we are going to talk through a bunch of things. But first and foremost, InfluxDB 2.0, the open-source edition, is now generally available. We actually announced our intention to do this at InfluxDays, the European virtual session, in June. And then we completed our extended alpha and beta program on November 10th. We also got a chance to do that at InfluxDays, the North American virtual sessions, and sort of go through that.
Tim Hall: 00:01:11.756 Now, we’ve just released our second maintenance release, which is called 2.0.3, which includes expanded support for ARM64. We definitely heard strong requests to get that packaging together. And we’ve also expanded the packaging support for Debian and RPM builds to include some assistance and conflict detection so that you don’t accidentally install on top of an existing Influx instance, because there are some steps that you’ll need to take to go through the upgrade process. And of course, we’re getting great feedback from the community, including minor defect reports, and some of those are being addressed. And also, of course, we’re always getting enhancement requests coming in as well, so definitely use our community channels for that. If you don’t know where those are, I’ve got some links at the end of this to highlight where to do that.
Tim Hall: 00:02:10.660 So without further ado, let’s go into our 2.0 offering. We are still dedicated to providing a platform for real-time visibility into all the things that you want to manage and run and monitor. And I’m not sure why my font got screwed up here, but essentially: calculate, analyze, and act on all kinds of data, and the data is anything with a timestamp. And to do that, one of the questions we get asked is what’s really different between the 1.X edition of the TICK stack and InfluxDB 2.0. First and foremost, if you look at this architecture diagram, on the left-hand side you’ll see, for those of you familiar with the TICK stack, the T, I, C, and K: Telegraf, InfluxDB at the core, Chronograf for visualization, and Kapacitor for batch and stream processing.
Tim Hall: 00:03:03.946 And what we’ve done with 2.0 is we’ve attempted to collapse the three things that you see on the right side of the TICK stack there into a single binary, meaning there is a visualization engine. There’s ability for you to create and run batch tasks. We’ve unified the language that you use for both of those activities, which we call Flux, and I’ll talk about that too in a minute. But effectively, what we’re trying to do is simplify the number of moving parts, make sure you have fewer binaries that you need to download and run, and yeah, just make it easier to use. Our time to awesome mantra still holds, and hopefully, you’ll get a sense of that as I go through the demonstration a little bit later today.
Tim Hall: 00:03:47.250 Telegraf remains our most popular open-source project. It’s used by so many people, and we’ve got such a large developer community. It’s used by competitors. It’s used by ourselves, obviously, to collect and gather data, but it’s so popular, and we appreciate everybody’s contributions and activity in building that distributed collection agent. In terms of the offerings, here’s where we are. We actually launched InfluxDB open source, the alpha, two years ago in December. And I remember, just before the holidays, getting my first build from the engineering team of the very first alpha and walking through the setup process. We then turned our attention to building InfluxDB Cloud and delivering that across all three major cloud service providers: Amazon, Google, and Microsoft Azure. And we launched that in September of 2019, and we’ve been running that now for just over a year, and we’re super excited. We’ve crossed over 20,000 organizations signed up to use that service. And obviously, we’ve continued to roll it out to different regions and harden it based on the fun operational challenges that you always see in terms of running these things at scale.
Tim Hall: 00:05:15.472 And as we’ve done that, we’ve advanced the open-source bits as well, meaning the cloud deployment is sort of a set of microservices that are deployed across a large sort of Kubernetes backplane. But the capabilities, we continue to sort of boil down into that single binary and advance every few weeks through the alpha program and through beta. And so the bits have been tested, and the bits are being used in anger by many folks that are trying to monitor their stacks, systems, and sensors. And we finally got to the point where we were ready to pull the wraps off the open-source bits. And then, next up on our list, we’ll turn our attention to InfluxDB Enterprise 2.0, based on the announcements that we made at InfluxDays. If you want to go and take a look, there’s Paul Dix’s talk on what we call IOx, which will be the new core of the platform and will allow us to get all the capabilities that you’ll see today into a distributed footprint and more. And so definitely check out that talk.
Tim Hall: 00:06:25.507 Now, one of the things that we’re going to try to do is we’ve got a common API across all editions so that if you’re a developer, and you’re sort of developing against the local open-source edition, and then you reach a point where you’re like, hey, I want to scale this out, you can definitely change the endpoint of your application and point it at the cloud instance. And then in the future, obviously, if you want to manage it yourselves, which actually fewer and fewer of you are doing these days, it seems like there’s definitely a groundswell of adoption of platform-as-a-service offerings like InfluxDB Cloud. But if you choose to manage it yourself, run it yourself, Enterprise will be available, and you’ll be able to use that as well.
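To illustrate the "change the endpoint" idea, here is a minimal Python sketch. It assumes the v2 write path `/api/v2/write` with `org`, `bucket`, and `precision` query parameters; the cloud host name is a placeholder, and the org and bucket names are borrowed from the demo purely for illustration.

```python
from urllib.parse import urlencode, urlsplit


def write_url(base_url, org, bucket, precision="ns"):
    """Build the v2 write URL for a given InfluxDB base URL."""
    if not urlsplit(base_url).scheme:
        raise ValueError("base_url must include a scheme such as http:// or https://")
    query = urlencode({"org": org, "bucket": bucket, "precision": precision})
    return f"{base_url.rstrip('/')}/api/v2/write?{query}"


# Hypothetical endpoints: a local OSS instance and a placeholder cloud host.
LOCAL = "http://localhost:8086"
CLOUD = "https://cloud.example.com"

local_url = write_url(LOCAL, "thallorg", "telegraf")
cloud_url = write_url(CLOUD, "thallorg", "telegraf")
print(local_url)
print(cloud_url)
```

Only the base URL differs between the two calls; the path and query string stay the same, which is the point of a common API.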
Tim Hall: 00:07:08.639 Now, to feed data in, there are lots of different mechanisms. Telegraf, as I mentioned, is a packaged, distributed agent that you can use and deploy. It’s firewall friendly. But we also have 10 supported client libraries in different languages. Those make it easy to develop, and more are being built as we go along. And of course, you can just consume the native API if you’re a REST expert. Obviously, the open source is free. There is a free tier in the cloud as well, with some specific rate limits on number of dashboards and tasks and how much data you can feed in, and some cardinality limits. And then there’s obviously a way to commercially purchase and remove many of those restrictions, or at least move them up and then request limit increases if you hit the limits that we put in place.
Tim Hall: 00:08:02.101 So in terms of InfluxDB 2.0, again, I mentioned a single binary for all time series and a common API for everything. This covers building dashboard cells, user creation, and feeding data in. The query and the write path is sort of the place where we started, but literally the API for any activity you could do, defining tasks, etc., is all consistent. Our new language for working with data, Flux, which we’ve been working on and will continue to work on in terms of feature breadth and performance, is available and so much more powerful than anything we used to offer, and it should allow you to write significantly less code when you’re building applications on top of the platform. And bringing all the pieces together allows us to have a framework for offering composable solutions. And I’ll talk a little bit more about what that is and what that looks like and where you can see how people are sharing their expertise based on things that they’ve built on top of the platform.
Tim Hall: 00:09:08.124 Flux, as I mentioned, our new language for working with data, it’s really designed for working with time series data, but it’s not limited to that, meaning we’ve made it extensible. We actually can query across 14 different data sources today including [inaudible] and Athena, SAP HANA and more and more are certainly coming. And so we’ve made it also open source, so people can take that language itself and the engine and run with it in different environments if they wish. But it is the primary language for working with data, both for interactive query and tasks as part of InfluxDB 2.0. And it is relatively easy to get started with. There definitely are some basic syntax and learning curve challenges, but I’ve been amazed at some of the Flux scripts that I’ve seen, our community members create and write and ask questions about even in the early days. So it seems like people are able to understand the structure of the language and how to really take advantage of the power that we’ve placed there.
Tim Hall: 00:10:19.133 So today, without further ado, let’s get out of slides. I’m going to walk through a brand-new setup of InfluxDB 2.0 and what that experience is like, and I’m going to use the Docker image to do that. The Docker image for upgrade, if you’re already on the 1.X TICK Stack, is not quite ready for you yet. So I recommend, if you want to just explore the capabilities, you can do that. So grab the Docker image we have - and specifically for that reason the Docker image is in a Quay repo, not in the standard Docker Hub location, as we work through making sure that the upgrade experience is good for all of you from the 1.X side. But it is fully featured and functional, so grab it and get going. So we’ll stand that up, we’ll walk through the UI, we’ll work through getting some initial data in, and we’ll create a new bucket and our first task and show you how that all works.
Tim Hall: 00:11:15.095 And then I’m going to transition the experience of the upgrade. So I know many of you are existing users of InfluxDB 1.0, and I want to show you what it looks like to go through that upgrade process, show you the various log outputs that are there. Because we do offer support for backward compatibility for existing capabilities, we’re going to look at something called the DBRP mapping through the CLI. This is what allows you to issue InfluxQL queries to InfluxDB 2.0, InfluxQL being the primary language that you worked with in the 1.X line, and so that allows for an in-place upgrade, including to continue to allow your dashboards to work, whether you’re using something like Chronograf or perhaps, many of you using Grafana because of the multi data source capabilities. And I’ll show an example of how that works.
Tim Hall: 00:12:08.145 We’ll talk a little bit about the security setup and the CLI tooling. There are expanded CLI tools that are out there. We’ll create an alert and show you how the checks and notifications subsystem works for alert creation. And then we’ll get into how to share your expertise through InfluxDB templates, that composable solutions framework that we put in place, and how easy that is. And then, time willing, we’ll look at variables and some advanced Flux, just things you couldn’t do in InfluxDB 1.0.
Tim Hall: 00:12:42.921 So with that, I am going to show you my existing setup. Actually, this is for the upgrade session, so let me come back to that. Let’s just go right to where we would get the bits for this. So if you go to our downloads page, you’ll see that InfluxDB 2.0 is available now, and you’ll see Version 2.0.3 here. If I click on that button, you’ll see all of the different images that we have available. The location of the Docker image, again, as I mentioned, is on the Quay repo. If you’re on Darwin and macOS, you can just grab it here. There is a separation now if you want just the command line interface, for those that are adopting cloud. If you want to work with the command line interface, you can just grab that. You don’t have to grab both the command line interface and all the bits. And then you’ll see here the packages for Ubuntu, the Deb packages, the RPM packages, etc., and generic Linux packages. And then also, just make sure you keep in mind whether you’re grabbing the generic package or the ARM bits. Again, ARM support is for 64-bit only. And so you’ll need an updated ARM device.
Tim Hall: 00:13:54.500 So I’ve already grabbed the Docker image. I’ve already pulled that down. And so now the question is, where do I go from here? So I have the bits. And to me, docs are your friend, right? We spent a lot of time overhauling the documentation experience. If you have not checked out our new docs landing page or the new documentation, I highly recommend that you do that. If you’re a TICK stack 1.X expert, you’ll notice all of the documentation has been moved here so you can still access it. Its look and feel have been refreshed, but we’re obviously highlighting the experience of InfluxDB 2.0.
Tim Hall: 00:14:38.481 Now, one of the things that we’ve done that’s a little bit special is we’ve added this little where are you running box and when you just check which thing you’re working with, it will then automatically fill in the URLs and all the examples that show up, so that it’s easier to cut and paste those and start working with them. So if you had a custom setup like today, I’m going to start with localhost 9999, I could put that in here, and all my examples will show up like that.
Tim Hall: 00:15:11.378 Let’s get into the docs. Again, if you’re just getting started and you’re brand new with InfluxDB, definitely tap into the Get Started section before exploring out, and again, we’re going to be using the Docker image today for our new install. And so the very first thing, I’m going to execute this Docker run command to get this thing going. So without further ado, we’ll go over here, and I’ve pasted this. And now one of the things I’m doing that you may find a little bit strange is I’m going to map the port that this image uses. I’m going to use 9999 on the outside of the container, and then on the inside of the container, it’s using 8086. Now, you may wonder, “Why is Tim doing that? That’s not in the instructions.” It’s because I want to show you the upgrade process. And since I have a live instance of InfluxDB 1.8 running, and it is running on port 8086, I’m doing this to avoid port conflicts. And again, if you are running InfluxDB today and you want to just explore and kick the tires of 2.0, here’s a great way for you to just explore that using the Docker container.
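The port mapping described here can be sketched in Python as a small helper that assembles the `docker run` arguments. The ports (9999 outside, 8086 inside) come from the talk; the image reference is an assumption and should be checked against the downloads page, and the exact flags used in the demo are not shown in the transcript.

```python
import shlex


def docker_run_args(image, host_port, container_port=8086):
    """Return argv for `docker run`, publishing host_port -> container_port."""
    return ["docker", "run", "-p", f"{host_port}:{container_port}", image]


# Assumed image reference; verify against the official downloads page.
IMAGE = "quay.io/influxdb/influxdb:v2.0.3"

args = docker_run_args(IMAGE, host_port=9999)
command = shlex.join(args)  # printable form of the same command
print(command)
```

The list in `args` could be passed to `subprocess.run(args)` on a machine with Docker installed; a 1.x instance on host port 8086 would be unaffected, since only host port 9999 is claimed.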
Tim Hall: 00:16:21.215 So let’s kick that off, and let’s see what happens. This is always the exciting part of live demos. Getting warmed up here. And while that is thinking about what it’s going to do, I will start prepping the next part of the demo over here. There we go. All right. Okay, so it’s there. It’s listening. And if I click on my localhost page, it should give me a nice welcome screen. Let’s maximize that so you can see it. There we go. All right. So if you are brand new to InfluxDB 2.0 and you’re not doing the upgrade process, you can actually execute the installation steps either through the command line interface or the UI. And so this is the screen that you’re presented with. It’s a brand-new install. And again, there are command prompts for the CLI that are similar to these. One of the big differences in 2.0 is it is secure by default. And by secure by default, what I mean is, you are required to have either a username and password to log into the UI, or you are required to have a token to access any of the APIs from the get-go. This was not the case with InfluxDB 1.0. And unfortunately, many people decided not to turn on the security features and then have instances running on the public internet that anyone can access. So we decided to just bake that in from the start.
Tim Hall: 00:18:25.566 So in terms of getting started, let’s do that. So I’m going to put in a username, thall. I’m going to give it a password, hopefully one that I can type twice. I’m going to give it an organization. InfluxDB 2.0 is multi-tenant by default, meaning you can set up multiple organizations from the beginning to organize your data, organize your dashboards, organize your users. And so in this case, I’m going to use the old thallorg, and my initial bucket name is going to be telegraf, since I’m going to collect some system stats off my laptop here in a minute. And if I click Continue, now it presents me, “Hey, okay, those user details are set. I’m ready to go.” There’s one organization, one user, and one bucket that’s been set up. And which path do you want to take now? This is a little bit of a choose your own adventure. I can go through a QuickStart, which will set up local metrics collection. So it’ll actually gather the metrics off this local instance and store them and give you a dashboard for visualization. If you are already feeling like you’re an expert and you want to set up things yourselves, you can go into the Advanced mode or Configure Later, which I’m going to do now because I want to show you the actual setup steps for some of these things.
Tim Hall: 00:19:42.375 All right. So now you’re presented with the home screen and the landing page for InfluxDB 2.0. The very first thing, at the top left, you’ll notice the left-hand nav bar. Inside the nav bar are a series of options, including who you are and what organization you’re logged into. You will notice that you can both create and switch organizations from that little icon. Again, if you have a multi-tenant setup, and you want to partition who can see what, that’s a great way to do that. In terms of who has been invited to join this, you can actually create multiple users. But you use the CLI tooling today to do that and to set up their particular role. Today, I believe, there are three roles. There’s owner, admin, and read-only users by default. You can get information about your organization, including the ability to rename it if you feel like you’ve messed that up. But there are some consequences to doing that if you’ve already set up things that are using the organization name. And so we give you a little bit of warning about that before you go off and do that. There is a frequent need to use these IDs in various API calls and other things. And so we give you the ability to sort of copy those to the clipboard and use those either in the CLI setup or other API calls if you’re doing development.
Tim Hall: 00:21:09.555 Let’s see. Next up, we’ll look at the data section. This is really about all the things that you’ll need to do to handle your data. Now, first and foremost, we’re showing you a giant page of sources of what technologies exist for you to kind of get data in. So first and foremost, I mentioned the 10 different languages that we currently have client libraries for. So if you’re working in a particular language, C# or Ruby or Scala or Java or Python, which so many of you are, these little tiles, actually, if you click on them, give you additional instructions about how to work with that language and that specific package and give you specific code snippets to speed your ability to construct your application on top of Influx. There are a few more of these that are coming. We’re working on one for Swift for those that are building mobile apps. There’s one that’s being built for Flutter and Dart, again, that are mobile app and a few more.
Tim Hall: 00:22:17.099 And then, down below is a whole series of tiles. If you’re using Telegraf, it’s your distributed data collection agent. You will see all the things that Telegraf is capable of scraping data from and gathering data from. But it does require you to run Telegraf yourself. And so anyway, each of these tiles will give you instructions on what your Telegraf configuration needs to look like in order to get data in. So from an IoT perspective, for example, there’s this Modbus plugin that we’ve got, and it gives you the configuration example here. You can copy that to the clipboard and add it to your local Telegraf config, and it gives you a bunch of other valuable information about working with the Modbus plugin if you’re coming from the industrial IoT world. And of course, there’s a search capability. So if I wanted to look at something like, I don’t know, let’s just say I wanted to look at a Zookeeper maybe. And so there’s a Zookeeper plugin at the very bottom, so I didn’t have to scroll all the way down to the end.
Tim Hall: 00:23:28.629 All right. Next up is, once you’re gathering data, you’re going to land it into a bucket. For those that are coming from the InfluxDB 1.0 world, a bucket may be a new concept. Essentially, it combines the idea of a database and a retention policy into a single entity. So if I look at the bucket that I created at startup, which is called Telegraf - sorry. I click on Settings. You can see I’ve given it a name, and then the retention policy is now an attribute of the bucket itself. And so I can change it to 30 days or an hour or however long I want. In this case, I’m just going to have an infinite retention policy for the moment. You’ll also see two system buckets. So the system buckets are used by InfluxDB to store information about the system. You do have access to that and can use that to look at what’s going on, and you can even build alerts and other things off of the information that lands in the system bucket itself. You can obviously create additional buckets by clicking on that button. Same kind of setup. Just give it a name and a retention policy.
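As a sketch of "retention is an attribute of the bucket", the following Python builds a request body for creating a bucket. It assumes the v2 buckets endpoint (`POST /api/v2/buckets`) accepts `orgID`, `name`, and a `retentionRules` list whose `everySeconds` of 0 means keep forever; the org ID and the second bucket name are made-up placeholders.

```python
import json


def bucket_payload(org_id, name, retention_seconds=0):
    """Body for creating a bucket; retention_seconds=0 means infinite retention."""
    if retention_seconds < 0:
        raise ValueError("retention_seconds must be zero or positive")
    return {
        "orgID": org_id,
        "name": name,
        "retentionRules": [{"type": "expire", "everySeconds": retention_seconds}],
    }


# Placeholder org ID; real IDs can be copied from the organization page.
forever = bucket_payload("0123456789abcdef", "telegraf")
thirty_days = bucket_payload("0123456789abcdef", "downsampled", 30 * 24 * 60 * 60)
print(json.dumps(thirty_days, indent=2))
```

In 1.x terms, the bucket name plays the role of the database and the retention rule plays the role of the retention policy, carried together in one object.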
Tim Hall: 00:24:37.877 From a Telegraf perspective, we are also providing the ability to - this is almost like a Telegraf wizard. And now, you saw a whole series of panels for all of the different plugins that you can use. And today, if you go and create a Telegraf configuration, we’re only really offering the wizard for these sort of five most common things that people are monitoring. That doesn’t mean you can’t use Telegraf with InfluxDB 2.0 for everything else. You can use it for all those things that I showed you. It’s just that the wizard support is not available to you yet. And so just to kind of walk through this, and we’re going to use this in a little bit, I’m going to collect the system stats off my laptop. So I’m going to go ahead and say, let’s create a configuration. I’m going to name this Laptop stats. And you’ll see the plugins that it’s going to include: the cpu, disk, diskio, mem, all that stuff goes in. Create and verify.
Tim Hall: 00:25:36.654 Okay. So you’ll notice here that it says, “All right, great. Do you want to test this?” And so, downloading and installing the latest version of Telegraf: really any version of Telegraf starting with 1.9.2 or higher, and we’re just about to release Telegraf 1.17, includes the output plugin for InfluxDB 2.0. And we’ve had that out there for quite some time, and we’ve had a nice stable API for that write mechanism. Now, the next thing that you’ll notice is, as I mentioned, InfluxDB 2.0 is secure by default. And so there is an API token that is required for you to write data in. And you’ll see there’s a token here that I can use. And for the setup, right, I can copy that to the clipboard. And what we’ll do is we’ll go to the command line, and we’ll go ahead and we’ll export that token and set it up. And then next, it says, I can run the following command. Now, what’s interesting here, for those of you that are not familiar with Telegraf, or maybe that are familiar with Telegraf and have been using it for a while, is that we introduced a new capability in Telegraf, again starting with, I think, 1.9, which allows Telegraf to pull its configuration from a central location. In this case, I’m pulling the Telegraf configuration, the one I just created, from my InfluxDB 2.0 instance.
Tim Hall: 00:27:11.109 Now, you may say, “Oh, well, this isn’t necessarily the way I want to distribute the configuration to my Telegraf agents”. Totally get it. You can still use the tried-and-true things like Ansible, Puppet, and Chef to distribute those configurations. But now, there’s multiple ways and centralized configuration management that you can use to do this. You can also have a GitHub style workflow where these configurations can be exported and shared and re-imported as resources into your InfluxDB 2.0 instance. So what that means is, if you create an external Telegraf configuration, you can package it as a resource, you can create as a template, and you can actually import it into the system. Even though the wizard only supports automatically creating sort of those five at the moment, any Telegraf configuration can be re-imported into InfluxDB 2.0, and then this URL mechanism for distributing it back out to your Telegraf agents can be used. Telegraf will pull that configuration at startup once and only once. And so if you make configuration changes, obviously, you’ll want to restart your Telegraf agent to pull those changes down. So let’s copy that to the clipboard, and we’ll go ahead and drop that in over here. And we will hit Return and fire that up and see what happens.
Tim Hall: 00:28:31.252 So it’s pulled that configuration down. And then I can see if this is working by clicking the Listen for Data button. And sure enough, it says, “A connection is found.” So we’ll come back to see where we landed that information here in a few minutes. Now, next up, we also, offer - oh, wait. Before we leave the Telegraf section, if you’re creating configurations on your own, and you just want the snippet of the InfluxDB output plugin so that you can paste it yourself into your own configuration file, we’ve provided a handy link for you to do that here, and it just gives you the specific details. You can also select which bucket you want the data to land in. In this case, we only have one bucket just to choose from, but you’ll see here that the configuration of the output plugin’s here. It tells me it automatically fills in the bucket name, gives the organization that I’m working within, and then your token which I exported as an environment variable is listed there for dynamic usage.
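For readers assembling their own Telegraf configuration, here is a rough Python sketch that renders an output-plugin snippet like the one described. It assumes the `outputs.influxdb_v2` plugin with `urls`, `token`, `organization`, and `bucket` settings, and an environment variable named `INFLUX_TOKEN`; the snippet shown in the UI may differ in detail.

```python
def influxdb_v2_output(url, org, bucket, token_ref="$INFLUX_TOKEN"):
    """Render a TOML snippet for Telegraf's InfluxDB 2.0 output plugin."""
    return "\n".join([
        "[[outputs.influxdb_v2]]",
        f'  urls = ["{url}"]',
        # The token is referenced through an environment variable, not inlined.
        f'  token = "{token_ref}"',
        f'  organization = "{org}"',
        f'  bucket = "{bucket}"',
    ])


# Values taken from the demo setup: port 9999, org thallorg, bucket telegraf.
snippet = influxdb_v2_output("http://localhost:9999", "thallorg", "telegraf")
print(snippet)
```

Referencing the token through the environment keeps the secret out of the configuration file, which matches the export step performed earlier in the demo.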
Tim Hall: 00:29:30.662 Moving on to scrapers. Now, you don’t necessarily have to deploy a Telegraf agent to gather stats. I know in the 1.X line, for example, we did a couple of things. First, there was something called the _internal database where the stats and metrics of InfluxDB were populated. Generally, for production usage, we recommend that folks turn that off, and then it’s like, “Okay, well, how do I gain visibility into what’s happening?” A mechanism that you can use is a scraper. With a scraper, essentially the database is reaching out and gathering the metrics directly. In this case, I’m going to create a scraper against itself. And so I’m going to say this is my InfluxDB 2.0 metrics. And I’m going to store it also in my telegraf bucket, and I’m going to scrape it out of the Docker container. So I happen to know that the [inaudible] IP is where to get that from. I can create that, and now every 10 or 15 seconds, it will go ahead and scrape that endpoint. Now, that can be used to scrape any metrics in the Prometheus exposition format. But as you can see, there’s not a ton of controls on there. So you can’t control the frequency at which the scraper runs. If you’re going to go down that path of trying to use scrapers, we sort of recommend them for light use and sort of demo purposes, but not for scale and in production.
Tim Hall: 00:31:03.572 And then last but not least, I mentioned tokens. So the API access is secure by default. You have to have a token anytime you interact with the API, and you can generate those tokens. So you can generate an all-access token and access to all the features, functions, and capabilities. You can generate read-write tokens, and you can scope those tokens as well. So if you want the most flexible token, obviously, you’re going to say, just give it access to all buckets at all time from a read perspective or a write perspective. But if you want to narrow that down for specific applications or specific use cases, you can obviously do that as well. And just give it a name and go ahead and save that.
Tim Hall: 00:31:48.647 Tokens are immutable. So if you create a token with specific permissions, you cannot change them. You will have to delete that token and recreate it. So that’s an important distinction. Once you’ve created a token, you cannot change its scoped permissions. All right. So that’s a walkthrough of the basic data pages.
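To make the token requirement concrete, here is a minimal Python sketch that prepares (but does not send) an authenticated write request. It assumes the `Authorization: Token <token>` header scheme and a line protocol body; the token value, host tag, and field value are invented for illustration.

```python
from urllib.request import Request


def build_write_request(url, token, line):
    """Prepare a POST carrying one line-protocol record and an API token."""
    return Request(
        url,
        data=line.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


req = build_write_request(
    "http://localhost:9999/api/v2/write?org=thallorg&bucket=telegraf",
    "my-example-token",                     # placeholder, not a real token
    "cpu,host=laptop usage_user=12.5",      # measurement,tag field=value
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running instance. Because a token's permissions cannot be changed after creation, an application that needs different access would be given a newly created token rather than an edited one.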
Tim Hall: 00:32:11.145 Next up is looking at the data exploration capabilities. And so let’s go back into full screen while we look at this. So this is the Data Explorer. You’ll notice that on the very far left of the screen, I have visibility into the various buckets that I’ve got. And here, it’s listing all of the measurements that the telegraf bucket has. And so in this case, I’m going to look at my CPU metrics, I’m going to look at usage user, I’m going to look at CPU total, and I’m going to just click the Submit button. And I’ve constructed my first query. It’s pretty easy to walk through and build that. If I wanted to scope it just to my laptop, I could have done that, obviously. But you can continue to build up these filtering mechanisms across this panel as you go along. You can decide to change the filter to a grouping mechanism if you want to. And so that’s another part of the query-building exercise. If I wanted to use other functions, I can open this up, and I can see what other functions are available in the Explorer in the Query Builder experience. Now, keep in mind that there’s a ton of additional functions that are available. But these are just a list of the sort of aggregate functions that you can use in the Query Builder today.
Tim Hall: 00:33:35.762 One of the ways that the Query Builder is used is to sort of build up the boilerplate beginnings of queries before diving into more sophisticated ones. And I’ll show you what that looks like in a second. Another significant change in terms of how the data results are populated is you can see this button here called View Raw Data. If you flip over to that, what you’ll get is what the output of that query is actually giving me. This is an annotated CSV format, which is self-describing. It actually can be used by a wide variety of tools other than Influx. So it gives you what the group key is first along the top, so how the data is grouped. So if you see a True, then that column is part of the group key. Next, it tells you the data type of the column. So you’ll see date times, you’ll see doubles, strings, etc., that are listed there, and then it tells you what function was applied. In this case, the Mean function was applied, and then it’s giving you the result. The results are organized into tables, and the tables are numerically ordered and listed based on the group keys. So when the group key changes, you’ll get a new table. And then in this case, you see a start time, stop time, and value. The field is usage user that I selected, the measurement is CPU, and obviously CPU total is the specific CPU tag that I selected. And I’m getting it off of my local laptop. Now, obviously, I could go in here and select other fields. And if I submit that query, you’ll see how that changes. Now, instead of one table that’s numbered zero, I have Usage Idle as table zero, and I have Usage System as table one, again, because the group key has changed, and now I’m on to Usage User. If I go ahead and turn back and I plot that, you’ll see that there are three lines now that are plotted on the screen.
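The annotated CSV layout described here can be illustrated with a short Python parser. The sample text is hand-written to resemble the output described (annotation rows for group key, data type, and default result name, then a header and data rows); the timestamps, values, and host name are invented, and the parser assumes a single section rather than the full format.

```python
import csv
import io

SAMPLE = """\
#group,false,false,true,true,false,false,true,true,true,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string,string,string,string
#default,mean,,,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement,cpu,host
,,0,2020-12-17T16:00:00Z,2020-12-17T17:00:00Z,2020-12-17T16:10:00Z,93.1,usage_idle,cpu,cpu-total,laptop
,,1,2020-12-17T16:00:00Z,2020-12-17T17:00:00Z,2020-12-17T16:10:00Z,2.4,usage_system,cpu,cpu-total,laptop
,,2,2020-12-17T16:00:00Z,2020-12-17T17:00:00Z,2020-12-17T16:10:00Z,4.5,usage_user,cpu,cpu-total,laptop
"""


def parse_annotated_csv(text):
    """Return (group_key_columns, {table_id: [row dicts]}) for one CSV section."""
    annotations, header, defaults, tables = {}, None, {}, {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        if row[0].startswith("#"):
            annotations[row[0][1:]] = row[1:]      # e.g. "group" -> flags per column
        elif header is None:
            header = row[1:]                        # first cell is the annotation column
            defaults = dict(zip(header, annotations.get("default", [])))
        else:
            # Empty cells fall back to the column's #default value, if any.
            record = {c: (v if v != "" else defaults.get(c, ""))
                      for c, v in zip(header, row[1:])}
            tables.setdefault(int(record["table"]), []).append(record)
    flags = annotations.get("group", [])
    group_key = [c for c, g in zip(header or [], flags) if g == "true"]
    return group_key, tables


group_key, tables = parse_annotated_csv(SAMPLE)
print(group_key)
print(sorted(tables))
```

Each distinct combination of group-key values (here, the three different `_field` values) lands in its own numbered table, which is why selecting three fields yields tables 0, 1, and 2.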
Tim Hall: 00:35:36.807 Now, let’s say I wanted to do something a little more sophisticated that the Query Builder doesn’t support. I have this button called Script Editor, and I can click into that. And what it does is now it populates all the boilerplate code, so everything from what bucket am I using to supplying the variables that are defined by this sort of stop and start time. For the range, the three filters that I put in place, so I selected the measurement, I selected the three fields that are usage user, usage system, usage idle, and we’re getting that specifically only for the CPU total instead of each individual CPU that’s running. And then in the case of the function I applied, it’s an aggregate window that’s scoped by the window period based on the breadth of the pixels on the screen in this case, and I’m applying the Mean function. And so, if I want to make other changes to this, obviously, here’s where to do it, and you have access to all of the functions within Flux here on the right including things like geo-temporal queries and a ton more. Everything from you can do string interpolation, you can do date manipulation, truncating dates, and using date parts, all kinds of things, and mathematical functions as well, things you could never do inside of InfluxDB 1.0, which is tremendous. So that’s the Data Explorer.
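The boilerplate described above (bucket, range variables, three filters, and a windowed mean) can be sketched as a small Python helper that renders Flux text from a few selections. This is only an approximation of what the Script Editor populates, not the UI's actual code: the `build_flux_query` function, the `telegraf` bucket name, and the exact tag and field names are assumptions for illustration, while `from`, `range`, `filter`, `aggregateWindow`, and `yield` are Flux functions and `v.timeRangeStart`, `v.timeRangeStop`, and `v.windowPeriod` are the dashboard variables mentioned in the talk.

```python
def build_flux_query(bucket, measurement, fields, tag_key, tag_value, fn="mean"):
    """Render Flux text resembling the query boilerplate described in the talk."""
    # One filter expression that matches any of the selected fields.
    field_filter = " or ".join(f'r["_field"] == "{name}"' for name in fields)
    lines = [
        f'from(bucket: "{bucket}")',
        "  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)",
        f'  |> filter(fn: (r) => r["_measurement"] == "{measurement}")',
        f"  |> filter(fn: (r) => {field_filter})",
        f'  |> filter(fn: (r) => r["{tag_key}"] == "{tag_value}")',
        f"  |> aggregateWindow(every: v.windowPeriod, fn: {fn}, createEmpty: false)",
        f'  |> yield(name: "{fn}")',
    ]
    return "\n".join(lines)


# Hypothetical selections matching the demo: three CPU fields for cpu-total.
print(build_flux_query(
    bucket="telegraf",
    measurement="cpu",
    fields=["usage_user", "usage_system", "usage_idle"],
    tag_key="cpu",
    tag_value="cpu-total",
))
```

Replacing `v.timeRangeStart` with a fixed lookback and `v.windowPeriod` with a fixed duration is the kind of edit the Script Editor is meant for.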
Tim Hall: 00:37:03.498 Now, after you’ve gone through the data exploration experience, you may feel like okay, well, I’m done with that. However, there are three exit points that you have. You can save whatever you’ve done either as a dashboard cell by creating a new dashboard. You can save that as a task, in which case, you will also define where you want to send the data to from one location to another, or you can create a variable. And the Data Explorer is really the place today to build up these different resources and save them off to their ultimate destinations obviously because it gives you access to visualizing the data to make sure you can confirm that that’s what you want. It gives you access to the raw data in case the visualization isn’t entirely clear to you. But this is the place again to explore all those elements.
Tim Hall: 00:37:55.443 Next up, obviously, there are dashboards. It looks like I have populated automatically the system dashboard based on my Telegraf configuration setup. And you’ll notice that my laptop stats here are populating based on what’s being sent from my machine into this instance. So you can define and create any number of dashboards. You can also add labels to them for quick organization and for search. And we’ll talk a little bit more later about the mechanisms by which you can share, not only dashboards, but other resources including the Telegraf configuration, alerts and notifications, and more. Obviously, there’s a task system. This is new as part of InfluxDB 2.0. Tasks are things that you can create using Flux and schedule to run on a recurring basis. Things like running down-sampling to reduce the number of data points that you’re working with over time and sort of a roll-up mechanism, or if you want to do anomaly detection, sophisticated anomaly detection, you can do that as well. And we’ll explore that feature once we do the upgrade process.
Tim Hall: 00:39:10.122 Alerts and notifications is another part of the platform. And in this case, there’s an alert check. So a check is a periodic query that you create, and it’ll run on that periodic basis and tell you the results whether it’s okay or it’s found something else. Based on the checks, you can create a notification rule. A notification rule looks at the status of all the checks and determines whether a notification needs to be sent to an endpoint. In this case, you could be checking the frequency of the appearance of a particular condition. This can help you eliminate things like event storms. If you see a check that’s repeatedly failing, maybe you only want to be told about it once or twice before it goes silent. But the rules for the frequency of reporting based on the number of times that you see an incident can be defined separately. And then last but not least, the notification endpoint, where do you actually send things to? And the UI today supports three different things right out of the box: Slack, PagerDuty, and HTTP, but there is a whole host of notification endpoints that you can send alerts to including Discord, Microsoft Teams, and others. And there’s more coming on a really regular basis. Eventually, those will also be integrated into the UI, but that doesn’t prevent you from accessing that functionality through Flux today. And there’s a nifty video that was just created by one of our developer relations folks, Anais, on using a custom endpoint with Telegram. So check out that video. It should be in the Slack channel in our community site.
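The idea of being told about a repeatedly failing check only once or twice before it goes silent can be illustrated with a generic Python sketch. This is not how InfluxDB implements notification rules; the `should_notify` function, the `crit` status name, and the `max_repeats` parameter are hypothetical and only show the general suppression logic.

```python
def should_notify(statuses, trigger="crit", max_repeats=2):
    """Return one decision per check status.

    Notify for the first `max_repeats` consecutive trigger statuses, then stay
    silent until the status clears and the condition appears again.
    """
    consecutive = 0
    decisions = []
    for status in statuses:
        if status == trigger:
            consecutive += 1
            decisions.append(consecutive <= max_repeats)
        else:
            consecutive = 0  # condition cleared, so the next failure notifies again
            decisions.append(False)
    return decisions


# A hypothetical run of check results: a long failure, a recovery, a new failure.
history = ["ok", "crit", "crit", "crit", "crit", "ok", "crit"]
print(should_notify(history))
```

Separating the check (what the status is) from the rule (whether to send anything) is what keeps a long failure from turning into an event storm.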
Tim Hall: 00:40:49.207 And then last, but not least, settings. So you’ll see here that I have a variable called buckets which is just essentially grabbing the names of the buckets, eliminating the system buckets from the list. And that was used in my dashboard setup. I can refer to variables, and when you refer to it, it appears and then makes it a selection. We’ll come back and talk about templates, and this is how you share your expertise with others. And obviously, the labeling system, just creating labels. So if I want to create one, create a label that was demo, give it a color, create it, and then if I went back to my dashboard, I can add that label as an association here. So, yeah, that’s just an organizing principle within the system.
Tim Hall: 00:41:40.212 Okay, so, that’s the basic walkthrough. I’ve put data into the system. Now, let’s create a task. So one of the things that you can do, I’m regularly pulling this data in to the system. I want to create a bucket. In this case, I’m going to create a bucket called Telegraf five-minute mean, and I’m going to roll that data up into - I’m currently collecting the data, I think, at 10 or 30 second resolution. What I want to do is summarize that data, so that every 5 minutes, I’m going to roll it up. And so first question is, well, what do I want to grab? So let’s say I want to grab disk, memory, and swap. And I’m going to clear these fields out because I want them all. And I’m just going to start with that, right? And so if I go now into the script editor, actually, I don’t want CPU Total, I want all of those fields. All the fields from all those measurements need to be down-sampled. So first thing is, well, this range is dynamic at the moment, and I don’t want it to be dynamic. What I really want it to do is every time the task runs, I want it to look back 15 minutes. Sometimes, you get late arriving data, and so I want there to be an overlap of the amount of work that it does to ensure that it captures that late arriving data. So in this case, I’m going to add that. And then, next, instead of it being a random window period based on the size of my screen, I’m going to fix the window to that five-minute mean. That’s it. That’s the query.
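To make the five-minute roll-up concrete, here is a minimal Python sketch of a fixed-window mean over in-memory points. It only illustrates the arithmetic of the down-sampling step and is not the Flux task itself; the `windowed_mean` function and the sample values are assumptions, and labeling each window by its end time is a choice made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def windowed_mean(points, every=timedelta(minutes=5)):
    """Group (timestamp, value) points into fixed windows and average each one.

    Each result is keyed by the end of its window (an assumption for this sketch).
    """
    step = every.total_seconds()
    buckets = defaultdict(list)
    for ts, value in points:
        # Align the timestamp down to the start of its window.
        window_start = (ts.timestamp() // step) * step
        buckets[window_start + step].append(value)
    return {
        datetime.fromtimestamp(end, tz=timezone.utc): sum(values) / len(values)
        for end, values in sorted(buckets.items())
    }


# Hypothetical readings spread across two five-minute windows.
base = datetime(2020, 12, 17, 16, 0, tzinfo=timezone.utc)
points = [
    (base, 10.0),
    (base + timedelta(seconds=150), 20.0),
    (base + timedelta(minutes=5), 30.0),
    (base + timedelta(seconds=450), 50.0),
]
for window_end, mean in windowed_mean(points).items():
    print(window_end.isoformat(), mean)
```

Looking back 15 minutes on each run simply means the same windows are recomputed on later runs, so points that arrive late still get folded into their window's mean.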
Tim Hall: 00:43:31.460 Now, if I submit this now, you’ll see all of the different measurements now and all of the fields are shown visually here. And it looks like I’m getting the right outputs. There’s a whole series of tables here based on the various group keys. That looks fantastic. Okay. So now, the question is, how do I save this? I want to save this as a task. So I’m going to call this my Laptop Down-Sampling. And I’m going to schedule that every 15 minutes, and I’m going to send the output to my Telegraf five-minute mean bucket, and I save it as a task, and that is it. The task is active and running. It will fire on that 15-minute window. You can come in here and view the task runs like how frequently has it run. Obviously, it hasn’t run yet. You can edit the task, although I recommend this for more of just sort of a browsing capability. You have to be sort of a Flux expert to be able to edit it right here and make sure you’re not making mistakes. Typically, we’re using the Data Explorer to confirm that any changes we make are yielding the right result. And then, eventually, when the task runs, we’ll get that down-sampled data into our five-minute mean bucket, and we can then use that in new queries. So that’s the task system.
Tim Hall: 00:44:55.854 We can also do some cool things by adding cells to dashboards and more. So we’ll look at that a little bit later. So let’s go through the upgrade process next. So I’m going to actually kill that process. So we get a little - Zoom likes to steal all my resources on my laptop. So just to simplify that down. So let’s go through the upgrade process very briefly. I have an existing setup that’s running, and that setup includes Chronograf 1.8.9. That’s the latest edition of Chronograf connected to InfluxDB 1.8.3. It also has a couple of Telegraf agents that are deployed, one on my laptop, one on another machine here at the house. That happens to be my son’s machine because I’m monitoring his gaming access, which we’ll talk about in a minute, and it’s super useful because I’m going to create an alert on that, which is a handy parental tool. Now, inside of InfluxDB 1.8.3, I have four users. I have an admin user, I have a testing user, I have a user that’s dedicated for read access mostly, and I’ll come back to that, and then I have my write access for Telegraf agents. There are about 10 databases that I have, including the underscore internal database and Telegraf that’s been set up with two retention policies. Very similar to the walkthrough that I just did. I also have one continuous query that’s running. Continuous queries were something that people used in the 1.X line to do that down-sampling activity. And so inside of my read-only user, I’ve actually given him permission to read a couple of the databases, but I also gave him full access to the Chronograf database so that if he wanted to do dashboard annotations, he would have the ability to do that visually. And so that’s how he’s been set up. And I have a Chronograf instance that’s connected using that user, and I also have a Grafana instance that’s connected using that user which I’ll show in a second.
Tim Hall: 00:47:12.020 So next, we’re going to run through the upgrade process. I’m going to install and run InfluxDB 2.0. And what I need to do is I need to stop the running instance. I need to run the upgrade command against the new InfluxDB binary, and then I’ll just fire it back up. Now, there are some additional helper scripts for Deb and RPM packages as I mentioned. Those documentation elements are your friends. Check those out in the release notes for 2.0.3. And there is no upgrade for Docker users yet, but that is the next thing that we’re working on. And if you’re interested in helping us out with that, hit us up in the community Slack.
Tim Hall: 00:47:50.025 So let’s go through the upgrade process. First thing, I need to find where I have InfluxDB running, which I think is here. Okay? So that’s my running instance. Oh, I should show you a running instance. So in the running instance right now, not a smoke and mirrors demo. This is the running instance. I have two hosts that are sending in data. One is called Destroyer because it’s my son’s machine and he’s 11, and that’s how you name machines when you’re 11. And then, I obviously have my laptop here that I’m running against. And I can switch between those with the template variables and populate the various graphs that are running. And then from a connection information, you can see here that I’m here at Hall Compound, and I’ve connected to my running instance, and it’s InfluxDB 1.0. Now, one of the new things in Chronograf 1.8.9 is this InfluxDB 2.0 Auth, which allows you to use the authentication mechanisms for InfluxDB 2.0, and we’ll come back to use that here in a minute. I also happen to have a Grafana instance running. I know that’s a popular tool that many people use. And again, similar kind of stats available. And again, similar kind of setup. I can look at Destroyer. I can look at my own laptop, etc. So both those are running. Same data. The data source configuration looks very similar. Same machine, same user setup; nate_haugo, my read-only user, and there we go. All right. So that’s all running. Now, one of the things I want to show in the upgrade process is the fact that I’m just going to continue to keep all those things running, upgrade the instance, and not perturb all of my users or my Telegraf agents as part of the process. Yes, there is a little bit of downtime as we go through the upgrade process. But typically, the Telegraf buffers and things will hold on to the metrics and pump them back in without data loss assuming you go through this process quickly.
Tim Hall: 00:50:03.389 So let’s go ahead and kill my InfluxDB instance. It’s right here, and I’m going to go to where are my 2.0 instances running? And let’s run the upgrade command. Whoops, not with caps lock on. Okay. So the first thing it’s going to do is it does a bunch of checks and scans to make sure that you have not already run through the process before, and it’s got the right disk space, and so on and so forth. Currently, the upgrade process will copy the entire data set from where it currently resides to a new location. It is shuffling the files and moving them around on disk. In the future, we’ll introduce an option to sort of avoid that movement - or the copying. And yeah, so, look for that coming soon, but just keep in mind, it will copy this, copy all the data initially. So let’s go ahead and run back through the setup, and this is very similar to the CLI setup that you would run if you were to set this up from scratch. And it does ask you to enter your password twice. And again, I’m going to give it a thallorg, and my primary bucket name is Telegraf and infinite number and confirm, Yes. All looks good. Actually, no, I’m not going to do that. Let’s go back through the upgrade process. I’m going to actually name my primary bucket something else because I think it might be a little confusing if I name it Telegraf. I have an existing Telegraf autogen and an existing Telegraf down-sample bucket that exists. And so let’s see if the process is re-entrant. I’ll have to go through those checks for a second. Yes, great. So it noticed that it had already generated a config TOML. So this is part of the troubleshooting process. So it had already created the new config file. So basically, it says, “Hey, if you want to go and be re-entrant back into this process, you’re going to have to remove the files that I already created.” And so again, this prevents you from running the upgrade command multiple times, which is good.
So I can just remove the stuff back to where it was, and let’s do it again. All right. Checking for space. Obviously, created a config file. What it does for the config file is it reads your existing InfluxDB config and attempts to generate the same resulting configuration setup for you into that config file. When I went through the Docker setup, no config file required. It just used all of the defaults. So let’s go back through quickly and get this set up. Zoom is stealing all my resources. Organization name, and let’s call it default bucket.
Tim Hall: 00:54:03.487 All right. Looks good. Fire that up. Now, it’s going through, and it reads all of the existing TSM data. And you’ll notice here, there’s a bunch of output. Now, I’m not quick enough to sort of read that as it goes by. So all of the output is actually put into the user’s home directory who ran the process. And you can look at that at your leisure and scroll through it. It’s descriptively named upgrade.log. You’ll see in here that it’s copying the various data sources. You’ll also notice here that it migrated the users. And then the very last step is going ahead and fire it up and log into that URL. So now, let’s do that. So we’ll fire up that process, and while that’s starting back up, I mentioned the troubleshooting, right? So, it said, “Yep, you got an existing 2.X config file. You got to remove that.” The two files you’re going to look for, one is the upgrade.log. If you are running continuous queries, those continuous queries will be output to a continuous queries.txt file so that you have the structure of those queries to translate into tasks. The continuous query subsystem does not exist in InfluxDB 2.0. You use the task system. And we have provided documentation to assist you in translating those queries into Flux tasks. These will be located, as I mentioned, in the user’s home directory, and upgrade will log the standard out log and the log file, everything that happened in the process.
Tim Hall: 00:55:40.467 One of the things that we’ve heard from folks is that you may get a message about too many files open. So you’ll need to adjust the limit on your particular system and look for instructions either in our documentation or your operating system for how to adjust the limits to allow you to open more files. Now, the result of this upgraded setup, while that’s starting back up, is that I still have Chronograf 1.8.9 and Grafana running, and those instances continue to function. However, the admin functions that were in Chronograf previously are now disabled. You cannot browse the databases except through the Query Explorer. You can’t see what users are there, and you cannot kill queries. All of those administrative functions are now in InfluxDB 2.0. In addition, it migrated all of the authorization setups for all of the users except for my administrative user. So what that means is if you’re using an administrative user to connect either Grafana or Chronograf to your InfluxDB 2.0 instance, you’re going to want to - or sorry, your InfluxDB 1.0 instance, you’re going to want to create a user with fewer permissions before firing up the upgrade process. The admin users are not migrated across, and that is intentional. We want the administrator to review who has administrative access and then selectively invite those folks back into the system. But all of the other users that have fewer privileges are migrated across. You’ll see that our databases have been migrated into buckets, and the DBRP mappings have been created to allow InfluxQL to continue to work. We do not migrate across the underscore internal database. That is left behind. And I’ve got my new default bucket, but it does not have the DBRP mappings, so meaning, I cannot access it through InfluxQL. As I mentioned, no continuous queries are running. It does output to that file that you can see. But everything else should still run and continue to run.
Tim Hall: 00:57:54.337 So let’s quickly go and see if everything is working. So we can go back into my Grafana dashboard and switch. And sure enough, I’m still seeing data. No errors. I didn’t change anything, and that all looks good. If I go into my Chronograf instance, you’ll see that the dashboards are there, system dashboard, you can see the query is being fired off. And this is running against the live 2.0 system, and all of the data is still available, and it didn’t miss a beat. It was all stored up in Telegraf agents. Always switched across. It’s great. Now, I can also log into that new instance that we just fired up. And so when you see InfluxDB 2.0, the login screen looks like this, and I can log back in. And there you go. I’m in. From a data perspective, if I go and I check out buckets in my live running system, all of these buckets now exist. There’s the default bucket that we created in the setup process. And then all of the existing buckets that I had previously are present including Telegraf autogen and my down-sampling bucket that I created through my continuous query. You’ll notice that the bucket names are a concatenation of the existing database name with a slash and then the retention policy name. Again, that’s the concept. The bucket name now includes the retention policy as part of its attributes. I can go into the Data Explorer in the same way that I did previously and select the Telegraf where all the information is landing, and you’ll see same kind of setup CPU total, submit, and I got both machines now and the two different series are being plotted.
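The naming convention just described, database name plus a slash plus retention policy name, can be sketched in a few lines of Python, along with a lookup table standing in for the DBRP mappings that let InfluxQL queries keep working. The helper names, the `downsample` retention policy name, and the dictionary-based lookup are assumptions for illustration, not the upgrade tool's actual data structures.

```python
def bucket_name(database, retention_policy):
    """Combine a 1.x database and retention policy into a 2.0-style bucket name."""
    return f"{database}/{retention_policy}"


def build_dbrp_mappings(pairs):
    """Map each (database, retention policy) pair to its migrated bucket name."""
    return {(db, rp): bucket_name(db, rp) for db, rp in pairs}


def resolve_bucket(mappings, database, retention_policy):
    """Return the bucket an InfluxQL-style query would land on, or None if unmapped."""
    return mappings.get((database, retention_policy))


# Hypothetical 1.x layout: one database with two retention policies.
mappings = build_dbrp_mappings([("telegraf", "autogen"), ("telegraf", "downsample")])
print(resolve_bucket(mappings, "telegraf", "autogen"))
# A bucket created during setup has no mapping in this sketch, so nothing resolves.
print(resolve_bucket(mappings, "default", "autogen"))
```

This mirrors why the newly created default bucket is not reachable through InfluxQL until a mapping exists for it.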
Tim Hall: 01:00:14.700 So, yeah, so that’s pretty quick and easy to do. One last thing you can do is if you’re a Chronograf user, and you’re like, “Hey, I’m not quite ready to dive into the full, new experience. I want to start playing with the data that I had in Chronograf.” You can go ahead and create a new data source connection. In this case, I’m going to create it to my InfluxDB 2.0 setup. I’m going to click InfluxDB 2.0 Auth, my organization name, as you remember [inaudible] that word. Oh, I need a token. All right. So if I go back in here, you’ll notice the tokens are in that same place. I can use this token or I could generate a new token. Let’s just use this one. I can go back in here, populate that, and my default database name is going to be this. And add that connection. Oops, interesting. Live demo. Okay. Not sure why that happened. Let’s go to that. All right. Well, not sure why it’s rejecting my connection. But let’s do - oh, I know why. Let’s put in the specific IP. There we go. It’s my networking setup. So I connected to it. Click Next. Skip that. Finish. And now when I go into the Data Explorer, I will also have the ability to use Flux. So I can write Flux queries now within Chronograf as well. This will help me in sort of migrating across and sort of exploring the new features and capabilities of the platform.
Tim Hall: 01:02:04.993 Last but not least, I wanted to mention this sharing of expertise. In the settings, you’ll notice this button called Templates. You can package up everything from your Telegraf configuration settings, your dashboards, alerts, tasks, and more, and share those with others. And you don’t have to do it. You can share it just amongst your teammates, but we do offer a public-facing GitHub repo to allow you to share this information with others. And there are more than 45 community templates now that are out there for monitoring everything from Docker hosts to gaming templates for Apex Legends. There’s a CloudWatch monitoring template. There’s a Counter-Strike template. If you use DigitalOcean, you can get your billing stats. If you’re working with IoT devices, again, there’s a Modbus template that will show you how to get started. In each one of these, there are a collection of assets. So you’ll notice here. If I click on the Linux system monitoring template, and I drill into that, it gives me an image of what that should look like. It tells me what resources are inside of here. And it also gives me this handy URL. So it says, “Hey, if you want to import this into your running instance, you can grab that.” So I can go back to my template setup. I can paste the URL in. I can look up the template, and it tells me exactly that manifest that I saw. There’s one dashboard, there’s a Telegraf configuration, the buckets, variables and labels. So I go ahead and install that template. It will pull all of those resources in, and it shows me, yep, here’s all the things I’ve got. Now, when I go to my dashboard setup, here’s the Linux system template that I just imported.
And I can then say, “Well, wait a second, I’m actually feeding data from a couple of Linux systems.” If I go in and actually select where that data is landing, in this case the Telegraf autogen bucket, now, you’ll see all of that appear and including the variables here that have been set up. So, again, I move very quickly between systems, and I didn’t have to create all those dashboards by hand. So if you’re looking for jumpstart kits to get started, you certainly can do that.
Tim Hall: 01:04:30.109 So, yeah, I know we’re at the top of the hour. I’ve run over just a little bit. We did not get into more of the alerts and notifications, but for sure, there’s something you can set up to send alerts, and I’ve done that myself. So I mentioned I’m monitoring the GPU temperature of my son’s machine. And if you notice here in Slack, it’s notifying me that the temperature of that has gone up over the past couple of hours, and it can give me everything from warnings, critical alerts, and when that happens, I know he’s not on doing his schoolwork, and he’s playing his game, so I can go figure out how to redirect him to his normal activity.
Tim Hall: 01:05:16.691 So with that, I will stop. I know there’s a few questions, Caitlin, that maybe we can get to and then we can wrap up.
Caitlin Croft: 01:05:22.969 All right. Thanks, Tim. So given that we went completely over, I thought that was a really great demo. Unfortunately, we are going to skip the Q&A for now. We will answer all of the questions in a follow-up blog. I apologize for that. But we will make sure to answer all of them on a blog, and you can always ping Tim in the community Slack. So thank you, everyone for joining today’s webinar. It will be available later today for replay. Thank you.
[/et_pb_toggle]
Tim Hall
Tim is a seasoned executive responsible for products, support, and professional services at InfluxData. Prior to joining InfluxData, Tim was VP of product management at Hortonworks where he was responsible for leading the product management, documentation, and user experience design teams. Previously, Tim held management level positions at Oracle, HP, Talking Blocks, and Xpedior. Tim holds a Bachelor of Arts degree from Claremont McKenna College in Science and Management with a concentration in Physics.