InfluxDB 3.0 Under the Hood: Overview and Demo
Session date: Oct 31, 2024 07:00am (Pacific Time)
This presentation will explore techniques for simplifying access to real-time data. It will cover the mechanics of modern time series data storage, how it compares to traditional row-based solutions, and the benefits of using SQL with columnar storage for real-time data analytics. Attendees will learn how to optimize storage for real-time analysis using these technologies to make faster, more informed decisions. Using InfluxDB 3.0 as a reference, the presentation will also delve into the latest advancements in time series databases and their impact on the efficiency and effectiveness of real-time data workflows.
Watch the Webinar
Watch the webinar “InfluxDB 3.0 Under the Hood: Overview and Demo” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
Transcript
Here is an unedited transcript of the webinar “InfluxDB 3.0 Under the Hood: Overview and Demo.” This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
Speakers:
- Suyash Joshi: Senior Developer Advocate, InfluxData
- Pete Barnett: Product Manager, InfluxData
SUYASH JOSHI: 00:00
Let’s just start. Welcome to the webinar. Today, the topic is delivering real-time analytics for time series data in InfluxDB 3.0 with our speaker, our product manager, Pete Barnett at InfluxData. And I just want to give you some housekeeping. My name is Suyash. I work as a developer advocate at the company. And thank you for joining us today. Right before getting started, I would like to remind everyone of some housekeeping. This webinar is recorded and will be shared and made available on demand in the next 24 hours, as well as the slides. If you have any questions, you can use the Q&A section at the bottom of your screen. We will answer those at the end. You can also type them in the chat, and I’ll monitor that. Don’t forget to check out the InfluxDB Community Slack workspace and our forums. Tons of Influxers and other community members in there are also always answering your questions, so you can reach us anytime. With that, I’d like to hand over to our speaker, Pete. Take it away, please.
PETE BARNETT: 01:18
All right. Thanks, Suyash. And yeah, excited to be having the opportunity to walk through this today. Especially, as I said or Suyash said, right, there’s people joining from all over the world. So, I know a lot of different times, some super early, perhaps some super late. So, appreciate you hopping on here. But today, we’re going to be talking about real-time analytics specifically and how you can leverage a time series database. How InfluxDB 3.0 does it, perhaps a little bit differently than what we’ve seen in the past, and some new sort of initiatives coming out of that, as well as just how you can get started quickly and some maybe quick demos, of course, as well, answering any questions along the way. We’re going to try to keep this to a tight 30 minutes. And to get all that in, there’s certainly a lot to talk through. But ultimately, we think it’s going to be a great, great session here.
PETE BARNETT: 02:04
So as Suyash said, though, Pete Barnett. I work in product here at InfluxData. InfluxData, we’re the creators of several different solutions that have been used for years across millions of instances. And InfluxDB, of course, is very well known. Telegraf is another one, which we’ll also be leveraging today for the demo, across other solution sets as well. But again, excited to have a chance to chat with you all today. And so, what we’re going to be talking through briefly is, first off, what is time series data and why it’s important, and what it does, the different types. How it’s stored, so you get a little bit better understanding of the technical reasons why a time series database is a little bit better prepared for these types of questions dealing with time series data. How InfluxDB 3.0 approaches this, and then how you could sort of put it all together. And then, of course, we’ll finish up with the demo and answer any questions at the end.
PETE BARNETT: 02:56
And so, the first question is, what is time series data? For some of us, perhaps we already know. Maybe it’s kind of obvious. For others, it sounds obvious, but when you have to put it into words, it’s a little bit less clear. And so, we’re going to dive into that a little bit more cleanly to start, just to sort of set the baseline. And the first way you can think about time series data is that it’s any sort of data where the key purpose is knowing what happened and what is currently happening at specific instances in time. You’ll want to read that definition really carefully. So, you can think sensor data is a big one. Anything sort of dealing with digital world instrumentation. You can think of, if you’re using a Mac, the Activity Monitor, or on Windows, if you’re using the Task Manager, oftentimes you’ll see those little monitoring observability metrics going up and down, showing how your CPU and RAM utilization are doing. That’s a great sort of instance of time series data. Other types of data points are heart rate monitoring. I’m wearing a smartwatch. And of course, there’s a ton of time series data tracking fitness and tracking just how my overall health is, as well as the experiences I’m having throughout the day. And that is all time series data just coming in rapid succession, consistently with spikes at certain points for certain items. And we’re going to talk specifically about what those two types are as well.
PETE BARNETT: 04:14
And so, when you’re just collecting data at specific sort of periods of time at regular intervals, that’s called metrics. And so, a lot of times, if you’re doing, for example, heart rate monitoring on a smartwatch, that’s going to be taken every several minutes at a regular interval, and you’re going to be able to have that data point coming in. You can think of just being able to track, for example, your CPU utilization and taking a sample perhaps every 10 seconds. It’s another example of a metric of time series data. It’s a very specific type, and it’s also the baseline sort of foundation type that you’re able to stream and understand how that data is being changed over time, where it’s progressing, running a lot of different analyses on that type of data.
PETE BARNETT: 04:57
Now, the second type is called events. And the best way to just think about events is you’re suddenly getting a spike, right? That’s a great example, or maybe something suddenly goes out, or you have a system crash. Those are all examples of when suddenly you have something triggered that sort of breaks across a standard process. It’s something that ultimately happens at an irregular interval and is going to give you a sense of a little bit of what you’re either looking for, or an interesting point along that time series data. Again, we’re talking about our heart rate monitoring. You could have a scenario where suddenly you have a heartbeat just increase for no reason at all. And that’s a definitive example of an event. Maybe you have satellite telemetry and suddenly the satellite goes a little bit off-course from where it’s supposed to be going directionally. It’s another example of sort of an event along that time series horizon.
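To make the two shapes concrete, here is a minimal, hedged sketch of what a regularly sampled metric and an irregular event might look like when written to InfluxDB as line protocol from the Python client. The measurement, tag, and field names, the host, and the token are illustrative assumptions, not taken from the webinar.

```python
# A hedged sketch, not from the webinar: one regularly sampled metric point and
# one irregular event point, written as InfluxDB line protocol.
# All names, hosts, and tokens below are placeholders.
from influxdb_client_3 import InfluxDBClient3  # assumes the InfluxDB 3.0 Python client is installed

client = InfluxDBClient3(
    host="your-cloud-serverless-host",  # placeholder for your Cloud Serverless region URL
    org="your-org",
    token="YOUR_API_TOKEN",
    database="health",
)

# Metric: heart rate sampled on a fixed interval (timestamp in nanoseconds).
metric = "heart_rate,device=watch-01 bpm=72 1698765600000000000"

# Event: an irregular spike recorded only when it happens.
event = 'heart_rate,device=watch-01 bpm=141,alert="spike" 1698765742000000000'

client.write(record=metric)
client.write(record=event)
```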
PETE BARNETT: 05:51
And so, the big question sort of out of this is why is time series data so important? And the ultimate answer is that if we can better understand the past as well as where we currently are, that’s going to enable us to better predict the future. And there’s just so many areas along the way that we can really sort of point to and say, “Having time series data has given us the ability to better model and better understand not just the world around us, but also where the world is going.” Stock markets are a great example, along with any financial modeling in that space, and health metrics. Weather patterns are huge, too, being able to really leverage time series data to quickly adapt and adjust models and better understand what the weather may be, not only now, but also in the future as well. And through those real-life implementations, we can take an example here of this stocks chart right here, where we’re taking a lot of sampled metrics at specific times. We also see some spikes, though, specifically towards the end. And those would be great examples of events, where suddenly the sample of data crosses above a certain specific threshold. Or perhaps you suddenly have an unplanned event, right, maybe within that stock market itself, where maybe the stock market cuts off for the day for certain reasons, which sometimes happens, being another example of an event.
PETE BARNETT: 07:10
You could also look at the top right. Again, also looking at weather patterns and whether or not it’s raining when you’re sampling. A great example of a metric. And in the bottom right as well, having, again, a lot of different spikes, a lot of irregular intervals for a lot of those events across your standardized metrics as well. These are some great examples of implementations, but they exist across so many different areas as well. And again, if you look around yourself, you’re going to find there’s so much time series data actually being used every day.
PETE BARNETT: 07:41
So that’s a lot about what time series data is. But let’s take a little bit more of a deep dive into how you actually store this data. And why does it even matter? Why can’t we just use the standard storage solutions we’ve used for many, many years? Those traditional storage solutions are great for what’s called transaction processing. So that’s Online Transaction Processing, OLTP. And if you think about any of your standard databases that are relational, row-based storage, this is exactly the type of traditional storage solution we’re talking about. And these are great for so many things. If you’re doing anything with transactions and trying to understand specific entries and all the data across them, this is going to be a great type of storage solution. See here for ID, temperature, air quality, cloud coverage: it’s very easy to understand, for each individual entry, what the value is for those specific entries under temperature, air quality, and cloud coverage. And there are other solutions as well, such as document, vector, and graph, which continue to grow, key-value as well, and so much more.
PETE BARNETT: 08:46
But the challenge of row storage for time series is that it’s not great for analytical queries. Because when you’re trying to understand perhaps how the temperature has changed over time, you don’t want to have to also ingest the air quality and the cloud coverage. That’s not going to bring the value that you’re looking for. And all it’s going to do is cause your system to have to go through a lot more data. And we don’t want to have to sort of take that into account when all we want to understand is temperature. And so, in this scenario, if we were to say, “Give me the temperature over the past four entries,” what we’re going to have to do is ingest those four rows and then toss out the data that we don’t actually leverage and use. And that’s just a lot of processing that we don’t need. And for things that are really important when it comes to, again, time, things that need to happen quickly, things that need to happen in milliseconds, that’s how they’re being counted and tracked; unfortunately, going through a row-based storage solution just does not enable the speed and quality that you need in that scenario.
PETE BARNETT: 09:51
And so, what we’ve sort of looked at and what the general approach is when you’re trying to do online analytical processing is columnar storage. And that’s where data is logically organized, sort of in slices for aggregation. And that’s the better approach for being able to look across an entire column a lot easier rather than having to look across all these individual rows and then piece the column together. So, in this scenario, if we wanted to know all about temperature, well, we would just call upon the temperature column. And in that way, we can get all the data we need, easily able to analyze and understand it without having to go into the additional IDs, the air quality, the cloud coverage, etc.
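As a rough, self-contained illustration of that column slicing, here is a Python sketch using Apache Arrow and Parquet (the same columnar building blocks discussed later in this talk) that reads back only the temperature column; the table contents are invented for the example.

```python
# Illustrative only: with columnar storage you can read one "slice" (column)
# without touching the rest of each row. The data below is made up.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "id": [1, 2, 3, 4],
    "temperature": [21.4, 22.0, 22.7, 23.1],
    "air_quality": [41, 40, 44, 39],
    "cloud_coverage": [0.2, 0.3, 0.3, 0.5],
})
pq.write_table(table, "weather.parquet")

# Analytical read: only the temperature column is deserialized; id, air_quality,
# and cloud_coverage are skipped entirely.
temps = pq.read_table("weather.parquet", columns=["temperature"])
print(temps.column("temperature").to_pylist())  # [21.4, 22.0, 22.7, 23.1]
```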
PETE BARNETT: 10:33
And so, with that benefit, what we’re doing is we’re just skipping over the processing portion of these additional columns that, again, don’t bring the value we need because we’re doing analytical queries on a specific type of entry here. In this specific example, temperature. We’re not having to process this other type of data. And when you’re having to deal with systems on the order of milliseconds, where we need to know what is actually happening, if you’re taking seconds upon seconds or minutes or longer to process these queries, you’re not going to be getting the actual solution you need because it’s just going to take too long. And by the time you actually get your result, the system is going to have changed, possibly dramatically. And that’s why, when getting down to those nitty-gritty details on how you can best optimize for fast analytical queries, it’s best to move to a solution that can handle that extremely well. And that’s why we leverage columnar storage in a time series database.
PETE BARNETT: 11:31
And so, let’s take a deeper dive here at monitoring, right? And this is a great example of some real data that was leveraged in the past for CPU monitoring. And these metrics are, generally speaking, looked at not together but individually, right? You’re not trying to understand how your one CPU core, your free memory, and your current network lag all relate. You’re not trying to look at all those together. And they’re also different units; they’re not the same types. And so, if you’re trying to understand what your free memory is, right, or what your memory utilization is, you’re not going to want to know what the CPU was at that same time. You may want to be able to cross-reference those at some point, but generally speaking, you’re trying to track one at a time in that specific scenario.
PETE BARNETT: 12:15
And so, if we’re looking at these certain entities, again, memory is going to be based upon certain memory sizes. CPU is going to be based on utilization. Cloud is going to be based on time. And there are just lots of different sort of measurement difficulties in trying to group these all together into a single view when you don’t need to leverage all of those together in the analytical process. You don’t want to combine all of these. And so that’s why a traditional storage approach is not a great solution here.
PETE BARNETT: 12:47
Now, this type of data, though, is perfect for analytics. This is exactly the scenario where you want to be able to look at data as it changes over time. And given that, with the CPU solutions, we can very easily say, “Give us the time so we can correlate that,” and also understand what our CPUs are doing on each individual item set, but not have to process what your memory, your network, or the specific IDs are across that time as well, because it’s just not going to provide value to us, but it is going to slow us down, and considerably. So, for that reason, we leverage this columnar solution to really give a better analytical process to how you can assess and view this data.
PETE BARNETT: 13:29
Now, if you were trying to take a slice of an understanding for what the entire system was doing at a specific point in time, maybe your system goes down, and then you want to say, “Give me all the information at that very specific point in time; what was my system doing?” This is a great scenario where leveraging a traditional storage approach works. And the good thing about traditional storage and columnar storage is that you can still run both sorts of processes on them; it’s just a question of in which scenarios you want the standard approach, because if you need to be time-critical, and you have moments where time distinctly matters, how you set up your system is very important. And so, you can still come in and say, “Give me what that information was at a specific point in time across the entire system.” It’s just a matter of where you are leveraging your database the vast majority of the time. And what type of processing are you doing? Are you doing OLTP or are you doing OLAP? That’s really where the big difference comes in.
PETE BARNETT: 14:25
And those are the best practices in use. And some great examples are if you’re looking at, what’s my average CPU usage over the last 100 days? One column is the focus. It’s going to be great for time series databases. It’s going to be great for columnar storage. But if you want to know, again, what was my system status the moment it went down, that’s where a row is the main focus. And again, you can leverage these across both as needed. But the main question is, what are you traditionally using it for? Because if you want to do transaction processing on columnar storage long-term, that’s perhaps not the best. And the same applies the other way: if you try to do row-based storage solutions on time-critical analytical data over the long term, it’s not going to be a great solution, and it’s going to give you a lot of difficulty in being able to be effective in analyzing that data.
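For the “one column is the focus” case, here is a hedged sketch of what that average-CPU query could look like against InfluxDB 3.0 using SQL from the Python client; the host, token, database, and schema names (a Telegraf-style cpu measurement with a usage_user field) are assumptions, not taken from the webinar.

```python
# Hedged sketch: average CPU usage over the last 100 days, one column in focus.
# Host, token, database, and schema names are placeholders/assumptions.
from influxdb_client_3 import InfluxDBClient3

client = InfluxDBClient3(
    host="your-cloud-serverless-host",
    org="your-org",
    token="YOUR_API_TOKEN",
    database="monitoring",
)

sql = """
SELECT avg(usage_user) AS avg_cpu
FROM cpu
WHERE time >= now() - INTERVAL '100 days'
"""

result = client.query(query=sql, language="sql")  # returns an Arrow table
print(result.to_pandas())  # requires pandas for the conversion
```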
PETE BARNETT: 15:15
And so, let’s talk about InfluxDB 3.0 and the approach here. So 3.0’s the newest version of InfluxDB, and it builds upon not only the learnings in 1.0 and 2.0 but also adds some additional adjustments to not just dramatically improve interoperability and more, but also really drive forward the ability to have tremendous improvements in those OLAP queries. The first question is, can we get closer to real-time? For analytical queries, especially most recent data, it’s not just about how that data is stored, but it’s also about how you actually ingest that data. And this is a very, very simplified regular ingestion model. But traditionally, you’re taking that data in, pulling it through an ingester, perhaps running some specific functions, queries, adjustments on it. And then you’re sort of pushing it into storage. And then when you make a query, it’s going to have to go through a querier, which will go into storage, understand that data, run it through some processing, and send it back to the user. It’s a very simplified model, but that’s traditionally the general architecture of these types of approaches on how you can not only ingest data but then also query it as well.
PETE BARNETT: 16:25
But specifically, what we’ve done in 3.0 is enable the ability to leverage an in-memory buffer. Really, query from ingest itself. And so as soon as data gets into an ingester, we can immediately make it available for querying before it even goes to storage. And this is really enabling the ability to go much faster into real-time so that you don’t have to wait for it to be ingested and stored and then processed from there. But instead, you can just query directly from the ingester. And again, this is in memory, so it’s incredibly fast. It enables dramatic improvements to your response times, and ultimately, being able to get closer to real-time analytics. And that data still is persisted in long-term storage, but it’s not gated on long-term storage before it’s able to actually be queried.
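A minimal sketch of what that means in practice: a point that was just written is queryable right away, because it can be answered from the ingester’s in-memory buffer rather than waiting on long-term storage. The client calls, host, token, and measurement names here are assumptions for illustration, not part of the demo.

```python
# Hedged sketch: write a point, then immediately query the leading edge.
# In InfluxDB 3.0, the fresh row is answerable before it has been persisted
# to long-term storage. Names and credentials are placeholders.
from influxdb_client_3 import InfluxDBClient3, Point

client = InfluxDBClient3(
    host="your-cloud-serverless-host",
    org="your-org",
    token="YOUR_API_TOKEN",
    database="demo",
)

# Write a single point with the Point builder.
client.write(record=Point("cpu").tag("host", "macbook").field("usage", 42.5))

# Query the most recent minute right after writing.
recent = client.query(
    query="SELECT * FROM cpu WHERE time >= now() - INTERVAL '1 minute'",
    language="sql",
)
print(recent)
```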
PETE BARNETT: 17:15
And as noted there, right, that’s Apache Arrow. And we’re really building on the entire FDAP stack to enable tremendous interoperability. And so leveraging Apache DataFusion, Apache Arrow as well, Apache Parquet, which is the new file format, not only is a great columnar storage solution, but so many other solutions also interact very well with Parquet and being able to leverage that for storing, querying, and expanding your database operations across many different tools as well. And so, for that reason, I think just having this interoperability framework is critical for, again, grabbing that analytics and understanding how you can leverage it across many different tool sets, even beyond InfluxDB.
PETE BARNETT: 18:00
And one of the biggest adjustments we made was simplifying with SQL. In the past, we’ve looked at many ways of how you can really interact well with columnar databases, how you can interact well with time series storage solutions. And one of the biggest adjustments with 3.0 is really just bringing back native SQL to InfluxDB. And we even built on that with what we’ve had since 1.0.x, which is InfluxQL, which is a very SQL-like language but just has a few adjustments that simplify it as well for things that are very likely to be used across queries and things specifically for time series databases. And so, for that reason, just bringing that native SQL means a much lower barrier to entry to understand and get started with InfluxDB, but also still bringing forward InfluxQL to ensure that you can have that even more seamless approach once you really understand how that language works in that process.
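As a small illustration of the two dialects side by side, here is a hedged sketch of the same question asked in native SQL and in InfluxQL through the Python client’s language parameter; the host, token, and measurement and field names are assumptions.

```python
# Hedged sketch: the same question expressed in SQL and in InfluxQL.
# Names and credentials are placeholders.
from influxdb_client_3 import InfluxDBClient3

client = InfluxDBClient3(
    host="your-cloud-serverless-host",
    org="your-org",
    token="YOUR_API_TOKEN",
    database="demo",
)

# Native SQL (new in InfluxDB 3.0).
sql_result = client.query(
    query="SELECT time, usage FROM cpu WHERE time >= now() - INTERVAL '1 hour'",
    language="sql",
)

# InfluxQL (carried forward from InfluxDB 1.x).
influxql_result = client.query(
    query="SELECT usage FROM cpu WHERE time >= now() - 1h",
    language="influxql",
)
```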
PETE BARNETT: 18:53
And so, when you put it all together, right, what do you get? I mean, the improvements and the ROI are just tremendous when you look at analytical queries using columnar storage for real-time performance. And based upon which type of system you’re using, you can get far better results that just dramatically enable your ability to not just be near real-time but actually take that action needed for these different approaches. And of course, it changes depending upon what type of system you’re looking at. It’s also dependent upon, again, how much data you’re actually processing, what type of data you’re processing in that space, and whether you’re using OLTP or OLAP. But assuming you’re using that analytical processing, having a columnar storage solution, especially leveraging all these additional add-ons that we’ve included with InfluxDB 3.0, it’s just a tremendous improvement that is going to dramatically improve your ability to do this type of processing and get closer to real-time.
PETE BARNETT: 19:48
And that’s also going to improve your entire real-time data workflow because columnar storage is not the only approach, and it’s not the only sort of tool in the toolbox here. There are still, again, so many other storage solutions that work very well for so many different scenarios. Again, Postgres-type approaches and having these relational databases, vectors which continue to grow, graph databases as well, all have their place. And we really feel that columnar storage is another solution that works great for that analytical space and can blend really well. It’s why the interoperability is so important, and it’s why we’re able to really drive that value downstream by having all these different pieces working well together.
PETE BARNETT: 20:30
And so now we’re going to dive into a quick demo for 3.0 before we hop into questions. I know we don’t have too much time left, so we’re going to try to keep this demo very quick. And from there, we’ll answer some questions. So how do you get started with InfluxDB 3.0? The fastest way to get started is going to be through our cloud serverless offering, which starts for free. And you can get started very simply. If you go to influxdata.com, you’ll very easily see, first off, our new website. Great shout-out to our marketing team for putting this together. And if you go to— I’m already signed in. So, if you go into InfluxDB Cloud 2, that’s where you can get started. If not, you can just click Start Now, and that will get you logged in.
PETE BARNETT: 21:15
And so, this is your landing page for InfluxDB 3.0 Cloud Serverless Solution. And I’m going to have a couple of data points to walk through here, but just a quick overview is that this is a scenario where not only we’re trying to guide you through how to sort of create and leverage this type of analytics processing, but also show you the value of it downstream and how you can very easily ingest lots of different data and do it quickly. Many of the solutions here are sort of pre-built, and you can very easily ingest lots of different collectors, such as Telegraf, and sort of spin that up very quickly. We even create tokens and give you the right sort of commands to make that a much faster solution.
PETE BARNETT: 21:53
On the left, you’ll see the visibility to load data in. Again, lots of different ways. If you want to use a file upload or a line protocol, you can. But we have so many client libraries and plugins available to just make this a very seamless solution. I was able to get my MacBook set up for ingesting this data we’re about to see in probably about two minutes or less. So, you can get started across this entire process in probably under five minutes, which is really cool to be able to start analyzing your actual systems and understanding that data. We also have the ability to, again, of course, explore that data. And we’re going to hop into that now as well. Where, again, this is sort of my data explorer, where I’m looking at this Mac data that I’ve been ingesting since a little bit earlier this morning, and we’re looking at it sampled over every five minutes and being able to track how the CPU has adjusted over time.
PETE BARNETT: 22:42
So, if we go ahead and make another sort of call here, we’ll see that things have kind of spiked up ever since I hopped onto this webinar. I wonder, if we go into the past hour, whether we’ll suddenly see a little bit more of a gap long-term. And that’s ever since I’ve sort of hopped onto Zoom here. But also having that ability to dive in and say, “Hey, where have I been the past five minutes?” And if I see perhaps a specific spike along the way, being able to quickly dive in and understand a little bit more what’s going on there, what that field is, what that actual host is as well. And these are some great solutions for, again, really understanding what that sort of tracking is doing over time. I can, of course, view that raw data as well. And if I want to, I can even go to my schema browser, which can use, again, that SQL sort of syntax for understanding, again, how your data is transforming, being used really across this entire data set here. We’ll see that, right now, if I were to pull this back—turned off my SQL sync here—that I would be able to sort of select across all these different buckets and CPU measurements, as well as other different pieces that I’m collecting from Telegraf, a lot of just different data sets that I’m currently pulling in.
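For readers who want the kind of query that sits behind that explorer view, here is a hedged sketch of downsampling a Telegraf-style cpu measurement into five-minute windows with SQL’s date_bin(); the host, token, and schema names are assumptions rather than the exact query used in the demo.

```python
# Hedged sketch: bucket CPU usage into five-minute windows over the last hour.
# Measurement and field names follow Telegraf's cpu plugin and are assumptions.
from influxdb_client_3 import InfluxDBClient3

client = InfluxDBClient3(
    host="your-cloud-serverless-host",
    org="your-org",
    token="YOUR_API_TOKEN",
    database="demo",
)

sql = """
SELECT
  date_bin(INTERVAL '5 minutes', time) AS window_start,
  avg(usage_user) AS avg_cpu
FROM cpu
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY 1
ORDER BY 1
"""

print(client.query(query=sql, language="sql").to_pandas())
```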
PETE BARNETT: 23:52
And then being able to, again, visualize that data specifically. If you want to go through more of a traditional SQL table solution, you can very easily do that by— or excuse me, relational database sort of viewing and see the specific entries for every individual entry along the way. There’s a lot more to explore as well with dashboards, where you can set up metrics and a lot more downstream with some other solutions. And so, I highly encourage you to leverage this as a sort of getting started solution. Again, in under five minutes, you can probably get started completely free. And you can quickly monitor your own system. You can sort of monitor external data. If you want to put it on an IoT sensor, it’s a great spot, and just push that data into Cloud Serverless. Lots of different ways you can get started. And again, we think it’s a great solution overall. And hopefully, you can get started very easily with InfluxData in this space, with InfluxDB. So, I am going to pause there. I know it was a brief demo, but hopefully, that just lets you know where you can get started. Again, influxdata.com. And we’ll now answer any questions we may have. Suyash, I’m not sure if you’re viewing the question sets here. I do see a few things coming in, but happy to answer as well along the way.
SUYASH JOSHI: 25:06
Thanks, Pete, for that excellent presentation. And yes, there have been a lot of questions. It’s active. So, if you want to have a look, and I’ll read it to you, and you can just answer them. So, the first question is, will the recording be available somehow? I can tell you, yes, we are recording this, and we will email it to you, so you should get it by tomorrow along with the slides. Second question, Dean asks, “Is there a migration tool or other information of moving from 1.8 to 3.0?”
PETE BARNETT: 25:44
Yeah. That’s a great question. For migration tooling, yeah. And that’s something that we’ve been looking at in many different ways. So, for 3.0, I would certainly encourage you to contact someone at InfluxDB, perhaps Suyash himself, or reach out on our community channels on what that process is. There’s a lot of different approaches you can take. Some of them are sort of dual writing for a bit. Some of them are just actual migration processes that we’re looking into now, depending upon what fits best for your type of scenario. And there’s a lot of different ways that we look at migration and making that a seamless, easy process, and how we can ensure that you’re not having to really set up these sort of whole new architecture pieces on your end. How can we make it a very seamless approach? And the best way is to, again, just look at what makes sense for each individual person’s approach. So again, I would highly encourage you to contact either someone at InfluxData ourselves using the Contact Us button, or you can reach out to our community, whether it’s the Slack community; there’s also our forums as well on, “Hey, this is my specific scenario on 1.8,” it sounded like, and then you want to move to 3.0, and what makes sense for the size of data that you have. That makes a tremendous difference. How long you’ve been leveraging it as well makes a big difference on your best approach. And then, of course, there are some approaches that we can look at for sort of the best scenarios in each individual area.
SUYASH JOSHI: 27:04
Yeah. One thing I’ll add is that Pete showed two languages that you can query. Nice thing is that InfluxDB 3.0 supports InfluxQL as well. So, the migration, it should be very little work on your behalf, if any. And it’s a nice idea to test it out for limited data and see how everything goes. But happy to answer and help you if you have any questions further on that. You can ask in our forums, or you can even reply to the email that we will send you with the slides and recording. Another question that popped up is any differences between Influx 2.0 stream events and InfluxDB 3.0?
PETE BARNETT: 27:54
Yeah. So, there’s some big differences between the two systems in general, right? And when we’re looking specifically at Influx 2.0— so my understanding here is we’re talking about Influx 2.0’s ability to ingest events, I would assume, and how InfluxDB 3.0 ingests events. So, there are some differences. Again, we’re using different file format structures, and the actual architecture of ingestion is a little bit different. But the actual process is, again, quite straightforward and similar. With line protocol, we’re going to support a lot of the same sort of ingestion protocols across v2.0 and v3.0. And so, you shouldn’t expect to see any major changes along those different processes. There is, of course, the question of how you sort of query that data, which can be a little bit different depending upon what your query language has been in the past. But when it comes to being able to ingest events, if you’re setting up for, again, either something like Telegraf or specifically Telegraf itself, right, we’ll sort of handle that all within the system itself. And you don’t have to worry about how you set up the ingestion process for 2.0 or 3.0. Specifically with Cloud Serverless, that’s all sort of handled by the actual Telegraf plug-in setup itself.
SUYASH JOSHI: 29:06
So, we might be going a little bit over time, but that’s, I guess, okay. We can just take a few last questions. One other is: InfluxDB Cloud is charging, for query count, 0.12 cents per 100 query executions. Is this regardless of query cost?
PETE BARNETT: 29:28
I’m not entirely certain what that question is getting at. I’m assuming it’s assessing if the query is very difficult; is that a more complex cost? And the answer to that is no. It’s a very sort of straightforward cost. If you make a query, this is the cost across 100 different executions, regardless of how complicated that query is.
SUYASH JOSHI: 29:53
And Abby, I posted a link to our pricing page that goes into detail on the pricing. So have a look at that as well. Tang asks, “Do you have a demo for stock market analysis?” I can take this one. I wrote a blog just a few days ago, so great timing, with a stock market web app. So, I will link that and send it to you. And the app, I’m going to further improve it. But the code is on GitHub. It’s a web app. It does data collection from the stock market, stores it in InfluxDB. You can query that. So, I’ll share that. Somebody asks, “Can you show how to transfer a Flux query to a SQL query?”
PETE BARNETT: 30:42
Right. Yeah. So right now, Flux and SQL are not directly compatible. And we’ve sort of looked at different ways that we can enable, again, a lot of the different functionalities in 3.0 that Flux sort of brought to 2.0. And for all those people who sort of loved what Flux is, we’ve got some great updates, I think, coming downstream in that type of area. But right now, Flux and SQL are not directly compatible. There are different solutions out there, of course. We find that a lot of these machine learning models, or these LLMs, rather, are very good at being able to translate some specific types of queries to SQL. So would highly encourage looking there to start. And for those that aren’t, again, there’s a lot of different ways that we– the reason why we chose SQL is because it’s so ubiquitous, easy to pick up, and ultimately extremely powerful. And so that’s sort of why we’ve gone that route. But it is not directly compatible with Flux in terms of a direct translation process.
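To make the discussion concrete, here is a hedged, hand-written example of the kind of translation in question: a simple Flux query from InfluxDB 2.x and an approximate SQL equivalent for 3.0. It is not an automatic conversion, and the bucket, measurement, and field names are assumptions.

```python
# Hedged, hand-written translation example; not an automatic Flux-to-SQL tool.
# Schema names (monitoring, cpu, usage_user) are assumptions.
#
# Flux (InfluxDB 2.x):
#   from(bucket: "monitoring")
#     |> range(start: -1h)
#     |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user")
#     |> mean()
#
# Approximate SQL equivalent (InfluxDB 3.0):
sql = """
SELECT avg(usage_user) AS mean_usage
FROM cpu
WHERE time >= now() - INTERVAL '1 hour'
"""
print(sql)
```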
SUYASH JOSHI: 31:34
Yep. I think that answers another question about somebody using Influx 2.7, and they were asking if there was a Flux to SQL transformation. So best way I would suggest, use LLMs; like Pete said, it should automatically help you. Give you the corresponding SQL, and that should make your life easier, hopefully. One other question, Pete. This is a good one for you, I believe. When can we expect Influx Edge?
PETE BARNETT: 32:11
Ah, InfluxDB Edge. Yeah. Yeah. Good news on that. That one’s right for me; I’m actually the product manager for that product as well. And so, it’s something that we’ve been working really hard on. I know there’s a lot of interest in this product, and we have some really good engineers working on it. I believe, in our last sort of roadmap webinar, the timeframe that we were looking at and communicated was January of next year. And I would expect that we’ll have a little bit more communication out about it probably later this year. So, I would certainly stay tuned. Probably in the December timeframe is when you can start to expect a lot of the initial alpha pieces and getting-started pieces. I’ve had a chance to obviously work with it and am really excited about what’s coming. And so highly, highly encourage just sort of staying tuned. And soon here, we’ll have more information.
SUYASH JOSHI: 33:07
Yeah. We’re really excited about that. And I hope you enjoyed the webinar today. Thank you for all the questions and the chat discussions. And I’ll be sending out an email tomorrow with slides and recording. Feel free to reply if you have any questions or find us again in our community Slack and forum. Again, thanks again, everyone, and have a great rest of the day. Thank you, Pete.
PETE BARNETT: 33:32
Thank you, Suyash. Thank you all.
Peter Barnett
Senior Product Manager, InfluxData
Peter Barnett is a Senior Product Manager at InfluxData, where he guides the development of new InfluxDB solutions. With more than eight years of experience in software engineering and product management, his expertise lies in data analytics and time series products. Peter previously was the Director of Product at a Series B startup and worked as a software engineer for a Fortune 500 organization. With multiple degrees in technology and business, Peter is passionate about solving complex problems and delivering value through innovative, customized solutions.