Simplify Stream Processing with Python, Quix, and InfluxDB
Session date: Oct 03, 2023 08:00am (Pacific Time)
Quix enables event-driven architecture by combining microservices with event streaming. With Quix, software teams of any size can build and release event streaming apps with AI features that scale, in no time. You can get started in minutes without the complex setup or configuration that comes with managing containers and running your own message broker. Code in pure Python with your favorite libraries and deploy with built-in DevOps best practices to manage your pipeline and monitor your system.
InfluxDB is the purpose-built time series database. A lot of data in streaming applications (i.e. IoT, finance, user behavior analysis and manufacturing) is time series data. Discover how you can become proactive by using Quix and InfluxDB together.
Join this webinar as Tomáš Neubauer dives into:
- Quix’s approach for making stream processing easier
- How the Quix Streaming DataFrames (SDF) library and its tabular format for data processing unlocks the full potential of real-time analytics
- How developers can iterate faster than with a traditional CI/CD pipeline
- How to quickly deploy production-ready applications in Quix with InfluxDB
Watch the Webinar
Watch the webinar “Simplify Stream Processing with Python, Quix, and InfluxDB” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
Here is an unedited transcript of the webinar “Simplify Stream Processing with Python, Quix, and InfluxDB”. This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcription errors.
Speakers:
- Caitlin Croft: Director of Marketing, InfluxData
- Tomáš Neubauer: Co-founder and CTO, Quix
Caitlin Croft: 00:00:00.685 Hello, everyone. And welcome to today’s webinar. I’m very excited to have Tomáš here, as well as Tun from the Quix team. And they will be talking about how to simplify stream processing with Python, Quix, and InfluxDB. This webinar is being recorded and will be made available by tomorrow morning. And the slides will also be made available. Please post any questions you may have for Tomáš and Tun at the bottom of your Zoom screen. You can find the Q&A box. So we’ll answer all the questions at the end. So we’re really excited to have them here. And without further ado, I’m going to hand things off to Tomáš.
Tomáš Neubauer: 00:00:47.933 Thank you. Thank you very much. So welcome, everyone. So today, I want to talk about Python, Quix, and Influx, and how to use these technologies in stream processing and building event streaming applications. So let me introduce myself. I’m Tomáš Neubauer. Currently, I’m CTO and co-founder of Quix. But I want to share with you my previous experience, where I actually got into, well, Influx, streaming, and basically this whole discipline of computer science. So before Quix, I was working at McLaren. I was working on telemetry data that was recorded and streamed from F1 cars. And everything started when basically a very cross-functional team wanted to use data coming from the car. Different domain experts, like mechanical engineers from [inaudible] or data scientists and ML engineers, wanted to build models and train on the data. And we were using an in-house proprietary time series database that was really not fit for purpose for these professionals to use to extract the data we were recording from the cars. So I was trying to find a technology that would basically be more accessible, where there would be a community, data science and visualization tools, et cetera, so they could actually get the data into their tools.
Tomáš Neubauer: 00:02:33.542 And the use case was quite challenging: each car produced roughly 50,000 different parameters. You can think of them as 50,000 different columns. And the number of rows for each car was around 30 million per minute. So we tested a lot of databases. And most of them actually failed to persist and query this amount of data. And because this was basically time series data, we obviously looked into time series databases and tested different technologies, and we ended up with Influx, because Influx was able to persist this amount of data without really blowing up the disk space, while also allowing us to, for example, visualize very high-resolution parameters, because some of the parameters were more than one kilohertz in resolution. That means more than 1,000 different numbers per second for just that one parameter. And if you look at a session that is, for example, two hours, and you load all the parameters into, for example, a waveform view, you will destroy your front end. So this was basically the first step. We found a database that would be great to use in this use case.
Tomáš Neubauer: 00:04:00.765 But we had to get the data into the database. And this was basically when I got into streaming and technology like Kafka, because the format we were getting this data in from the track was a very compact binary format called quartz. That’s because the track is connected over a very expensive, dedicated connection, because some tracks are in the desert and in different locations around the world. So bandwidth to the track was quite precious.
Tomáš Neubauer: 00:04:35.224 And so basically, this was where we realized we had to build a stream processing pipeline where we get data into the broker, and then we decompose the quartz format into engineering values. And then we use another service to sink this data into the Influx database. So that basically was where I started combining these technologies together and learned how to use them. So today I want to discuss different architectures, how to build such architectures, and how to use the Influx database together with Kafka, Kubernetes, and Python. I also want to show you that in a demo, so we’re not just going to stop at theory. We’re going to do some machine learning today. We’re going to use TensorFlow. And we’re going to try to detect some patterns in sensor data coming from my phone. So this is InfluxDB 3.0 that we’re going to use today. I have an account in the cloud. And I have recorded some data into it. And I’m also going to record a bit more.
Tomáš Neubauer: 00:05:54.128 So what is an event streaming application? It is a system that uses some broker, usually Kafka, but there are other brokers as well, and microservices that connect to this broker. And it processes data as a stream of events. It could be sensor data, like g-force data from the car or GPS location from the car. It could be, when you are in a banking app and clicking on buttons, detecting the user journeys in your software. Or it could be financial data, like stocks, selling, buying, or balances and transactions in a bank account. It also contains, and this is a very important part, data pipelines. So apart from your microservices that are connecting to Kafka, maybe exposing some APIs, some front end, you’re also going to have data pipelines where you ingest data from your sources. You somehow transform them, which is what I’m going to do today, and sink them either back into your application or possibly into some database, again, as in today’s example.
Tomáš Neubauer: 00:07:10.302 So here, we have a screenshot of the demo that I’m going to show you, where we’re getting data from the phone. We split it into two different streams, because we have data from the GPS sensor, and we have data from the acceleration sensor. And we’re trying to keep the measurements dense. That’s one of the things I learned at McLaren. It’s very important to keep your measurements full of data. That means that you don’t have a lot of cells that are empty. So basically, you’re trying to keep compact measurements in your database. Now, how can you build such an application? There are multiple approaches, and each has pros and cons. The first way is to use the broker’s client libraries, Kafka in our case. There is quite a variety of languages supported by well-maintained client libraries for most of the brokers, so you can use the consumer-producer APIs directly. Which basically means you’re going to have to combine Kafka with some orchestration for your microservices, like Kubernetes. The second option is to adopt a full-fledged stream processing framework like Flink or Spark Streaming. And then you have to combine everything you’re combining in option number one plus another cluster, like a Flink cluster and jobs running in it.
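The first option, consuming and producing with the broker’s client library directly, can be sketched roughly as below. This is a hedged illustration, not the demo’s code: it assumes the confluent-kafka Python client and illustrative JSON field names (`gForceX` etc.). The per-message transform is kept pure so it can be reasoned about without a broker; the loop around it is the part you own when you skip a framework.

```python
import json


def transform(raw: bytes) -> bytes:
    """Stateless per-message transform: raw g-force components to magnitude.

    Field names are illustrative assumptions, not the demo's actual schema.
    """
    msg = json.loads(raw)
    mag = (msg["gForceX"] ** 2 + msg["gForceY"] ** 2 + msg["gForceZ"] ** 2) ** 0.5
    return json.dumps({"timestamp": msg["timestamp"], "magnitude": mag}).encode()


def run(consumer, producer, out_topic: str) -> None:
    """The consume-transform-produce loop you maintain yourself with option one.

    `consumer` and `producer` would be confluent_kafka.Consumer / Producer
    instances; error handling, commits, and retries are elided.
    """
    while True:
        event = consumer.poll(1.0)
        if event is None or event.error():
            continue
        producer.produce(out_topic, transform(event.value()))
        producer.flush()
```

Everything beyond the transform (offsets, rebalances, delivery guarantees) is what the rest of the talk argues you should not have to build by hand.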
Tomáš Neubauer: 00:08:48.488 So when you go the first way, it is quite an elegant solution if you stick with stateless operations. What that means is that your data pipelines just process message by message. So there is no dependency between the messages coming in the stream. What that actually means: well, if I’m going to send the acceleration applied to my phone, and I’m just going to look at each message, then it’s a stateless operation. If I’m, for example, looking at a rolling window of five seconds of forces applied to my phone, then I’m working with a stateful operation. And this is where it gets very complicated if you use this approach, because stateful stream processing is one of the hardest software engineering exercises you can endure. And I would really not recommend trying to build it by yourself unless you really want to spend a lot of time on it. The second problem with this approach is that you’re going to have to connect these technologies together. So you’re going to have your database, like InfluxDB. You’re going to have your broker. And you’re going to have your Kubernetes. And you’re going to have to connect all of that together. You have to manage your CI/CD pipelines to build your code. And you release your code to containers. You have to build your Dockerfiles, your Helm charts. And then you have to monitor and observe your services, your logging solution, et cetera. So that’s a lot of legwork before you can actually start building your code.
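A toy version of the five-second rolling window makes the stateless/stateful distinction concrete. This is a sketch of the idea only, not how Quix or Flink implement it: the operator below keeps a window of history in memory, and that in-memory history is exactly what a restart loses, which is why real stateful processing needs checkpointing and recovery.

```python
from collections import deque


class RollingMean:
    """Stateful operation: mean value over the last `window_s` seconds.

    The deque *is* the state. If this process restarts, the window is
    gone -- a toy illustration of why stateful stream processing is hard.
    """

    def __init__(self, window_s: float = 5.0):
        self.window_s = window_s
        self.events = deque()  # (timestamp, value) pairs inside the window

    def add(self, ts: float, value: float) -> float:
        self.events.append((ts, value))
        # Evict events that fell out of the window
        while self.events and ts - self.events[0][0] > self.window_s:
            self.events.popleft()
        return sum(v for _, v in self.events) / len(self.events)
```

A stateless operation, by contrast, would be any function of the single incoming message alone, with nothing to restore after a restart.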
Tomáš Neubauer: 00:10:37.546 And that’s why you might be tempted to go with the second option, which is a stream processing framework like Flink, which will give you a very neat solution for your data pipelines. So you will get stateful operations, a bunch of built-in functions, a quite good community if you use Flink or Spark, for example, and CI/CD, logging, and monitoring for your data pipelines. It will not give you, though, a solution for your back-end and front-end services. You still need to host your microservices somewhere. So you will not get out of managing something like Kubernetes or a similar orchestration framework. And you also get some unfortunate disadvantages with this solution. So the moment you introduce such a framework to your architecture, you increase its complexity by an order of magnitude. You’re going to be dependent on Java, because these frameworks are built in Java. That might be a problem, it might not. But it’s important to take that into consideration, because although there is, for example, PyFlink and PySpark, they are not really first-class citizens in this community and in this ecosystem. And you will eventually run everything in Java anyway. And then there are some problems coming from the fundamental architecture of these systems, where you’re actually not running your code. You build your code in a DSL that then runs on a server.
Tomáš Neubauer: 00:12:17.734 So the first thing is, when you’re going to build something that uses, for example, machine learning, like we do today, and you want to use Python, you will still get these problems with the Java files, because under the hood it’s Java. And if you go to Stack Overflow, you will see lots of questions like this. Then the second problem is that when you’re using the first option, when you’re using client libraries, you might be using the Influx client library in Python, for example, which is maintained by the Influx community. And it’s up to date with the latest version of Influx. And it’s all nice. The same with Kafka: you are using the Confluent-maintained client library. So you get all the latest changes. That’s not the case with Flink, because you’re not connecting Kafka or Influx to your code, but to Flink. So you have to rely on the sinks and connectors in the Flink ecosystem. And they are quite often behind and not supporting every configuration, maybe the less frequent ones, like, for example in Kafka, Kerberos and these exotic settings. And then with Flink, the problem is that if you want to use the Flink SQL layer, you might be tempted because you have used SQL with Influx, and this is really where SQL shines, when you’re querying data. But when you’re trying to build a processing pipeline, which is real-time, you need to build application logic. And this is where you quickly grow out of the SQL functions, and you need to do UDFs, user-defined functions.
Tomáš Neubauer: 00:14:06.298 And the problem with that is that, because of the Java under the hood, even though you think you’re building them in Python, you’re actually not. You’re registering a Python function that then runs in a Python environment side-by-side with the Java environment. And this is the official documentation. And you can imagine, if something goes wrong, you can just hope for a readable exception. So that’s one problem. The ultimate problem for me is that the most powerful tool for a developer is debugging. And that’s what you will not get with the server-side engines because, again, you’re not running your code. You’re writing a DSL, which then converts into code that runs on a server. So going back to our original options, regardless of which one you choose, actually, the big problem with building such architectures is that there’s a massive lead time before you actually get to building the business logic, the project, the application that you’re trying to achieve.
Tomáš Neubauer: 00:15:23.986 And these numbers are actually significantly lower than what I experienced at McLaren, because they are tailored to a less complicated project. But at McLaren, we actually had 25 people in a platform team for more than eight months. It was almost two years. And we still hadn’t actually got to the desirable state where all the different cross-functional teams could develop these pipelines and work with the data without any friction. And we still had inter-team dependencies, like data scientists waiting for an SRE to deploy a new version of the model or to get logs from Kubernetes. It wasn’t quite there. And so this is basically where Quix comes into place. And it is a tool to skip this first part of the journey, where everything gets connected in. So that means your infrastructure, your Kubernetes, your Kafka, your Git, your databases, like Influx, all working together from day one, so you can actually focus on your code, on your Python, so on your business logic and on your SQL queries, and not on how to get this data into this infrastructure, how to connect to this topic, and how to build a CI/CD pipeline that will build the Python code into a container. All of that is covered by the platform. So I’m going to show you that in a second.
Tomáš Neubauer: 00:17:04.374 Then the way we’re going to basically use Influx and Quix today is this: we’re going to use Quix to process the data before we land it in InfluxDB. And then we’re going to use Quix to query the data in Influx to train a machine learning model. And I’m going to show you the whole lifecycle of that operation in a demo. So the last bit that I want to talk about is the Quix Streams library, because we’re not going to use server-side engines. But we’ll still need to process data in the data pipeline. And this is where we basically developed a new way of processing data without learning a new language or a new DSL. So we looked at the Python community. And basically, there is one big common denominator that everybody knows and uses. And it’s pandas. It’s such a prevalent technology, or library, in the Python ecosystem. And millions of people around the world use this library to work with data which is stationary, whether that is a CSV file or basically just data in a tabular format in your Jupyter Notebook. And we thought, because stream processing is a new paradigm and actually has a learning curve, because you have to adopt a different way of thinking, what we set out to do, and what we achieved, is to use the same interface for stream processing, so you don’t have to learn a new way of thinking when you’re building your app. It’s the same interface, as if it were batch.
Tomáš Neubauer: 00:19:00.065 So here’s an example of the code. So for example, here in the middle, we have a projection of columns. So we’re selecting three columns. Then we have filtering, as we would do in a Jupyter Notebook. And then here we have an example of a rolling window, exactly the same API. You can basically copy-paste it from static analysis into streaming, and it’s going to work. But obviously, under the hood, it’s a lot of engineering to make it happen, because in streaming, data are not in your code, not in memory. Data are flowing to your code from left to right. And you just get it piece by piece. And that means a lot of engineering under the hood to take care of checkpointing, state management, and message serialization and deserialization, all together, to build reliable, resilient, and scalable data pipelines. Cool, so that’s it for the theory.
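The three operations he names (projection, filter, rolling window) look like this in plain pandas on static data; the claim in the talk is that Quix Streaming DataFrames accept essentially the same expressions on a live stream. The column names and values here are illustrative, not the demo’s actual schema.

```python
import pandas as pd

# Illustrative static data shaped like the phone telemetry in the demo
df = pd.DataFrame({
    "timestamp": pd.date_range("2023-10-03", periods=6, freq="s"),
    "gForceX": [0.1, 0.2, 2.5, 0.1, 3.0, 0.2],
    "gForceY": [0.0, 0.1, 1.5, 0.2, 2.0, 0.1],
    "battery": [90, 90, 89, 89, 88, 88],
})

# 1. Projection: select three columns (.copy() to avoid a chained-assignment warning)
df = df[["timestamp", "gForceX", "gForceY"]].copy()

# 2. Filtering: keep rows above a g-force threshold
spikes = df[df["gForceX"] > 1.0]

# 3. Rolling window: 3-sample rolling mean of gForceX
df["gForceX_smooth"] = df["gForceX"].rolling(3).mean()
```

The same expressions written against a Quix Streaming DataFrame would be evaluated incrementally as messages arrive, rather than over an in-memory table.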
Tomáš Neubauer: 00:20:07.908 And let me show you the app in action. So I’m going to try to hide — let me put this at the bottom. So this is the pipeline. And I have here a couple of microservices, a couple of data pipelines, and some sinks into InfluxDB Cloud. So the first thing I’m going to do is start my phone. And here I have an application which is called the Quix Companion App. By the way, everything I’m going to show you today is open source. And you can replicate it by yourself. It’s on GitHub, including the Android app, which you can also download from the store. So I’m going to start. And this will start sending telemetry data, time series data, from my phone. And you can see that everything gets green. That just tells me that there is data. So I can explore it here just to see that everything is going okay. So here we have the stream. And I can go and subscribe to the g-force data. And if I shake, it’s going up and down. There we are. So that’s working.
Tomáš Neubauer: 00:21:37.762 We can also go to messages and explore one of the messages. There we are. So we have gForce columns in this JSON message. And now, what’s happening, if I go back to the pipeline, is that we have here a splitter. The splitter is a super simple Python function that will just take the GPS data and gForce data and split them. So we can sink each into a different measurement. That’s why we’re trying to basically [inaudible] tables. And here we have a g-force sink, which is basically a sink that folks from Influx created in our library. So if you go to code samples and you search for Influx, there we are. So we have the destination and source. So this is an example of a destination where basically data from a topic is being sunk to a measurement in InfluxDB 3.0 Cloud. So you can obviously look at this code in our GitHub. It’s open source. And pull requests are very welcome if you want to improve these items in the library. The moment they get merged to main, they will appear in our platform.
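A splitter along the lines he describes might look like the sketch below. The field names are assumptions for illustration; the actual splitter lives in the demo repository. The point is simply that each output record carries only the fields for its own measurement, which keeps both InfluxDB measurements dense.

```python
import json


def split_message(raw: bytes):
    """Split one combined phone message into separate GPS and g-force
    records, so each can be sunk into its own dense measurement.

    Field names here are illustrative assumptions, not the demo's schema.
    """
    msg = json.loads(raw)
    gps = {k: v for k, v in msg.items()
           if k in ("timestamp", "latitude", "longitude")}
    gforce = {k: v for k, v in msg.items()
              if k == "timestamp" or k.startswith("gForce")}
    return gps, gforce
```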
Tomáš Neubauer: 00:23:04.633 So going back, I will now open the InfluxDB Cloud console and show you what it looks like. So here I have the measurement g-force. And if I select the three columns, there we are, now you can see — I’m just going to have to zoom. Sorry. So now if I zoom into this part, you can see data being saved into Influx successfully. Cool. So now we have this data. And what we want to achieve here is to use the history data, saved in Influx, and we have lots of it, to train the model to detect shaking. Now, the original idea was not to detect shaking, but a crash of a cyclist. But here, we’re not really going to crash a bike. So we have to improvise. So we’re going to detect shaking using a TensorFlow model. So here I’m going to switch to a Jupyter Notebook. And here I have code that is basically just a copy-paste of a connector that folks from Influx created in our library. And it’s getting the data from that same measurement I showed you. So here you have select from g-force. And we’re going into a specific session that we have recorded to train our model. So here’s the code, super simple. It’s using the InfluxDB 3.0 client.
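The query step he shows can be sketched roughly as below. The measurement, tag, and column names are assumptions based on the demo; the influxdb3-python client usage is shown only in comments, since it needs a live InfluxDB Cloud account and credentials.

```python
def build_training_query(session_id: str, measurement: str = "g-force") -> str:
    """Build the SQL used to pull one recorded session for model training.

    Column and tag names ("gForceX", "session") are illustrative
    assumptions, not necessarily the demo's exact schema.
    """
    return (
        f'SELECT time, "gForceX", "gForceY", "gForceZ" '
        f'FROM "{measurement}" '
        f"WHERE session = '{session_id}' ORDER BY time"
    )


# Actual client usage would look roughly like this (not run here):
#   from influxdb_client_3 import InfluxDBClient3
#   client = InfluxDBClient3(host=..., token=..., database=...)
#   table = client.query(query=build_training_query("my-session-id"))
#   df = table.to_pandas()  # hand off to the training notebook
```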
Tomáš Neubauer: 00:24:53.348 And then here we have two data sets, where we labeled the ones where we were shaking and the ones where we were not shaking. So this is an example where I was walking around the office, going up and down the stairs, but not shaking my phone. So then some analysis was done just to understand what we were looking at. And eventually, we trained that model using the TensorFlow API. And at the end, we export that into a pickle file and then publish that into blob storage. Now here is where we’re going to move back to Quix, because to do this for the first time, to understand what has to be done, it’s very good to do it in a Jupyter Notebook or Google Colab. But then when you finalize that, and you want to do retrainings and maybe run this overnight because you have massive machine learning models to train, this is where it’s useful to create a job that will run in Kubernetes. That’s exactly what I have done here. So here I have model training. And that’s basically exactly the same code as in the Google Colab. But this time, it will run as a container in Kubernetes. So first, let me show you how it works now. So here we have ShakeDetection. And that’s the model deployed in real-time to listen to this data. And then we have here a simple front-end application that’s consuming this data.
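As an illustration of the kind of preprocessing such a notebook typically does before training (this is not the demo’s actual code, and it sidesteps TensorFlow entirely), windowed features like the in-window standard deviation of g-force magnitude separate shaking from walking quite well, since shaking shows up as high variance:

```python
import statistics


def window_features(samples, window=20):
    """Turn a stream of g-force magnitudes into per-window (mean, stdev)
    features -- a typical preprocessing step before training a
    shake/no-shake classifier. Shaking produces a high in-window stdev.

    `window` is an illustrative sample count, not a demo parameter.
    """
    feats = []
    for i in range(0, len(samples) - window + 1, window):
        w = samples[i:i + window]
        feats.append((statistics.fmean(w), statistics.stdev(w)))
    return feats
```

Feature vectors like these, labeled shaking or not-shaking per session, are what a classifier such as the demo’s TensorFlow model would be trained on.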
Tomáš Neubauer: 00:26:52.419 And what I’m going to do, I’m just going to stop this, because I was sending the wrong data. So let me refresh this. Cool. So now we’re getting some telemetry data here. We’re actually getting a lot of shakes. And I think that’s because we have here a version which is old. So let me just deploy a different version. Cool. That should give us a better experience here. There we are. So now we’re getting telemetry data from this phone. We’re getting the battery, where we are located. So if I zoom out here, we’re in San Francisco. And now if I take this phone and shake it, we get shake detection from the model. So now, the model has some degree of precision. And you can maybe do a wrong kind of shaking, and it will not work. Now you see, if I shake a bit differently, now it worked. But it’s not perfect, because you need to improve this model over time as you get more data and retrain it.
Tomáš Neubauer: 00:28:23.453 So let’s do exactly that. So going back, I’m going to stop this session and just record a very short one with a different kind of shaking, so we can use this new data set and add it to the training set. So I’m going to press start, do, again, some shaking, but this time, in a way that we want to test. And I’m going to stop that session. Going back to our platform, here, this is basically a view on top of Influx. And here’s the session that we just recorded. And I can check the data. And it looks nice. We have some shaking there. So all three dimensions recorded some forces applied to them. So I quite like that. So let’s use this session. Here, I will copy the ID. And I will extend the query to Influx so I will have more data in my training set. So here in model training, I would pass this here as a parameter. There we are. And I will call this version 12. Press start. Cool. So now this is being scheduled as a container in Kubernetes. And now it’s running. And we’re getting logs. Amazing. And we have created a model with version 12 and with a precision of 0.981. So that’s an accuracy of about 98%. Obviously, it could be much better. But also, it could be much worse.
Tomáš Neubauer: 00:30:22.071 Now we have this artifact. And you might be thinking, “Well, let’s deploy it.” Well, that could be a bit premature, because it might be that the model slipped. And we really don’t want to do that. So here at the bottom, I have a backtest bench where I’m going to use Influx to replay some of the sessions that happened before, to see how this new model is going to react. So here I have replay. And if I go to this model and I update it to version 12 — this is our newest version. So now we are adding the environment variables in a container. And it’s running again. And I will replay this session. This session has shaking in it. And as you can see, the model was very successful at detecting that. And equally, you can take the sessions that are not shaking and test that it’s not overfitting the data set. And when you get through this, and you like your model, you can literally run it through a month’s worth of data. And basically, this will work equally well. And you can then check the results. I’m just going to go to shake detection and change this model in the real-time pipeline. So this means that now, if I go back to this front end, and if I shake, yeah, it’s working a bit better than before. It’s a bit more sensitive. Cool. So we have retrained the model.
Tomáš Neubauer: 00:32:21.971 And you might notice in my bottom — sorry, in my top part of the screen, that there is something called prod. Well, the way this works is that you have different environments. And each environment is assigned to a branch. So here I have a branch called InfluxDB because I was working on it. And here I have a production version, which basically is almost the same thing, just with the old model. And now, because I have trained a new model, and I’m happy with it, if I want to release it to production, the way I’m going to do this is I will create a pull request from this “influxwebinar” branch to prod. And because this is open source, it will lead us to GitHub. Here, I can say, “Improved model,” create a pull request. And here I have the changes that I have made to the pipeline. So I have a change in the version of the model that I’m using, a new pickle file in the real-time pipeline. So yeah. So that’s it. I hope it made sense.
Tomáš Neubauer: 00:33:50.408 And obviously, I’m here to answer any of your questions. And if you want to give it a try, I have here some QR codes, first of all, for the GitHub, where you can find all of the code I have shown you. You can use this pipeline yourself in Quix. And you can sign up for a trial and use it with a trial of Influx Cloud to basically replicate exactly what I have done today. And thank you for coming here today. And yeah, if you have any questions, I’m going to try to answer them.
Caitlin Croft: 00:34:29.337 Awesome, Tomáš. Thank you so much. So if you have any questions for Tomáš, please use the Q&A at the bottom of your Zoom screen. I know there have been a few questions in the chat that Tun has answered. But I also just want to make sure that it’s on here for those who check out the recording and such. So let’s see.
Tomáš Neubauer: 00:35:00.887 So I see that question. I can go from the top, I guess.
Caitlin Croft: 00:35:02.897 Sure. Go for it.
Tomáš Neubauer: 00:35:04.376 Yeah. So what is the difference between stateful and non-stateful? So a non-stateful operation is when you just get a message, and it’s completely isolated from the previous message. And therefore, you don’t have to care about the state, because when you get restarted, you just continue from the empty state. Nothing is really needed to restore. A stateful operation is, for example, the average speed of the car in the last minute. If you get restarted, you’re going to be missing those 60 seconds before the restart. So you need to persist that state and manage it so you don’t use the state twice, so you don’t corrupt your state by accumulating data twice. There’s a lot of engineering in stateful processing. Yeah. So basically, the difference between a database and state is that a database is for saving your data as it comes, for analysis, for training, for dashboards, for queries. State is just a supporting piece of a data processing pipeline.
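The restart problem described above can be made concrete with a toy stateful operator that checkpoints its window to disk after every event. This is a sketch of the idea only, not of any particular framework’s checkpointing mechanism (real systems batch checkpoints and coordinate them with message offsets to avoid double-counting).

```python
import json
import os


class SpeedAverager:
    """Average speed over the last minute, with state persisted so a
    restart does not lose the 60 seconds before the crash.

    A toy illustration: real frameworks checkpoint state together with
    consumer offsets so no reading is counted twice.
    """

    WINDOW_S = 60

    def __init__(self, checkpoint_path):
        self.path = checkpoint_path
        self.readings = []
        if os.path.exists(self.path):
            with open(self.path) as f:
                self.readings = json.load(f)  # restore state after restart

    def add(self, ts, speed):
        self.readings.append([ts, speed])
        self.readings = [r for r in self.readings if ts - r[0] <= self.WINDOW_S]
        with open(self.path, "w") as f:
            json.dump(self.readings, f)  # naive checkpoint after every event
        return sum(r[1] for r in self.readings) / len(self.readings)
```

Constructing a second instance from the same checkpoint file simulates a restart: the window survives, which is exactly what a purely in-memory operator would lose.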
Tomáš Neubauer: 00:36:27.552 Is there a new consideration to take into account in terms of data structure in InfluxDB when one works with the Quix platform? Well, Influx and Quix are both heavily into a tabular format. So actually, it maps quite one-to-one. We have a timestamp as a first-class citizen, as does Influx. And then we have columns in our protocol as well. So yeah, there’s no need for any hard thinking regarding a different format. Does Quix support on-prem hosting, as InfluxDB has with Clustered and Enterprise? Yes. So we are working on a bring-your-own-cluster offering. And that will basically let you install it wherever you want, for example, on-premises. Yeah, that’s the same question. Somebody’s asking if they can invest in Quix. I’m not sure right now. But I guess it will be possible in the future. Can we get the slides, a link to the recorded video?
Caitlin Croft: 00:37:52.729 Yes.
Tomáš Neubauer: 00:37:53.116 I guess you can. Yeah. Yes. Will there be a Docker version of Quix? That’s a very interesting question. So Quix is quite tied to Kubernetes. So basically, when you’re deploying your data pipelines, they are Docker containers running in Kubernetes. So to make that happen, we would have to rebuild the part where we’re orchestrating our containers to something like a Docker [inaudible]. But the actual platform which orchestrates Kubernetes can potentially run in Docker. But you’re always going to need Kubernetes to run your functions and your microservices.
Caitlin Croft: 00:38:48.635 All right. There’s a question here. I’m interested in the mechanics of reading data from InfluxDB and processing data, maybe using ML and generating forecasting models. Would you see this as a valid use case for Quix? Also, could Quix be developed as a replacement for Kapacitor, which is part of the older iteration of InfluxDB?
Tomáš Neubauer: 00:39:14.654 Yes. So it is a valid use case. So the moment you basically want to serve your model, especially when it’s in an event streaming application or basically just a real-time situation, then you can use Quix to load the models, to train the models, but also to query Influx from the back-end services in Quix. The second question was — can you remind me? Oh, Kapacitor, yeah. Yes. So basically, in Quix, you can do similar functions to those that used to be possible in Kapacitor in the older versions. Some of them are supported by our Quix SDF, so basically, things like downsampling, maybe, or filtering your data. So for that, you can use Quix instead of the Kapacitor that was in place. And that’s why I actually did this image here, where I basically put the Quix tile where the Kapacitor used to be in that older architecture.
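Downsampling of the kind Kapacitor handled in the 1.x stack can be expressed as an ordinary Python transform in a pipeline service, for example (a minimal sketch; bucket size and point format are illustrative):

```python
def downsample(points, bucket_s=10):
    """Group (timestamp, value) points into fixed time buckets and emit
    one (bucket_start, mean) pair per bucket -- the kind of continuous
    downsampling a Kapacitor task used to do, here as a plain function
    that could run inside a pipeline service.
    """
    buckets = {}
    for ts, v in points:
        buckets.setdefault(int(ts // bucket_s), []).append(v)
    return [(b * bucket_s, sum(vs) / len(vs))
            for b, vs in sorted(buckets.items())]
```

In a streaming deployment the same logic would run per window over the live topic before the result is sunk back into InfluxDB.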
Caitlin Croft: 00:40:41.796 Cool. Is there a benchmark in regard to performance to other stream analytics tools, i.e., the load testing metrics, like ingestion rate, using the race car example that you had initially?
Tomáš Neubauer: 00:40:56.496 We have. So if you go to our website, there’s a blog where we compared Quix to Flink, Spark Streaming, and different versions of these frameworks. So you can dig it up and look at the benchmarks. We also have an open-source benchmark suite, which we have built. If you’re interested, you can use it, and if there’s a technology you’re interested in that we haven’t benchmarked, you are free to add it.
Caitlin Croft: 00:41:30.781 I am looking to develop a DTN WebSocket API to stream financial trade time-series data, and then process it in real time, and then put it into InfluxDB and query it. It’s the processing of the streaming data that has been a problem. Is this a Quix application?
Tomáš Neubauer: 00:41:55.744 100%, yeah. I have nothing to add really. It’s exactly the use case. Yeah.
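[Editor’s note: for readers with the same use case, here is a small sketch of the processing and write-formatting steps in such a pipeline: a running VWAP per symbol, and conversion of a trade record into InfluxDB line protocol. The JSON field names are hypothetical, not the DTN feed schema, and this is not the Quix API itself.]

```python
import json

class Vwap:
    """Running volume-weighted average price per symbol: the kind of
    stateful 'processing' step a stream processor like Quix would host."""
    def __init__(self):
        self.notional = {}  # symbol -> sum(price * size)
        self.volume = {}    # symbol -> sum(size)

    def update(self, symbol, price, size):
        self.notional[symbol] = self.notional.get(symbol, 0.0) + price * size
        self.volume[symbol] = self.volume.get(symbol, 0) + size
        return self.notional[symbol] / self.volume[symbol]

def trade_to_line_protocol(msg: str, measurement: str = "trades") -> str:
    """Convert one (hypothetical) JSON trade message into InfluxDB line
    protocol: measurement,tags fields timestamp."""
    t = json.loads(msg)
    return (
        f"{measurement},symbol={t['symbol']} "
        f"price={t['price']},size={t['size']}i {t['ts_ns']}"
    )
```

In practice the WebSocket handler would call `Vwap.update` on each trade and batch the resulting line-protocol strings to the InfluxDB write API.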
Caitlin Croft: 00:42:05.252 Perfect. Awesome. Let’s see if there’s any other — Tomáš, I’m trying to see, did you make it through all the questions that were in the chat?
Tomáš Neubauer: 00:42:14.886 Let me see if there’s something new. Can you please list all the plugin dependencies that you made use of in this demo? Yes. So first of all, to replicate this, let me just go here. This is called the IoT phone demo, in the Quix I/O organization. It is public on GitHub. It’s a monorepository with all the microservices that you have seen here, in folders; that’s how the structure works. So for example, if you want to look at the shake detection, here you have a requirements.txt, and if I switch to the InfluxDB branch, you can see what libraries were used there. If I, for example, do the model training, look at the pip packages that I’ve used; this is all inside of this repository. I can actually just put it in the chat so you guys can use it. And when you create a trial of Influx Cloud and the Quix Platform, what you can do is fork this repository to your account and then use it to build the pipeline that I showed you today with one click of a button. It will all synchronize, all the microservices will be pulled into your workspace, and it’s just going to work out of the box. You don’t have to do it manually, service by service.
Tomáš Neubauer: 00:44:07.334 And then let me see other questions. Interested [inaudible] regarding Kapacitor and Quix for a streaming data processing engine: do any of the Quix samples for Python have a stateful use case that requires windowing and lining? Yes, there are, and you can check the code samples here. It’s also important to say that we are working right now on a new version of this library, and it’s going to be ready in a couple of months. That’s going to have a lot of new stateful processing functions in it. For now, I recommend looking at some of the rolling window examples, etc., to see how they could be used. Thank you for [inaudible], definitely going to start trying. Yeah. So I think I have reached all the questions. And [crosstalk] —
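[Editor’s note: the rolling windows mentioned here are a stateful operator; the state is just the last N values plus a running total. The sketch below is a plain-Python illustration of that state, not the Quix Streams windowing API.]

```python
from collections import deque

class RollingMean:
    """Fixed-size sliding-window mean: the kind of stateful operator a
    streaming dataframe library exposes as a window function."""
    def __init__(self, size: int):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def update(self, value: float) -> float:
        if len(self.window) == self.window.maxlen:
            # the deque will evict the oldest value; remove it from the sum
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)
```

Keeping the running total means each update is O(1) rather than re-summing the window, which matters at streaming rates.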
Caitlin Croft: 00:45:08.467 I think so. I think people were really excited about this talk, so we really appreciate Tomáš and Tun presenting on today’s webinar and answering everyone’s questions. I just want to remind everyone that this webinar has been recorded, and the recording and the slides will be made available probably by tomorrow morning. If you have any more questions for the Quix team, everyone should have my email address; I’m more than happy to put you in contact with them. And you can, of course, go back and rewatch the recording and check out the slides. We really appreciate everyone joining today’s webinar. If you have any last-minute questions, please feel free to post them; if not, I think we’ll wrap things up. We really hope to see everyone on another webinar. It’s always exciting to hear how other community members are integrating InfluxDB into their suite of products. There are just so many applications for time series data, so we’ve got to get that information out there. All right. Well, thank you, everyone, again, for joining today’s webinar. Thank you so much to Tomáš for presenting, and we really appreciate Tun joining and helping answer people’s questions as well.
Tomáš Neubauer: 00:46:38.400 Thank you very much, everyone. And thank you, Influx, for having me here.
Caitlin Croft: 00:46:41.899 Thank you. Bye.
Tomáš Neubauer: 00:46:44.526 Bye-bye.
[/et_pb_toggle]
Tomáš Neubauer
Co-founder and CTO, Quix
Tomáš Neubauer is a co-founder and CTO at Quix, where he works as the technical authority for the engineering team and is responsible for the direction of the company across the full technical stack. He was previously technical lead at McLaren, where he led the architectural uplift of the real-time telemetry acquisition platform for the Formula 1 racing team.