How azeti Monitors PLC and SCADA Systems Using MQTT and InfluxDB
Session date: Aug 24, 2021 08:00am (Pacific Time)
azeti is the creator of an industrial IoT platform that enables customers from the process and manufacturing industries to leverage their unused shop floor data in order to lower process complexity and reduce maintenance and operations costs.
By collecting thousands of data points directly from sensors, machine controls (PLC) or control systems (DCS / SCADA), azeti has been able to save customers hundreds of thousands of dollars annually. Discover how azeti uses InfluxDB to enable IIoT use cases like condition monitoring and predictive maintenance for their clients. In this webinar, Florian Hoenigschmid and Sebastian Koch will dive into:
- azeti's approach to enable IIoT use cases
- Their methodology to improve machine health and utilisation
- Why they use a time series database to store vibration, temperature and other sensor data
Watch the Webinar
Watch the webinar “How azeti Monitors PLC and SCADA Systems Using MQTT and InfluxDB” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
Here is an unedited transcript of the webinar “How azeti Monitors PLC and SCADA Systems Using MQTT and InfluxDB”. This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
Speakers:
- Caitlin Croft: Customer Marketing Manager, InfluxData
- Florian Hoenigschmid: VP Strategy & Sales, azeti
- Sebastian Koch: Managing Director, azeti
Caitlin Croft: 00:00:00.060 All right. I think we will get started here. Hello, everyone, and welcome to today’s webinar. My name is Caitlin Croft. I am very excited to have azeti here talking about how they are using InfluxDB to improve industrial IoT monitoring. And they’re monitoring PLC and SCADA systems, so I think it’ll be a fantastic session. Once again, please post any questions you may have for our speakers in the chat or the Q&A, I will be monitoring both, and we will answer them at the end. And just want to remind everyone, please be respectful of our speakers, as well as all attendees. And without further ado, I’m going to hand things off to Florian and Sebastian.
Florian Hoenigschmid: 00:00:50.755 Well, thank you, Caitlin, and hello from my side. Thanks for tuning in, and thanks for the opportunity to talk today about azeti and the work we’re doing with Influx in the field of IoT. First of all, I’d like to introduce myself. My name is Florian. I’m the vice president of strategy and sales at azeti. I’ve been with the company for over eight years, and my main focus is on helping our customers in the industrial sector to digitalize their production environment. I brought along with me Sebastian, who is our managing director, and he will cover the second part of this webinar, which has a much more technical focus.
Florian Hoenigschmid: 00:01:32.966 Let me start with what we’re going to do today. I prepared a little agenda. I would like to start by talking a bit about who we are and what we’re doing, to give you an overview of azeti, before we touch on the typical customer environment we’re working in - sort of the characteristics of the environments of our customers where we position or use our software platform. Then, this is going to be followed by the concept of the digital shop floor, which we enable with our software platform. And then, I’d like to talk about the process of how we get data from the machine to the dashboard. The second half of the webinar is going to be more technical, as I said, so this is covered by Sebastian. He will talk about operations - basically, how we use InfluxDB as part of our solution and our experiences from the last couple of years. He will also cover what data sources we connect to and how we acquire and annotate data. And we conclude the session with a Q&A.
Florian Hoenigschmid: 00:02:43.421 All right. Let’s kick it off. So who are we? We are azeti. We’re a software company headquartered in Berlin, and we’ve been around for 15 years already. We started as an IT monitoring software company and evolved over the years into an IoT platform focusing on Industry 4.0 use cases. So we help our customers from the process as well as the machine-building industry to digitalize their production environment and implement use cases around condition monitoring and maintenance. So think of machine health cockpits that provide details about what’s going on in a machine: whether it’s going to fail or whether it needs to be serviced. As I said, our focus is purely on industrial clients. So we have customers in the metal and chemical industry, so the process industry, as well as the machine-building industry.
Florian Hoenigschmid: 00:03:39.048 And our experience in the field of IoT already goes back 10 years. So that means we have been carrying out projects in the field of IoT since 2011. And we came across a lot of challenges on our way, from connecting to very old assets, like 50-year-old generators, and retrofitting them, to working over unreliable connectivity and dealing with large amounts of data. But all of that - our experience, our expertise in the field of IoT, our IoT platform, as well as our focus on industry clients - helps us to follow our vision, which is that we want to build the data-driven shop floor that enables smart and sustainable industry. And especially the data-driven piece is very important for us, because we believe in the power of data. We believe that data creates a lot of value for customers, and we see the need to give access to that data in order to unlock its value. And by doing that, we came across certain characteristics in the customer environments we are usually working in, which make them rather complex, and we help to overcome those complexities.
Florian Hoenigschmid: 00:05:02.324 So the first thing is that we usually work on the shop floor where there are a lot of different machines. There are a lot of processes. There’s data from all those kinds of assets: machines, processes, product data. And all that data is really important for certain applications in the field of IoT. If you think, for instance, of a machine health cockpit telling you the state of the machine, you want to know what’s going on right now. So you need the live data. But in order to assess the behavior, you also need data going back maybe five months or even two years. So the historic data is also very important in order to get the full picture about the health of a certain machine. So you need to capture all of that. And you need to capture it not only from one machine, but usually from the entire shop floor, to make sure that all the processes are running smoothly. On top of that, sometimes the resolution is also quite important. So that means you not only need data every second, but maybe every 10th or 100th of a second. So high-resolution granularity is also quite decisive for certain applications. And that just adds up to large amounts of data we’ve got to deal with. So this is quite typical.
Florian Hoenigschmid: 00:06:19.137 Another thing is that the environment, the shop floor, is quite heterogeneous. So that means we have lots of different machines, lots of different control systems, lots of different controllers and databases, lots of different data sources all in all. And they all, of course, are of different make and model. So they are from Schneider Electric, from ABB, and from Siemens. And they all come with different interfaces and different protocols. But in order to carry out certain IoT use cases, to implement them, you need to connect to them. So that means you need to understand the language on the shop floor, the different protocols, and you need to extract data from all of those sources, which can be rather complex because there are many different protocols. Yes, there’s standardization. Yes, there’s OPC UA. But it just happens that a lot of other protocols are also still common. Like 40-, 50-year-old protocols, like Modbus, for instance. Vendor-specific protocols, like Profinet or Profibus, for instance, or EtherNet/IP. And sometimes you even have to get a new data source on board, so retrofitting certain machines because there’s simply no interface to connect to. And that adds even more complexity. So this large variety of data sources is also something we deal with on a daily basis.
Florian Hoenigschmid: 00:07:40.752 Of course, it’s not just that the data arrives on the platform, your IoT platform, but also that it meets certain standards in terms of quality. So you need to make sure that the data is complete, that the data is accurate and precise, that the data which has been specified is actually arriving at the platform. So this is also something we need to deal with on a constant basis. And, of course, since we’re talking about production, about an industrial environment, the shop floor, they just don’t produce nine-to-five, they produce around the clock. So that means they also need the data for certain applications - think again about a machine health cockpit - around the clock. So that means making sure that you can supply data 24 by 7, constantly, so that someone on the night shift, at 2 AM, is able to see on their tablet what’s going on with that specific machine.
Florian Hoenigschmid: 00:08:39.049 So with all that complexity on the shop floor, how does azeti fit in there? We just learned that the digital shop floor is something which is quite important, since access to data is quite important. And we also see that it’s rather complex to get to that data. And this is exactly where azeti comes in. So what we’re basically doing is that we abstract the entire complexity of the shop floor. That means all the different protocols, all of the different interfaces, the different makes and models of machines - we connect to them, abstract their complexity, and help to get the data out of those systems into apps, where the value is actually being created for end users. So, for instance, if you need data from PLC A, B, C, from another control system, and from another database, and need it for a machine health cockpit, you don’t have to worry about how to connect to those specific single systems. You rather connect to the azeti platform and it does all the work for you in abstracting this complexity.
Florian Hoenigschmid: 00:09:50.278 And we start already on the shop floor with that. So our software connects at the edge, right next to the machine where the data is generated, through protocol adapters to all kinds of different assets: OPC UA, PROFIBUS, PROFINET, all different kinds of PLCs and sensors. It extracts the data, pre-processes it already at the edge, and then sends it over MQTT to our azeti IoT platform. That platform usually resides in the customer’s IT infrastructure, or also in AWS or Microsoft Azure, and it serves as a central data hub. So basically, the data from all the site controllers - which are right next to the machine on the shop floor, along production lines, at different locations - all arrives there through MQTT. And there we treat the data, we manage the data, and we prepare it in such a way that it’s consumable by the IoT apps which are basically sitting on top. Those apps can be machine learning models which need to be supplied with data, machine health cockpits, utilization and performance applications. All of that relies on the azeti platform and just uses it as a single source of data from the complex shop floor environment.
Florian Hoenigschmid: 00:11:05.784 And with that, we follow a certain process in order to get the data from the machine all the way up to the application. So first of all, you need to get clear on the scope of your use case. What are you trying to achieve? What are the goals of the use case? And then you need to check what kind of data is required in order to implement that use case. That means you have to check what is available there on the shop floor, what you can get access to through azeti, and maybe what is missing. So once you figure that out - what is there and what is missing - you start to think about your data acquisition strategy. You think about where you need to connect to and maybe whether you even need to retrofit any sensors. It could just happen that for certain applications, like a predictive maintenance application, you need vibration data in order to assess the condition of a machine. Sometimes that data just doesn’t come in the right granularity, so you need to add another sensor to achieve measurements of maybe 10 or 100 hertz, to get the data in the right granularity in order to carry out that implementation.
Florian Hoenigschmid: 00:12:26.904 So those are all things that need to be thought of when you develop your data acquisition strategy. And once that has been done, you start to set up the system, the azeti system, connect it to the data sources, and start to extract. Then you need to, of course, check the quality of that data. So, again, whether it’s arriving complete, whether it’s precise and accurate enough. And then you also start to integrate it with other data sources that are necessary in order to carry out that use case and to achieve your use case goals. Once that has also been done, you’re at the starting point to actually start building your application, with either the azeti toolkit or using the azeti APIs to build other applications on top of our platform, and you run it by stakeholders constantly to validate whether you’re actually on the right track. Once that is done, you test, of course, whether it’s stable enough, and then you provision it. And this is the way - at least, the way we are following - to get the data from the machine or the process all the way up to the application. And Sebastian, right now, is going to talk a little bit more about the technical details behind that and how we facilitate it with our platform.
Sebastian Koch: 00:13:46.200 Thanks, Florian. Let me share my screen. So I think I need permission to share my screen, Caitlin, or you have to unshare [crosstalk] -?
Caitlin Croft: 00:13:55.700 I believe if Florian stops sharing, you should be able to share.
Sebastian Koch: 00:14:00.992 Okay. Let’s see if this works out. Right. So you can all see my screen.
Caitlin Croft: 00:14:07.184 Perfect.
Sebastian Koch: 00:14:10.747 Maybe a couple of words of introduction about myself. My background is still very technical even though, nowadays, I’m more into my managerial duties as the managing director of azeti. I’ve been with the company for almost 11 years, so I’ve been present during our experiences in the IoT field. And with the next couple of slides, I want to give a real-world intro, not about all the things that run smooth and proper and great where the technology works out fine, but rather what didn’t work out, what the challenges are, and what we had to learn maybe the hard way. So let’s dive right into the first topic. Operations. A big part of our value proposition is that we are running the entire IoT platform for our customers, primarily on-premises, nowadays within their own cloud subscription. So large corporations typically have their own Azure environment or their own AWS environment, which is coupled with their local networks and even their shop floor, and we are running these things.
Sebastian Koch: 00:15:22.610 All IT departments, all engineering departments, are overloaded. The market is very scarce in regard to good recruits, so most customers have a hard time starting their projects. So we come in, and we do the IT operations of the IoT platform, because this is sometimes even overwhelming for classical IT departments that just run some VMs. Why? Because we stretch from the edge - meaning from the local machine at the shop floor, close to a sensor, close to a [inaudible] - up into the cloud. We have an ops team. We provide 24 by 7 operations. And there we had to learn some very hard facts with and about Influx, amongst other software, in the last seven years. We started really early with Influx, during the research project our platform originated from. And the first thing we learned - which was actually not that hard, but this was a learning along the way with our customers - is that the chance to choose between the open-source version, a SaaS offering (particularly, we’re using Influx Cloud version one), and the Enterprise cluster helps us.
Sebastian Koch: 00:16:35.488 We’re using Influx, the open-source version, in our monitoring. Partly, we’re using it in our staging, development, and testing environments, as a standalone service or even in Kubernetes. I will talk about Kubernetes in a second. And with another customer, we’re even operating our platform in China as well as in Europe. And the customer chose that in China, for this certain point in time, the SLA is totally fine to go with the open-source version, unclustered. Whereas in Europe, where the core market was, they wanted to have a higher SLA. And so we chose to use the SaaS offering from Influx - one system less that we have to operate. And the customer was happy to use Influx Cloud on AWS.
Sebastian Koch: 00:17:22.030 With our latest and largest deployment, the story was different. SaaS wasn’t an option, and the amount of traffic we were expecting and the scale required something apart from OSS, because OSS, the open-source version of Influx 1.x, doesn’t support any load balancing whatsoever. Yes, you can relay, but this is not close to being production-level. So we chose an Enterprise cluster. And our customers like the choice, and we like it as well. It’s something that we have really enjoyed over the last years. We grew for probably five years on OSS, and now, in the last three years, we have mostly been on paid subscriptions of Influx.
Sebastian Koch: 00:18:06.078 A nice catch on the Enterprise license that I was surprised by is that you get a very good deal on a development license, if you ask nicely. So if you have staging environments, it’s very nice to run Enterprise clusters in staging environments. A second hard learning - Flux, which is the more sophisticated query language of InfluxDB, solves a lot of issues InfluxQL has, specifically when, yeah, doing aggregations, particularly if it’s across the [inaudible], but it brings a lot of complexity. We see that Influx wasn’t really built from scratch for non-technical users, but more for experienced technical users - for developers - which makes perfect sense; this is how the solution was tailored for its core customers. Our non-IT-skilled engineers, maybe electrical engineers, feel very familiar and very comfortable with typical SQL, but they have a hard time writing a Flux query, which is very close to JavaScript. Even our software engineers took some time to get into it. It solves a lot of issues for experts, but Flux is not an option for our less technical users. Therefore, we have to abstract. We have to hide the complexity. So on our dashboards, we have three different options for how users can query: they can use our query builder - drag and drop, no SQL in there; they can use InfluxQL; or they can use Flux if they really want to go very deep into the technology.
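To make the difference concrete, here is a hedged sketch of the same hourly aggregation written both ways; the measurement, field, and tag names are illustrative, not taken from azeti’s schema:

```
-- InfluxQL: hourly mean of a hypothetical temperature field
SELECT MEAN("value")
FROM "machine_temperature"
WHERE "machine_id" = 'press_01' AND time > now() - 24h
GROUP BY time(1h)

// Flux: the same aggregation, closer to a scripting language
from(bucket: "telemetry/autogen")
  |> range(start: -24h)
  |> filter(fn: (r) => r._measurement == "machine_temperature" and r.machine_id == "press_01")
  |> aggregateWindow(every: 1h, fn: mean)
```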
Sebastian Koch: 00:19:39.016 Another hard learning is that Influx, by design - maybe we didn’t do our research properly - doesn’t like high cardinality tags. And these tags are very important, specifically in industrial IoT applications, because most of the time the time series itself is not sufficient to create a use case. They always need metadata. For example, a batch number. You have a machine. It’s producing a part, or it’s producing a liquid, and this has a batch number. Quite often, it’s called a cycle or a cycle ID. And this increases steadily into the tens of thousands per day, depending on the velocity of the machine. So this is a high cardinality field. But IoT customers, our customers, what they want to do is they want to go back in time and do a query that says, “Give me all our measurements, all metrics, all monitoring data of batch number 15,300 something, something, something.” And we need to get the data fairly quickly. Influx doesn’t like that because it’s not built for high cardinality tags. So we had to [crosstalk]. We use tagging for other low cardinality fields. That works brilliantly and is outperforming anything we built by ourselves. But for high cardinality, we needed to do some workarounds. And I will talk about that later. It might get fairly technical. Also, if you’ve got questions, throw them in the chat. I’m really looking forward to answering some technical questions.
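For readers less familiar with InfluxDB 1.x, a hedged illustration of the problem in line protocol; the measurement and tag names are made up. Every distinct tag value creates a new indexed series, so an ever-growing batch number used as a tag explodes cardinality, while the same number stored as a field does not:

```
# Batch number as a tag: every new batch becomes a new indexed series
melt_temperature,machine_id=press_01,batch_id=15300 value=612.4

# Batch number as a field (or kept in a relational store): series count stays flat
melt_temperature,machine_id=press_01 value=612.4,batch_id=15300i
```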
Sebastian Koch: 00:21:08.710 Point number four goes back to the first one. In IoT use cases - and I’m really just talking about IoT, because this is probably different from IT monitoring and metrics in general - our backend had to stitch in the metadata. Stitching means you have a set of time series - the time, the value, and maybe a tag, something like a machine ID - but also the metadata, which is the batch ID, the material used, the ingredients, the serial of a certain process, the process ID, the recipe ID. We have to stitch them together because this is the metadata customers search for. That they use in their reports. That they use to alert, to alarm. In the end, dashboards - what we’ve seen, and what we had to learn the hard way - are primarily not about the value itself, but about the metadata. “Show me everything from the red door that had a dent in our production last Wednesday.” It’s not, “Show me the thickness of the paint coating of all red doors.” They know that it’s fairly good because of QA, but they want to see the outliers, and they want to see where metadata has a certain relation, specifically to quality assurance, to tracking, to process control. And we are aiming at the manufacturing and process industry, where this is very important.
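A minimal sketch of what such stitching can look like, assuming a hypothetical Postgres schema and InfluxDB 1.x measurement names; azeti’s actual backend will differ. The batch is resolved to a time window in the relational store, and only time plus a low-cardinality tag is used on the Influx side:

```python
# Hypothetical stitching sketch: relational metadata + InfluxDB time series.
import psycopg2
from influxdb import InfluxDBClient

pg = psycopg2.connect("dbname=mes user=azeti")                  # metadata store
influx = InfluxDBClient(host="influxdb", database="telemetry")  # time series store

def series_for_batch(batch_id):
    # 1) Look up when the batch ran (metadata lives in Postgres, not in Influx tags).
    with pg.cursor() as cur:
        cur.execute(
            "SELECT started_at, finished_at FROM batches WHERE batch_id = %s",
            (batch_id,),
        )
        started_at, finished_at = cur.fetchone()

    # 2) Pull the matching window from Influx, filtering only on time and a
    #    low-cardinality tag. Timestamps are assumed to be naive UTC here.
    query = (
        "SELECT time, value FROM melt_temperature "
        "WHERE machine_id = 'press_01' "
        f"AND time >= '{started_at.isoformat()}Z' AND time <= '{finished_at.isoformat()}Z'"
    )
    return influx.query(query)
```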
Sebastian Koch: 00:22:27.085 How did we do that? Also, we were always running Influx on, yeah, bare metal, VMs, EC2 machines. And there is no proper Kubernetes support for Influx version 1. We are still on version 1. In version 2, everything is resolved; they did a good job there. But we have existing deployments and we can’t migrate them easily. And nowadays we are running a fairly big amount of workloads on Kubernetes. With software of our own, it’s not that big of a deal - we can create a Helm chart. It’s not officially supported; you can run it on Kubernetes, but it’s hands-on work. And if you look at other software, it’s nowadays state-of-the-art that this comes out of the factory, that this is part of the solution. Here it’s some hands-on work. So if you guys are going towards Kubernetes with Influx 1 and you need a Helm chart, contact me. We’re happy to share that. We are releasing our Helm charts in due course. And Prometheus support. We are very much excited about it, and we follow a large amount of DevOps best practices. We are migrating our own processes to become more DevOps aware, specifically in regard to observability, and Prometheus is a big part of that. For Prometheus, therefore, we had to - not hack, but do a workaround where others are better. In the end, these are just learnings, stuff that wasn’t straightforward, that we had to do something about. Everything is solvable, but one has to be aware of that.
Sebastian Koch: 00:24:12.979 Observability. Yes, the TICK Stack is great if you live in the TICK Stack world with Telegraf, Kapacitor, and Chronograf. But most recently, due to the fact that we’re using more and more Kubernetes and other things, we migrated. For a couple of years now, we’ve been using Grafana as our visualization backend, and we moved everything to Prometheus from a big mixture of different software. So for Influx, first of all, we had to somehow get the metrics. This works perfectly fine because Telegraf brings an input plugin out of the box. Super nice. Easy to enable. But how do we get these metrics into Grafana if we want to follow our own standards? And our standard is PromQL, because all alerting is done in PromQL. We use Telegraf to retrieve the metrics, and we are basically using the Prometheus client output, which opens and exposes a metrics endpoint. And with this endpoint, our Prometheus instance is able to scrape the metrics. It’s actually a workaround, right? We’re kind of masking Telegraf to behave like a Prometheus endpoint, and Prometheus is happy with that. It works, it’s stable, it’s super easy to configure. It’s a bit bulky, but it does the job. And that’s one thing that I really like about - we really like about - Telegraf: with different combinations of input and output plugins, you can basically build things together so that they do the job as you like.
Sebastian Koch: 00:25:59.409 A learning here: filter the metrics. If you expose the metrics endpoint, it will expose everything - specifically, operating system metrics, JVMs, proxies. So with a little bit of research, we added that namepass. If you want to try this with your own Influx and want to use Prometheus, just add that namepass so that you only get the metrics that are really for Influx. Otherwise, you’re potentially spoiling or overloading your Prometheus. How we’re running it at the moment: we monitor all our Influx servers in the same fashion. We have a standard set of Prometheus alerts that we will, yeah, open up fairly soon. Alerts that we came up with, and that the Influx support also helped us with. We reached out to Influx and said, “Hey, what kind of alerts are you using for your cloud customers?” And they gave us some of their Kapacitor rules. We migrated these to Prometheus and are using them as a standard set of alerts for our Influx environments, no matter if it’s OSS or if it’s an Enterprise cluster.
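As a rough illustration of the Telegraf bridge described here, a hedged configuration sketch (the URL, port, and filter pattern are assumptions, not azeti’s actual setup): the InfluxDB input plugin scrapes the database’s /debug/vars endpoint, and the Prometheus client output re-exposes only the InfluxDB measurements for Prometheus to scrape.

```toml
# Sketch: expose InfluxDB 1.x internal metrics to Prometheus via Telegraf.
[[inputs.influxdb]]
  # InfluxDB 1.x publishes its internal stats on /debug/vars.
  urls = ["http://localhost:8086/debug/vars"]

[[outputs.prometheus_client]]
  # Prometheus scrapes Telegraf on this listener.
  listen = ":9273"
  # The "namepass" mentioned above: only pass InfluxDB's own measurements,
  # not OS, JVM, proxy, or other metrics Telegraf may be collecting.
  namepass = ["influxdb*"]
```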
Sebastian Koch: 00:27:03.983 Data sources. We’re getting more and more technical here. How do we acquire data? How is the data actually coming into our data stores? Influx is just one of them. This is an example setup we see on the shop floor. There’s no one-size-fits-all solution, unfortunately. We started a couple of years ago with an IoT gateway, where we thought, “Okay. Let’s put this into the IoT gateway and it will solve most requirements for customers.” But that doesn’t work. In the end, in some shop floors, the customer wants to use VMs all the way around. They migrate away from [DIN Rail?] IoT gateways. Some customers need that IoT gateway with separated Ethernet interfaces because they need to have that, yeah, physical separation. Specifically, network separation. And specifically on the shop floor, and specifically in the process industry, because they have different zones depending on explosives and gases - yeah, different explosive zones.
Sebastian Koch: 00:28:03.301 So we had to accommodate different deployment scenarios. We are using our own azeti site controller. It’s a small piece of software written in Python that runs on almost any platform. And we’re using different deployment methods. Hardware: ARM and Intel. Mostly ARM nowadays, because this works very well [and it’s a small form factor for a larger service?]. We’re containerizing it, of course. And there, specifically, if you look at the left-hand side, you can see the different protocols. So you have a PLC that speaks a Siemens protocol, you have a PLC that speaks OPC UA or a DCS control system, and you also have these APIs, like a RESTful API, a SQL server, or maybe even CSV files that we’re grabbing. So our site controller is grabbing these. And this could be any IoT gateway, right? This could be any other IoT server, or it could even be a third-party system. We are retrieving data from databases as well.
Sebastian Koch: 00:29:02.872 And here comes something that we actually added to our software a couple of years ago out of necessity, because back in the day we had to move a lot of computing to the edge because we had unreliable network connections. We can also run our custom scripts. Small apps, you could call them. In the end, it’s just small scripts in Python, containers that you can run that, for example, aggregate data, condense data, enrich data. That, for example, collect images from a webcam, put them together, store them as a ZIP, or merge together 1,000 CSV files, put them into one large time series bucket, and publish them via MQTT. Ultimately, all of these edge layers use MQTT as our designated protocol to collect things from the shop floor.
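A minimal sketch of such an edge script, with illustrative file, topic, and broker names (not azeti’s actual code): condense high-frequency CSV samples into per-minute aggregates and publish them as a single bulk MQTT message.

```python
# Hypothetical edge "app": aggregate a CSV of raw samples and publish one bulk message.
import csv
import json
import statistics
from collections import defaultdict

import paho.mqtt.publish as mqtt_publish

def publish_aggregates(csv_path, broker="site-controller.local"):
    per_minute = defaultdict(list)
    with open(csv_path) as f:
        for row in csv.DictReader(f):          # assumed columns: timestamp, value
            minute = row["timestamp"][:16]     # e.g. "2021-08-24T08:15"
            per_minute[minute].append(float(row["value"]))

    payload = [
        {"ts": minute, "mean": statistics.mean(vals), "max": max(vals), "n": len(vals)}
        for minute, vals in sorted(per_minute.items())
    ]

    # One bulk upload instead of thousands of individual messages.
    mqtt_publish.single("site/press_01/aggregates", json.dumps(payload),
                        qos=1, hostname=broker)
```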
Sebastian Koch: 00:29:54.026 Still, even in Germany, quite often the shop floor has a limited amount of bandwidth. In rural areas in Germany, some factories are just connected with 5 megabits or 10 megabits for the entire factory, so we can’t just eat up those 5 megabits. MQTT works brilliantly if you want to do bulk uploads, if you need quality of service, if you need the Last Will and Testament. So if the connection goes down, the edge gateway can tell the server how to deal with a connection that broke. And this is why we’re not using HTTP or WebSockets - because, again, connections are flaky. And our IoT platform - specifically, the MQTT broker in our backend - is receiving all of this data on different MQTT topics and in different formats. One value at a time, which is very important. One event at a time, if something has changed, for example a warning. Or bulk uploads: every 5 minutes, 50,000 endpoints or 50,000 measurements.
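A small hedged sketch of the two MQTT features mentioned here - Last Will and Testament and QoS - using illustrative broker, client, and topic names:

```python
# Sketch: an edge client with a Last Will message and QoS 1 publishing.
import json
import paho.mqtt.client as mqtt

client = mqtt.Client(client_id="site-controller-berlin-01")
# Last Will and Testament: the broker publishes this if the client drops off
# without a clean disconnect, so the platform learns the site went offline.
client.will_set("site/berlin-01/status",
                payload=json.dumps({"online": False}), qos=1, retain=True)
client.connect("iot-platform.example.com", 1883, keepalive=30)
client.loop_start()

# QoS 1: at-least-once delivery, which matters on flaky shop floor links.
client.publish("site/berlin-01/status",
               json.dumps({"online": True}), qos=1, retain=True)
```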
Sebastian Koch: 00:31:00.864 Last slide, and this is as technical as it gets today. Wrong slide. How do we deal with the data? What does the actual data look like? We go from the left-hand side to the right-hand side. On the left-hand side, we have our sources. Sensors. A sensor could be as simple as 0 to 10 volts, 4 to 20 milliamps. Frequencies, curves, currents, voltages. They’re typically translated by an IO module, or by an IoT gateway, or by some sort of protocol converter into a digital signal. Zeros and ones. Could be hex, could be something else. We are receiving this. We are translating these protocols. Florian already mentioned plenty of them, from Modbus to OPC UA, different PLCs. A lot of custom-built stuff. We’ve experienced quite a few customers that built a PLC together with some software company 15 years ago. The software company went bankrupt, and now they have the PLC, but the documentation is gone. So we also had to reverse engineer APIs to get the data out of these PLCs.
Sebastian Koch: 00:32:13.163 So we use the data. And at the bottom there - let me mark it for you guys here. Highlighter. To convert from the zeros and ones and hex into a format, we use XML and JSON, depending on the source. We actually enrich it: we take every single measurement and add the metadata we already know. So this is the acquisition phase. And here, we already streamline the data. So in the end, for the customer it doesn’t matter where it comes from - whether it’s Modbus or Siemens - data is data. It always has, at least, a value, a timestamp, and some other metadata. We acquire it and we calibrate it. Calibration: quite often, there’s an offset. A temperature sensor wasn’t calibrated, or it was already installed and you can’t go there and physically calibrate it anymore, so you have to do it in software. Or you know that there are certain false positives there, or you want to add some custom code. For example, converting Fahrenheit to Celsius. Super easy example.
Sebastian Koch: 00:33:23.989 But what about if you want to have a three-sigma analysis, or if you want to do the first aggregations there? If you want to surface some outliers, create a baseline, or have, let’s say, the [fifth most common person types?]? We are doing this at the edge already because sometimes the computing power is already there. And after calibration - so, let’s say, from Fahrenheit to Celsius; I can see that I put the unit wrong there. At the same time, we are using an MQTT backplane for all of these different modules. So acquisition, calibration, and annotation are different modules, and they use MQTT to communicate in between. So you have different stages. Like a pipeline. Like a modern software pipeline. And we can also add custom code there. Could be a simple mathematical aggregation or an AI model. And we publish and subscribe to these MQTT topics already at the edge. This is where we already have everything about data and MQTT.
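To make the pipeline idea concrete, here is a hedged sketch of one such stage on the MQTT backplane; the topic names and payload fields are illustrative. It subscribes to raw acquisition messages, applies a calibration (the Fahrenheit-to-Celsius example from above), and republishes for the next stage:

```python
# Sketch of a calibration stage: subscribe, convert, republish over MQTT.
import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    point = json.loads(msg.payload)   # assumed shape: {"ts": ..., "value": ..., "unit": "F"}
    if point.get("unit") == "F":
        point["value"] = (point["value"] - 32) * 5.0 / 9.0
        point["unit"] = "C"
    # Hand the calibrated point to the next stage (annotation).
    client.publish(msg.topic.replace("acquisition", "calibrated"),
                   json.dumps(point), qos=1)

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("edge/acquisition/#", qos=1)
client.loop_forever()
```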
Sebastian Koch: 00:34:29.312 And after calibration, again, the next step here: we annotate. And this is where we add the kind of metadata I was talking about before. Imagine a use case. We have five sensors from Modbus, five sensors from the Siemens PLC. They come together. We have all of these values, these timestamps. But we also have the local CSV file that’s generated every hour from an industrial PC, where it tells us the cycle time, the product batch number, the name of the part, and, let’s say, the supplier ID. So in this stage, the annotation stage - oops - we can add the batch, for example, we can add the material ID, and we could even change the timestamp. This all happens at the edge. We upload it to our backend. This is on our server - so in the cloud or on-premises, depending on the deployment. And here, we dissect the data, and we put all time series and everything that is low cardinality into the Influx database.
Sebastian Koch: 00:35:34.312 We don’t do any downsampling. Most of our customers don’t like downsampling. They’d rather pay for more storage than have downsampled data. And we put the metadata into a relational database - most commonly Postgres, and also Elasticsearch clusters, depending on the data. And there we do some stitching in the end for our end users, which are at the top. I left out that part. But this is how we go from zeros and ones ultimately to a query. An InfluxQL query: select the percentile from the melt temperature in our process measurement, where we have the material ID MG - magnesium - 15. Of course, some further stuff is around. And this is our data pipeline from the edge into our cloud platform. So I hope that I didn’t lose too many listeners, or that I wasn’t too narrow or too high-level for some of the folks here, depending on your profile. I hope you have some questions. Happy to answer them already. And I hope that this brought some value to you and was interesting. Thanks.
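A hedged reconstruction of the kind of query described here, with illustrative measurement, field, and tag names rather than azeti’s actual schema:

```
-- InfluxQL: a percentile of melt temperature for one material, over a recent window
SELECT PERCENTILE("melt_temperature", 95)
FROM "process_measurement"
WHERE "material_id" = 'MG15'
  AND time > now() - 30d
```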
Caitlin Croft: 00:36:44.038 Thank you, Florian and Sebastian. That was great. I don’t think you lost anyone. It doesn’t seem like anyone dropped off. And there already are a few questions. So great job. So we’ll dive into the questions. Before we do that, I want to remind everyone once again that InfluxDays is coming up, so please be sure to check it out. I know people are always interested in IoT use cases and getting more information. So I’m sure there will be some presentations on that. All right. Is that a custom XML format or is it based on a standard format?
Sebastian Koch: 00:37:22.290 That’s a really good question. We’re using custom XSDs to validate XML. This is not the most technically elegant choice; it’s from our past. When we started a couple of years ago, JSON was very early and it was very hard to validate. And for the configuration done at the edge - for example, which Modbus register goes with which value - you want to have some validation so our customers don’t have to think about whether they’re doing things right. That’s why we’re using XML and custom XSDs. And we even have XSDs depending on the sensors. So yes and no. It’s not a full standard, but we’re mostly following the XML standard. If we could change it nowadays, we’d probably use some YAML. But yeah, you’re always smarter afterwards. Hope that answered your question.
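For illustration, a minimal sketch of XSD-based validation in Python; the file names are hypothetical and this is not azeti’s actual tooling:

```python
# Validate an edge configuration file against a custom XSD before deploying it.
from lxml import etree

schema = etree.XMLSchema(etree.parse("sensor_config.xsd"))
config = etree.parse("press_01_config.xml")

if not schema.validate(config):
    # Surface errors so a broken mapping (e.g. a wrong Modbus register type)
    # never reaches the site controller.
    for error in schema.error_log:
        print(error.message, "at line", error.line)
```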
Caitlin Croft: 00:38:07.600 I got a thumbs up from the [crosstalk], so good job. And if you guys ever have any follow-up questions, I’m always happy to connect you with the speakers if you guys want to take this offline and get more detail. Is the enriched data shipped back to MQTT before it reaches InfluxDB, eventually?
Sebastian Koch: 00:38:32.547 Can we repeat the question?
Caitlin Croft: 00:38:34.719 So I think what they’re asking is, does the enriched data get sent back to the MQTT broker before it reaches InfluxDB?
Sebastian Koch: 00:38:45.099 Yes.
Caitlin Croft: 00:38:45.375 So once the data is enriched, where does it go before InfluxDB?
Sebastian Koch: 00:38:49.461 Yeah. It goes into MQTT. And there, our edge software has a store-and-forward mechanism. Once there’s a connection, it takes the bulk, uploads it. The broker takes it. Dumps it into Influx.
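A hedged sketch of that store-and-forward idea, with an assumed SQLite buffer and illustrative topic and broker names:

```python
# Sketch: buffer points locally and flush them as one bulk MQTT upload when online.
import json
import sqlite3
import paho.mqtt.publish as mqtt_publish

db = sqlite3.connect("buffer.db")
db.execute("CREATE TABLE IF NOT EXISTS buffer (payload TEXT)")

def store(point):
    db.execute("INSERT INTO buffer VALUES (?)", (json.dumps(point),))
    db.commit()

def flush(broker="iot-platform.example.com"):
    rows = db.execute("SELECT rowid, payload FROM buffer").fetchall()
    if not rows:
        return
    bulk = [json.loads(payload) for _, payload in rows]
    try:
        mqtt_publish.single("site/berlin-01/bulk", json.dumps(bulk),
                            qos=1, hostname=broker)
    except OSError:
        return  # still offline; keep the buffer and try again later
    db.execute("DELETE FROM buffer WHERE rowid <= ?", (rows[-1][0],))
    db.commit()
```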
Caitlin Croft: 00:39:02.986 So you mentioned using Flux and how much better it is than InfluxQL. Can you share any tips that you learned along the way? It definitely takes some time to get used to Flux, so just curious if you can share any tips that you learned along the way figuring it out? [laughter]
Sebastian Koch: 00:39:21.722 We were using the Influx support, actually, because we didn’t really bother to try that much. So we reached out to Influx support and asked, “Hey, can you give us an hour or 90 minutes intro?” And then it was just hard learning. What turned out to be very good is that the latest Chronograf has this Flux query builder. I used it myself because I didn’t want to bother going into all the bits and pieces, and it’s easy to try out. The API is way more complicated. So if you want to start with Flux, the Chronograf Flux builder gives a good idea of how this works.
Caitlin Croft: 00:39:55.950 Perfect. I know you touched on this a little bit but just curious. How often are you collecting metrics from all these sensors? And in total, how many metrics do you think you’ve collected from all the sensors?
Sebastian Koch: 00:40:10.155 That’s a good question. Yeah. As every good IT guy, I answer with: it depends. Some sensors just expose a few metrics. Just as simple as that. A simple temperature sensor, for example. But PLCs can expose up to three, four, five thousand symbols, and they all change - they can change at 10 or 100 hertz. So overall, per deployment, we’re speaking about at least tens to hundreds of thousands of individual standalone metrics. So a physical temperature sensor would have, let’s say, five metrics. Temperature in Celsius and Fahrenheit. The last one, I don’t know, elevation and something, something. So a physical sensor has multiple metrics, and we count these as metrics. I think Influx calls these data points. So to answer the question, we are going towards the billions if not trillions of data points per cluster. This is how we scale the clusters at the moment, and this is how they are growing.
Caitlin Croft: 00:41:14.918 It’s kind of amazing to me if you’re dealing with billions or even trillions of metrics that your customers don’t want it downsampled at all.
Sebastian Koch: 00:41:24.115 Yes. It can be painful, but it’s also a nice challenge. We were advising on downsampling, but we will only start downsampling once the clusters fill up.
Caitlin Croft: 00:41:32.861 So do you think that they will eventually want to downsample? I know that’s sort of a characteristic that’s common with timestamped data, where at the beginning, when you start collecting it, you want all of it, but then after a while you don’t need quite the same granularity. Do you think they’ll want to downsample in the future?
Sebastian Koch: 00:41:51.486 Yes. For most customers - we are targeting enterprise customers, so we consult them as well - these projects take years. And most commonly, after maybe a year or one and a half years, the data acquisition phase is done, right? We have all machines collected. The customer has insight into the shop floor. And then they have already started creating their first apps. And then after a year, they realize, “Oh. Now my data science team or an external data science company could do a certain analysis.” And then they want to have all the raw data. If we had downsampled beforehand, this would be bad. So after this step, once they have the applications and use cases production-grade, they know this is the stuff we can downsample and we can throw away.
Sebastian Koch: 00:42:32.772 Most of our customers aren’t there yet because this is brand new. In industrial IoT, most customers have been in the PoC phase in the last years. We only have a couple of customers that are really, really big scale nowadays, where they have dozens of people working on that across all factories. So to answer the question again after some [inaudible], yeah. I believe that down the road we will dump data on cold storage and not downsample, because customers will know that the value lies in that data. If you throw away data - this is just my opinion - this is burnt money. I would advise our customers to just put it on Amazon Glacier or other stuff and leave it there.
Caitlin Croft: 00:43:11.181 Perfect. How do you integrate all types of data into a unified source? Are you using OPCUA, OPCDA, IoT sensor data?
Sebastian Koch: 00:43:22.958 Yeah. OPC UA is actually not our best friend, because most of the companies that implement OPC UA don’t follow the standard. And quite often, these things break. We just had an occasion where a customer bought a device, OPC UA certified, but it turned out that it had cryptic data in there. Stuff broke. We had to fix things. So OPC UA was built to be that protocol, the one to rule them all, but it doesn’t work out. What we do is our edge software translates it all into a common format in MQTT using JSON, and we are doing the translation. There is no one thing that can translate everything. So we have different sources. It’s a big chunk of OPC UA, yes. A very small chunk of OPC DA has to be translated to OPC UA, because OPC DA drivers are very rare nowadays. And some IoT sensors, the good ones, have native MQTT support. So they just publish right into our topics. No translation there. We have databases. Old Oracle ones. Custom stuff. So it’s a blend. We do the translation.
Caitlin Croft: 00:44:28.947 Perfect. And just to add on to that, Christopher added on, whoever asked about OPCUA, you should look into Eclipse Sparkplug. And Sebastian agrees. Are you using Kafka in your broker layer? If so, what type of data volume peak TPS do you see?
Sebastian Koch: 00:44:54.418 A very good question. Yeah. Some of our biggest engineering concerns in the last year. We started very early with our platform. Back then, Kafka wasn’t even that production-grade. So eight years ago, as we started, MQTT was brand new. Influx was brand new. And we started with ActiveMQ and JMS. So our backend is mostly using ActiveMQ as a broker and a publish-subscribe system, with Apache Camel on the backend and JMS queues. Nowadays, if we could rearchitect after seven years, yes, we would do Kafka. So we are migrating at the moment. We’re replacing everything with native MQTT, and we’re getting rid of JMS and getting rid of ActiveMQ and Apache Camel because they have too many breaking changes, and we are getting there. I can’t answer about the amount of data. I know for our typical broker we have hundreds of messages received per second. One message can contain 1 metric or 5,000 data points. So this is the type of scale we’re talking about. But probably a blink of an eye for a good Kafka cluster, because they are ridiculously scalable, I think. So yeah, Confluent will probably not even bother giving us a [inaudible] because our data’s too low for Kafka. [laughter]
Caitlin Croft: 00:46:09.175 Let’s see. The next question is, what is the relationship between the relational database and InfluxDB? So I can add onto this first. So time series databases are optimized for collecting, storing, retrieving, and processing of specifically time series data, which is a little bit different than relational databases. Relational databases, these databases are optimized for the tabular storage of related data in rows and columns. So it really just depends on what your data is and what you’re trying to accomplish with it. Sebastian, do you have anything that you wanted to add about the difference between the two?
Sebastian Koch: 00:46:52.319 You summarized it very well. My recommendation is leave it as it is. There is no one database that can do both. There are a lot of suppliers nowadays that proclaim that they can do both relational data and time series data in the same system. That can’t work out. You have to compromise. We’ve done really well over seven years now, having specialized databases for the job. We have graph databases, document stores, RDBMS, and Influx, and that works out fairly well for us.
Caitlin Croft: 00:47:28.250 Yeah. It all depends on what your data is and pretty much all of our customers are using multiple types of databases, so. All right. Moving along. There’s lots of questions here. Any use cases in thermal powerplants that you have implemented?
Sebastian Koch: 00:47:45.494 No. Not yet. Not in a thermal power plant per se. But our mother company, Aurubis AG, is probably one of the largest copper recycling factories in Germany. They have - I’m not an energy expert here, but - kilowatts if not megawatts of energy they consume. A lot of high temperatures there. So we are dealing with quite a lot of process and a lot of energy use cases there. Specifically, in saving energy. We just implemented a very nice use case where we are saving a couple of thousand euros in energy consumption per day for them, just by using data. And answering the question: no, not thermal power plants per se. But yes, we are very experienced when it comes to energy and thermal power use cases.
Caitlin Croft: 00:48:38.270 MQTT state is hard to back up. What happens when the MQTT process is stopped or crashes?
Sebastian Koch: 00:48:46.517 Another good question. It seems like it’s just software engineers here who are racking their brains over implementing this for their employers, I assume. [laughter] Yes. We’ve been there. We suffered from that. We have a local store that contains all the data. So even if MQTT is losing data - and this happens even with clustered brokers and all that stuff; we had this with ActiveMQ - we have it locally and then we can just [inaudible]. There is no easy way around that, because quality of service doesn’t work all the time. Yeah. [inaudible] is not the solution either. If you have that issue, I can only recommend either using a local store or one of the mature IoT edge gateways. Azure has a good one. AWS has a good one. azeti has a good one. Plenty of others. Apache projects there. Or try out MQTT version 5.0, because they are solving some of the issues there.
Caitlin Croft: 00:49:44.996 Do you have any hands-on experience regarding the query performance between Flux and InfluxQL?
Sebastian Koch: 00:49:54.012 No. We tried to see some performance differences, but for the queries we tried for comparison, we were comparing them in regard to the user experience. And we’re running on a fairly large Enterprise cluster, which is currently at 10% of its capacity. So it’s so beefy that we wouldn’t notice it.
Caitlin Croft: 00:50:17.930 For those who have not yet digitized their environment, how long does it take to onboard a new company that is still using software from 10 to 15 years ago? Can you give us some examples?
Sebastian Koch: 00:50:31.619 Okay. Assume we have a company, a large factory, and you are melting steel. Everything is super old. Legacy shop floor, people wearing hard hats, right? You don’t have a computing center. I would come in probably with our consulting. See what you have. Create a plan on how we can acquire data. This goes back to the process that Florian was showing, right? We would come up with a plan: how do we tap into the data silos? Your factory is melting steel across five different locations. You have individual servers because Bob is maintaining the server. And Bob doesn’t document it, and Bob doesn’t like to connect it to the cloud. And we have to tap into that. We provide a lot of interfaces for this. This takes probably a couple of weeks to roll out, because the complexity is not the software, but rather getting there.
Sebastian Koch: 00:51:25.598 Once we’ve tapped into the data silo, everything else is just using our software and us helping out that customer, assuming that you have no IT personnel or project managers. A really good example is with our mother company. We rolled out our platform within a couple of weeks, including setting up the project team. But they had very good staff. They had a scrum team already, agile methods in place. And there we would talk about a couple of weeks to deliver the first use cases that saved money in a large corporation. So we’re talking about azeti being the startup guys - being agile and bringing things - and a company that’s very corporate and very risk-[inaudible]. So answering the question, again, very hard. It depends. But we’re talking weeks, not years. And we don’t talk months, compared to other players in the market.
Caitlin Croft: 00:52:19.425 And I think it also kind of depends on where the data is being extracted from. In my experience, when I’m talking to other IoT customers, sometimes it takes them days, maybe even hours, to get it implemented, but it all depends on what the infrastructure is and if the machine is old, and how to get the sensor data from it.
Sebastian Koch: 00:52:40.200 Yeah. For a typical PLC - let’s say you have a machine with a PLC that we support - the longest it takes to get this machine connected to our platform is finding, “Bob, where’s the key to the enclosure?” Getting a patch cable because they don’t have any there. Getting it hooked up and asking the IT department to open up the firewall. The rest is fairly automatic on our side, because our software scans the machine, figures out what type of PLC it is, and gets the right data in there. It appears in a browser. And then you, as a customer, don’t have to care for this very tedious and painful homework of doing data acquisition. This is our value proposition, right? We are solving that issue. Because in the end, the customer will know what to do with the data. They are the process engineers. They know what to do with this particular PLC information.
Caitlin Croft: 00:53:29.473 All right. So I think this person missed the beginning of it, so do you mind just going over again which version of Influx you’re using?
Sebastian Koch: 00:53:38.178 We’re using Version 1.8.
Caitlin Croft: 00:53:41.188 And you’re using InfluxDB Enterprise, correct?
Sebastian Koch: 00:53:44.960 Yeah. Open-source, Enterprise, and Influx Cloud, all on 1.8.
Caitlin Croft: 00:53:48.986 Okay. So for using InfluxDB Cloud, how do you prevent the cost rising, for example, with the amount of storage and querying that you’re doing?
Sebastian Koch: 00:54:02.984 Yep. For our large-scale customers, the ones where we’re talking hundreds of gigabytes, an Enterprise cluster is probably the better choice. But for anything below 100 gigabytes, Influx Cloud scales fairly well. One neat detail, which also gave enough stability for our customer to go with Influx Cloud, is that if you have these performance peaks - for example, they are dumping a couple of gigabytes of data right now because they’re connecting a new machine - Influx will not upgrade you right away into double the price subscription. They will call you and say, “Hey, your cluster is actually above your limit. We upgraded you. Please get it under control so we can downgrade again. If not, we have to talk about a subscription.” And that’s really fair. We’ve been there, I think, three times over the last two years with a customer. And every time, the customer was either dropping tags - high cardinality PLCs dumping gigabytes of high cardinality tags - or we were actually upgrading. So that works well. But yes, if you want to dump an infinite amount of data into Influx Cloud, it’s probably not good for that. I hope I don’t destroy the value proposition of Influx Cloud here, [laughter] but it’s probably not for terabytes of data.
Caitlin Croft: 00:55:16.070 Well, so many companies are dealing with this right now, right, migrating to the cloud. So it depends on what best suits your need. And one of the reasons why I love InfluxDB is that it’s open source. So you can start playing with the open-source version. Figure out what you need. Figure out if Enterprise works best for you or Cloud. Is there any reason why you prefer MQTT over AMQP?
Sebastian Koch: 00:55:46.839 Yes. MQTT is way more mature. We have more libraries at the edge. We started seven years ago, right? We had to choose what was stable. And we had a research project where we were comparing different protocols, including AMQP, and what was the other one? STOMP, I believe. And MQTT was the clear winner. Back then, we were betting, because only three years later MQTT became, I think, an Eclipse project and was standardized. So this is one reason we tried it, and we went well with that. It has a broader community. And I don’t know if AMQP is ready nowadays to have things like parallel subscriptions, quality of service, Last Will and Testament, all of these things. But I’m not an AMQP person.
Caitlin Croft: 00:56:37.645 Fantastic. Thank you, Sebastian. Thank you, Florian. There was a ton of questions. Thank you, everyone for joining today’s webinar. If you have any further questions for Sebastian or Florian, please feel free to email me. I’m happy to connect you with them. Once again, the session has been recorded and the slides will be made available in addition to the recording later today.
Sebastian Koch: 00:57:03.419 And please don’t hesitate. Email us if you have more questions. Just email me or hit me up on LinkedIn. I’m happy to get into a conversation. Doesn’t need to be about business, but we can just talk about technology. We love to connect. Just feel free to open the conversation. Thanks.
Caitlin Croft: 00:57:19.910 Thank you both and I hope everyone has a good day.
Sebastian Koch: 00:57:23.581 Thanks.
Florian Hoenigschmid: 00:57:25.678 Thank you. Bye.
Florian Hoenigschmid
VP Strategy & Sales, azeti
Florian Hoenigschmid is VP of Strategy & Sales at azeti. Since 2013, Florian has been helping industrial clients from the telecommunications, manufacturing, and process industries to build and implement their digital transformation strategies. He has worked with large enterprises in North America and their European counterparts and successfully drove the co-creation of large-scale IIoT solutions to digitalize maintenance and production processes. Over the last 8 years, Florian has held different positions in product management, strategic alliances, business development, and sales.
Sebastian Koch
Managing Director, azeti
As Managing Director, Sebastian is, among other things, responsible for products and engineering, and he is particularly keen on modern operations, automation, and engineering culture. In 2020, he successfully led the acquisition of azeti by Aurubis AG. He joined azeti in 2010 and has held different roles ranging from customer consulting and engineering to management. His free time is well occupied by his three kids (a boy and a pair of twin girls) and trying to finish DIY projects (an outdoor kitchen) at home.