Getting Started: Edge-to-Cloud Architecture with MQTT and InfluxDB 3.0
Session date: Oct 10, 2023 08:00am (Pacific Time)
HiveMQ is the creator of an MQTT broker that makes it easier to move sensor data to and from connected devices in an efficient, fast, and reliable manner. It is designed for cloud native deployments to make optimal use of cloud resources. The MQTT protocol helps reduce the amount of network bandwidth required for moving data, and efficient IoT solutions result in lower total costs of operation. InfluxDB is the purpose-built time series database, now with unbounded cardinality. InfluxDB 3.0 is the newest core of InfluxDB built with Rust and Apache Arrow.
Discover how to use MQTT and InfluxDB to build your own modern industrial IoT architecture. In this webinar, we will demonstrate how to use HiveMQ Edge — an open source MQTT gateway for IIoT with InfluxDB. You’ll learn how to connect HiveMQ Edge to an OPC-UA data source and push the data to HiveMQ Cloud, and how to use InfluxDB for data persistence and analysis.
Join this webinar as Kudzai Manditereza dives into:
- Industrial IoT monitoring best practices
- An overview of how to use HiveMQ and InfluxDB 3.0 together
- Live demo
Additional resources:
- HiveMQ Edge Download
- Line 1 Simulation JSON
- Line 2 Simulation JSON
- MQTT Essentials
- Unified Namespace (UNS) Essentials
Watch the Webinar
Watch the webinar “Getting Started: Edge-to-Cloud Architecture with MQTT and InfluxDB 3.0” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
[et_pb_toggle _builder_version=”3.17.6” title=”Transcript” title_font_size=”26” border_width_all=”0px” border_width_bottom=”1px” module_class=”transcript-toggle” closed_toggle_background_color=”rgba(255,255,255,0)”]
Here is an unedited transcript of the webinar “Getting Started: Edge-to-Cloud Architecture with MQTT and InfluxDB 3.0”. This is provided for those who prefer reading to watching the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
Speakers:
- Caitlin Croft: Customer & Community Marketing Director, InfluxData
- Kudzai Manditereza: Developer Advocate, HiveMQ
Caitlin Croft: 00:00
Hello, everyone, and welcome to today’s webinar. My name is Caitlin Croft. I’m really excited to have our friends from HiveMQ here today. Kudzai will be talking about how to get started creating an edge-to-cloud architecture with MQTT and InfluxDB 3.0. Please post any questions you may have in the Q&A. We will definitely be monitoring those and answering them at the end. And once again, the recording and the slides will be made available by tomorrow morning. And without further ado, I’m going to hand things off to Kudzai.
Kudzai Manditereza: 00:39
Thank you. Thank you so much, Caitlin, and welcome to everybody. Thank you so much for joining our session today. My name is Kudzai Manditereza. So, today we’re going to be talking about the topic of edge-to-cloud architecture using MQTT and InfluxDB 3.0. And this is going to be particularly oriented around industrial IoT architecture, kind of like giving you a blueprint, a framework for you to kind of integrate your data using MQTT and then have that stored in an InfluxDB database. And the framework, really, for our discussion today is going to be centered around the concept of a unified namespace to show how MQTT and InfluxDB combine to kind of give you that architecture that allows you to easily digitally transform or build systems that allow you to integrate data from the OT world into the IT.
Kudzai Manditereza: 01:39
So, again, my name is Kudzai Manditereza. I’m a developer advocate at HiveMQ and I’m also an industry 4.0 researcher, and I’ve also run an independent YouTube channel called Industry40.tv. And you can connect with me on LinkedIn. I’m Kudzai Manditereza. And you can also send me an email, if you have got any questions, on the email that you see on the screen. Okay, so for a quick introduction, just to kind of give you a basic background of what MQTT is, for those of you who are not familiar with the protocol, and for those of you who are also familiar with it, kind of give you a refresher of exactly what MQTT enables in an IoT setup. So basically, MQTT is an IoT communication protocol that is based on a publish/subscribe mechanism of communication whereby instead of having two components connect directly to each other to exchange information, instead, you have got a third-party server which is called an MQTT broker, which sits in the middle between all the different clients and coordinates all the communication that needs to happen between these multiple participants of an IoT network.
Kudzai Manditereza: 03:05
So, this is different from HTTP, where you’ve got a direct connection between a client and server, and you’ve got this synchronous request-response exchange of information. So, with MQTT, this is really an asynchronous communication that is achieved through a message-oriented communication or a message-oriented middleware. And the big, really, advantages with MQTT, particularly in the industrial IoT space, is this idea that it’s an open architecture, it’s an open standard, meaning that, really, no one owns the standard. Any vendor, any system, any developer can pretty much implement MQTT without having to worry about any royalties. And the other issue is that it is lightweight, which is a big issue, particularly in the industrial IoT space where you might find a situation where you’ve got multiple PLCs that have got thousands of tags, some of them, and you want to have a way that is able to transmit all those tags in a way that is really lightweight, is not heavy on the network. Because the idea, really, with digital transformation or industrial IoT is that we want to get all the data up to the IT. As much data as possible because eventually, this data is going to be used by things like analytics, AI, or different applications. So, you want to get as much data as possible. And the idea, really, here, is that you’re not making any assumptions about how this data is going to be used, as opposed to the old way of doing things, whereby, for example, you’ve got your request-response protocol, like your point-to-point protocols, whereby because you have got to make that manual direct connection to a specific tag or group of tags, you can only connect to so much data points. You can only get data that you think you might need to use.
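The decoupling described here can be sketched in a few lines. This is a toy, in-process stand-in for a broker, not HiveMQ and not a real MQTT client; the `TinyBroker` class and the topic names `Line1/FillingMachine` and `Line2/StorageTank` are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class TinyBroker:
    """Toy in-process stand-in for an MQTT broker (exact-match topics only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A subscriber registers interest in a topic, not in a particular publisher.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        # The publisher hands the message to the broker and is done;
        # the broker fans it out to whoever is subscribed at that moment.
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = TinyBroker()
dashboard: List[str] = []
analytics: List[str] = []

# Two independent consumers of the same topic (hypothetical topic name).
broker.subscribe("Line1/FillingMachine", lambda t, p: dashboard.append(p))
broker.subscribe("Line1/FillingMachine", lambda t, p: analytics.append(p))

delivered = broker.publish("Line1/FillingMachine", '{"temperature": 21.5}')
unheard = broker.publish("Line2/StorageTank", '{"level": 80}')
```

The publisher never references the dashboard or the analytics consumer, and a message on a topic nobody subscribed to is simply delivered to no one. A real broker adds networking, sessions, and quality-of-service levels on top of this idea.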
Kudzai Manditereza: 05:06
But with MQTT, it allows you to push all the data up to the MQTT broker and have any application that is interested in that piece of information, consume it, and use it as it pleases, which is really kind of like the central concept around this idea of edge-driven. Information is being pushed from the edge into the IT, into the broker, into the server, which is different from having a master, so to speak, application always requesting information each time without knowing whether the information has changed or not. So here with MQTT, information is being pushed by the client from the edge of the network whenever there is a necessity to do so. So HiveMQ, basically, is a company that provides an MQTT platform. So, this includes client libraries, which allow you to develop MQTT applications, whether for sensors, for devices, for PLCs, application, or whatever the case may be. We provide those client libraries; C#, Python, and all the different. And then recently, we introduced a HiveMQ Edge, which is an embedded MQTT broker that is lightweight, which can deploy at the edge. And most of the demonstration here today is going to be centered around HiveMQ Edge in combination with HiveMQ Cloud. And we’re actually going to have a practical demonstration, by the way, to show you how to integrate data using some real scenarios.
Kudzai Manditereza: 06:48
Now, we also have the HiveMQ Platform, which consists of a fully managed HiveMQ Cloud, which is an instance that you can sign up for for free, and then you can connect up to 100 clients. And then we have got different versions of HiveMQ Cloud. We’ve got Starter and the different ones. So, you can check out hivemq.com to kind of get a picture of exactly what is included in HiveMQ Cloud fully managed. And then we’ve also got the HiveMQ Self-Managed, which allows you to deploy HiveMQ in your own data center, or if you want to deploy it on Azure Cloud. But as a standalone MQTT broker, we provide that. We also have got Kubernetes operators and all the tooling that you need to develop your MQTT platform. And recently, we introduced a HiveMQ Data Hub, which allows you to monitor your MQTT traffic to check for the validity of the data. We’ve also got extensions that allow you to connect to multiple databases. We actually do have an extension that allows you to connect to a InfluxDB database. But today, what we’re going to be showing you is using the native connector that is inside the InfluxDB Cloud or Serverless platform. So, we’ve got multiple ways that we could use to connect to InfluxDB using also the Telegraf. But today we’re going to show you the native connection to that.
Kudzai Manditereza: 08:18
So now, for those of you who are not familiar with, really, InfluxDB, so, InfluxDB is a platform that allows you to manage different types of time series data. So, as you may be aware, really, a lot of IoT sensors and IoT device machines produce data that is timestamped. And for you to make sense of that data, really, it needs to have that timestamp to it so that whenever analytics need to process that data, there’s always a reference to time to know exactly when this happened. And in some instances, you need a time resolution that is as low as milliseconds or nanoseconds, depending, again, on the use case that you are building up, or you want to store data in 10 seconds or in second resolutions, it’s up to you. But InfluxDB really allows you to kind of get this fine resolution of your time series data by offering this purpose-built database. And one of the big things, really, about InfluxDB is that it allows you to run this in different environments. So again, they’ve also got InfluxDB Cloud Serverless, which is a managed service that allows you to spin up a database, which I’ve done for this demo, and I’ll show you how you can do that. For free, you can actually connect your broker or your devices to an InfluxDB database and then start to see all your information go through. And then they’ve also got InfluxDB Cloud Dedicated and also InfluxDB Clustered, which you can host on-prem, right?
Kudzai Manditereza: 10:03
And kind of, like, the big thing here, really, is that InfluxDB recently announced InfluxDB 3.0. For some of you who are already familiar with InfluxDB, this database was rewritten in Rust and kind of using the Apache Parquet file format, which allows you to save up to about 90% of storage costs. So, it was really a different approach that they brought to this. And also, one of the big things with InfluxDB 3.0 is really unlimited cardinality, which means you really don’t have a limit size to the amount of data that you can store, which kind of opens up a whole lot of different use cases that were not possible before. And there’s some collectors that you can use to collect data for you to store into InfluxDB. And they also provide client libraries for scripting languages. There’s an API, and then again, there’s also the time series database which actually does the storing of your time series data. So, this is also another picture that gives you a clear view of what InfluxDB 3.0 is.
Kudzai Manditereza: 11:19
So, on one end there, you see you’ve got your IoT sensors and machines which are your data sources. They’re all producing timestamped data and then that data needs to be collected and persisted in a database storage. So, InfluxDB again, as I’ve already mentioned, provides different mechanisms to be able to do that. There’s Telegraf, which is a server agent that allows you to connect to different data sources and be able to act as a bridge to persist that information into InfluxDB time series database. And then there’s also client libraries and there’s also more tools that you can use to collect data into InfluxDB. And there’s the actual storage itself, and you’re able to use some SQL queries to actually dig through your data. And then, there’s also integrations that allow you to then perform some data visualizations and analytics and machine learning applications and more that you can then use to dig through your data and generate those insights, which is basically what you really want to get to at the end of it all when you’ve connected your OT data to IT data. You want to be able to, kind of, get the insights that then inform the decisions as a company, as a manufacturing company, to either produce more or operate efficiently, whatever the case may be. The ideal situation is that you end up with an analytics dashboard that kind of gives you all of that information, which is what InfluxDB allows you to do.
Kudzai Manditereza: 12:52
So, now we’re going to jump right into the demo without further ado. But before we jump right into the demo, I want to kind of discuss the current lay of the land, as it were, as far as the industrial landscape is concerned. For some of you who are in industrial automation, they might be familiar with this automation pyramid, which is really a way of demarcating between different zones within a manufacturing operation where you’ve got level 0, where you’ve got your sensors and actuators, and then you’ve got another level up where you’ve got your PLCs, you’ve got your SCADA, you’ve got another level for your MES and your ERP, and all the way up to the cloud. So, this is kind of, really, a way of creating these demarcations for security reasons and also having these systems that are sort of common in the time that they take to process this information. For example, the sensors and actuators, they kind of process information within milliseconds because they are connected directly to the process. So, if you’re in a chemical plant, you are getting all of these chemical changes within milliseconds. So, you want your sensors to be able to react in that specific time range. And then you’ve got your PLCs that also have got an order of time in which they operate to your SCADA system, to your MES, and ERP. So, I’m not going to delve much into this, but you can check it out and find out more about this.
Kudzai Manditereza: 14:27
So, this is really a restrictive way of kind of connecting or integrating data from OT to IT because what it means is that data needs to move in kind of a linear fashion from the sensors to the PLCs, from the PLCs to the SCADA, from the SCADA to the MES, from MES to ERP. And if you are familiar also with the industrial landscape, you might know that there’s specialists for each different level. There’s PLC specialists, there’s MES specialists, there’s SCADA specialists, and there’s ERP specialists. What that means is that you end up having to spend a lot of engineering costs or a lot of engineering effort just to integrate your OT data into IT data. So, what we’re going to talk about today is kind of like an architecture, a framework that allows you to seamlessly integrate your data from OT into your IT environment using MQTT and InfluxDB. And the framework of operation that we’re going to be talking about here is the unified namespace.
Kudzai Manditereza: 15:30
So, this is kind of a high-level picture of an architecture that you’d use for integrating your OT data to IT or cloud using MQTT. So, basically, what you see here is that we’ve got an edge or OT network on one end and then we’ve got an IT or cloud network on the other. So, typically, what you want to do is that you want to deploy an MQTT broker or multiple MQTT brokers at the edge. I mean, you could push data straight from the PLCs or your systems into the cloud, but for some latency issues and other issues where you want to keep your data local, you could deploy your MQTT brokers at the edge, right? And then you have all those federated brokers pushing their information to a centralized MQTT broker either in the cloud or within a factory. So, we’re actually going to show you how that architecture looks like in detail after this. So, this is kind of like an overall high-level picture where you see HiveMQ Edge, which has got protocol adapters, allows you to connect to machines or equipment, and PLCs to collect all of this data that is normally not well-organized and normally not able to integrate directly to MQTT. So, you want to be able to connect this data first and then be able to contextualize that data, normalize it, and push it into the MQTT network.
Kudzai Manditereza: 17:12
So, the HiveMQ Edge provides those adapters that allows you to convert from legacy protocols such as Modbus and OPC UA. And you could also connect MQTT clients directly to that embedded MQTT broker within HiveMQ Edge. So typically, that’s how you’d deploy it. You have federated brokers at the edge that are collecting data locally and then using an MQTT bridge to push that information to a central server or a central MQTT, which then creates a unified namespace, and then use a connector to then connect that information to InfluxDB Cloud, which is what then historizes all of that information. So, for some of you who might be familiar with the concept of the unified namespace, you’d understand that, really, the unified namespace is an event hub. All of this information that is being pushed from PLCs or machines through that broker federation is a group of events. It’s event-driven architecture in a way where you’ve got information that is changing, whether it’s temperature changing from 20 degrees to 25, that’s an event. That data gets pushed up. So, your MQTT broker is all about holding the data that is in motion. And then you need to have a way– because with data in motion, you are not able to kind of have a historical data. And an MQTT broker, really, is not a data store. So, this is where InfluxDB comes in as a time series database that is capable of consuming your unified namespace and historizing all of this information so that when you need to train your models, when you need to dig through your data, or you need to go back to see what happened, or whatever the case may be, you’ve got all your data stored in a time series database. And for real-time events, you can then look through your unified namespace. So, it’s a way of combining your event hubs with a data storage mechanism that gives you a complete picture of a unified namespace.
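To make the split between data in motion and data at rest concrete, here is a minimal sketch of what a historian adds on top of an event hub. It is a toy in-memory store, not InfluxDB; the `Historian` class, the topic name, and the sample values are all assumptions for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Point = Tuple[int, float]


class Historian:
    """Toy time series store: keeps every (timestamp, value) it is given, per topic."""

    def __init__(self) -> None:
        self._series: DefaultDict[str, List[Point]] = defaultdict(list)

    def record(self, topic: str, timestamp_ms: int, value: float) -> None:
        # Assumes events arrive in time order, so a plain append keeps the list sorted.
        self._series[topic].append((timestamp_ms, value))

    def query(self, topic: str, start_ms: int, end_ms: int) -> List[Point]:
        # Return all points with start_ms <= timestamp <= end_ms.
        points = self._series.get(topic, [])
        timestamps = [ts for ts, _ in points]
        lo = bisect_left(timestamps, start_ms)
        hi = bisect_right(timestamps, end_ms)
        return points[lo:hi]


historian = Historian()
# Hypothetical topic and readings: a temperature moving from 20 to 25 degrees.
topic = "FreshFarmDairy/Munich/Packaging/Line1/FillingMachine/temperature"
for ts, value in [(1000, 20.0), (2000, 22.5), (3000, 25.0)]:
    historian.record(topic, ts, value)

window = historian.query(topic, 1500, 3000)
```

A broker delivers each of those three events once and moves on. The store is what lets you ask later what happened between two points in time.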
Kudzai Manditereza: 19:18
So, now let’s kind of go through the actual scenario that we’re going to be demonstrating today to kind of really drive the point home about this idea of exactly how do you go about building this edge-to-cloud architecture using MQTT? Because one of the things that I find a lot and that we find a lot is kind of like a lot of companies in the manufacturing space because MQTT is kind of like a different way, really, of approaching data integration, we’re used to a client server or request-response communication, so it’s difficult to say how do I position my broker? Where do I put my broker? I’ve got my production lines, I’ve got my SCADA systems, I’ve got my IT network, I’ve got this and that, where exactly do I put the different components that make up this unified namespace architecture or this MQTT architecture within a factory? So ideally, what you want to do is you want to be able to deploy a small footprint MQTT broker at each area of your production facility. So, if you’ve got a packaging area and then you’ve got maybe a separate area where you’re performing some other operations, you want to be able to deploy your broker within each area. So that’s kind of the standard approach that this architecture recommends, but it’s not like a hard and fast rule. So, we have seen a lot of people actually even go down and deploy broker per line. For each line, have a small broker. But ideally, what you want to do is that you want to be able to deploy your broker as close to the source of the data as possible. So, in this case, what we’re doing is we’re having a broker deployed at each area of your production plant.
Kudzai Manditereza: 21:09
So here, we’ve got a HiveMQ Edge, which is deployed at a packaging area. And as you can see, we’ve got line one, which is where the filling happens, and then you’ve got a line two, we’ve got the storage tank. So, there might be other operations or other cells within that particular line. But for demonstration purposes, we’re picking the filling machine as that work center within that line one. And then here we’re picking the storage tank. So, what you see is if you are familiar with MQTT or not, MQTT uses topics to push information or to publish information, which is more like an equivalent of tags, if you are familiar with the industrial automation where you’ve got the name, which is the tag and the actual value of the tag. So, your topic then becomes a tag. But then MQTT allows you to represent the name or the source of your data using this hierarchy, where you’re using this forward slash to create a hierarchy. So, what you’d then be able to do in this case is that I’ve got a device, a groov RIO. If you look behind me there, I’ve got a groov RIO where I’m simulating data from that groov RIO and I’m publishing that data under the topic line one filling machine. So that is the topic that I’m publishing it. And then on line two, I’ve got a Raspberry Pi also there where I’m simulating some sensor data and then I’m publishing that under line two storage tank. So that is data that is coming from a storage tank. And this information is being pushed to HiveMQ Edge, which is a local MQTT broker. So, it’s essentially creating a local unified namespace where you’ve got the root being the packaging area. That’s your root namespace.
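The topic hierarchy can be sketched as a list of levels joined by forward slashes. The exact topic strings used in the demo are not spelled out in the transcript, so `Line1/FillingMachine` and `Line2/StorageTank` below are assumed spellings, and `build_topic` and `split_topic` are hypothetical helpers.

```python
from typing import List


def build_topic(*levels: str) -> str:
    """Join topic levels with '/', rejecting characters reserved for topic filters."""
    if not levels:
        raise ValueError("a topic needs at least one level")
    for level in levels:
        # Empty levels are rejected here as a design choice of this sketch.
        if not level:
            raise ValueError("topic levels must not be empty")
        # '/' separates levels; '+' and '#' are wildcards and may not appear
        # in a topic name that is published to.
        if any(ch in level for ch in "/+#"):
            raise ValueError(f"invalid character in topic level: {level!r}")
    return "/".join(levels)


def split_topic(topic: str) -> List[str]:
    """Split a topic back into its hierarchy levels."""
    return topic.split("/")


filling_topic = build_topic("Line1", "FillingMachine")
tank_topic = build_topic("Line2", "StorageTank")
tank_levels = split_topic(tank_topic)
```

Within the packaging area's local broker these two-level topics are enough, because the area itself is the implied root of the local namespace.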
Kudzai Manditereza: 22:59
So, these devices are able to exchange information within their local MQTT namespace without knowing about the enterprise namespace. So, this allows you to exchange information and communicate within your area using that particular MQTT broker. But then now, if you need to push that information for context within an enterprise, this is where you use HiveMQ Edge to then append your enterprise, site, and area names to all this information. So, all this information that is coming from your different lines then gets appended to this topic namespace, which is then pushed to your enterprise broker. So, this could be pushed directly to the cloud, or you could have another broker that sits within a particular site which is collecting all of this information from different areas and then push it to the cloud. Again, there aren’t really any hard and fast rules about how many levels of brokers you need to stack up to build that architecture. So, it really depends. But what we’re showing here currently is where you have got the packaging area that is pushing that information to a local broker.
Kudzai Manditereza: 24:13
So now, as you would imagine, you’d have multiple areas that have got multiple embedded MQTT brokers that are creating local namespaces and then using an MQTT bridge to then push that information out to a cloud-based MQTT broker. So, let’s jump into the demo quickly. So, first of all, as I have already mentioned, I’m simulating some data within my groov RIO. So, what you’re currently looking at here is a Node-RED flow that is running within my groov RIO that is simulating all this data. So, as you can see, I’m simulating filling temperature, filling pressure inlet, and filling speed. And then here, I’m also attaching some metadata or some data that is not necessarily the measurements. So, you have got the asset type. What asset is this? So, this is, again, based on how you plan this as an organization to say, “How do we create this model? What information makes sense for us to include in that particular object that we’re pushing from the edge?” And by the way, here, because this is an MQTT client, we’re publishing this information to the local broker as an MQTT client. But if this was an OPC UA device or a Modbus device, this would be collected using the protocol adapters on HiveMQ Edge, which I’m going to show you shortly on how you are able to do that.
Kudzai Manditereza: 25:49
So, we’re enriching this payload using all of this information that would help us identify it when it then lands on InfluxDB if we then want to do some querying or all the kind of analysis that you want to do. And then, this is really being joined into a JSON object. So here, basically, I’m creating a key-value pair from all this data that I’m simulating, and then, finally, I’m publishing this information too. So I’m currently running HiveMQ Edge on my Mac and I’m publishing under this topic, if you notice this from the previous slide where I showed you this. So, this is the topic that I’m specifying to say I’m publishing this data to this particular topic namespace. So, this data is being published currently. As you can see, we’re connected to this MQTT broker. So next, we have got– so let me see if I can move this. Right, so what we’re seeing here is now data that is coming from the Raspberry Pi. Again, this is a flow that is simulating data for a storage tank, right? And basically, here, again, we’re creating all this data and then publishing this under this topic namespace, which is a line two storage tank. So, this is also publishing to that HiveMQ Edge broker that is running on my MacBook.
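A payload of this shape might be assembled as follows. This is a sketch in Python, not the Node-RED flow shown in the demo. The key names (`asset_type`, `filling_temperature`, and so on) and the value ranges are assumptions; only the list of simulated measurements and the idea of attaching metadata come from the talk.

```python
import json
import random
import time
from typing import Dict, Union

Value = Union[str, int, float]


def build_payload(rng: random.Random, timestamp_ms: int) -> str:
    """Combine simulated measurements and metadata into one JSON object."""
    reading: Dict[str, Value] = {
        # Metadata: describes the source rather than measuring anything.
        "asset_type": "FillingMachine",
        "line": "Line1",
        # Simulated measurements (ranges are arbitrary, for illustration only).
        "filling_temperature": round(rng.uniform(18.0, 25.0), 2),
        "filling_pressure_inlet": round(rng.uniform(1.0, 3.0), 2),
        "filling_speed": rng.randint(80, 120),
        # Timestamp taken at the edge, in milliseconds since the Unix epoch.
        "timestamp": timestamp_ms,
    }
    return json.dumps(reading)


payload = build_payload(random.Random(42), int(time.time() * 1000))
decoded = json.loads(payload)
```

Stamping the reading at the source, rather than on arrival, keeps the recorded time meaningful even if the message is delayed on its way to the database.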
Kudzai Manditereza: 27:26
So, now we’ve got two clients that are pushing this information under that namespace. So, now let’s go into HiveMQ Edge to actually see this information. Okay, so I need to move this. Okay, so what you’re looking at here is the interface for HiveMQ Edge. So, sign in. So, basically, this is the home page. Whenever you need to connect anything, that’s what you’re going to see first. And then if you need to go into Protocol Adapters, you can go to Protocol Adapter Catalogs. As you can see here, we’ve got a Modbus to MQTT Protocol Adapter, and then you have got OPC UA to MQTT Protocol Adapter and the different adapters including the HTTPS. And then you can also simulate your edge device here if you don’t have a real device that you want to connect to. And by the way, I need to mention here that this HiveMQ Edge is 100% open source, so you can just download this and start playing around with it right away. There’s no licensing at all that is required for this. So, we’ll share the information at the end of this slide with all the information about the download instructions where you’ll be able to get this.
Kudzai Manditereza: 28:51
So, these are all the protocol adapters. So, you could go to OPC UA, you could specify information about your OPC UA server, the subscription, security mechanism, and all the different authentication that you need to do here. So currently, since we’re receiving all of this information, I’m going to show you how we’re actually doing the appending where we’re adding that unified namespace because this information is coming from your line one and line two. But now we want to forward this to an enterprise broker or a high-level broker. So how do we do this? We go to Unified Namespace. So, basically, what we provide here on the HiveMQ Edge is a structure that follows the ISA-95 hierarchy, which allows you to kind of create your unified namespace using enterprise, site, area, line, and cell. So what I’ve done here is I’ve created Fresh Farm Dairy. So, this is basically a dairy production plant that is located in Munich, and this is located in the packaging area. So, I could extend this further and put more, but remember, we’re already publishing information from the packaging area, so I need to extend this by just specifying the area, the site, and the entire enterprise.
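The effect of that namespace prefix can be sketched as plain string handling. This only illustrates the resulting topic structure and is not how HiveMQ Edge implements it; the `NamespacePrefix` class and the exact spellings `FreshFarmDairy`, `Munich`, and `Packaging` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespacePrefix:
    """Upper levels of an ISA-95 style hierarchy, owned by the edge gateway."""

    enterprise: str
    site: str
    area: str

    def apply(self, local_topic: str) -> str:
        # Local clients publish only line/cell levels; the prefix adds the rest.
        # Stray leading or trailing slashes on the local topic are trimmed.
        return "/".join(
            [self.enterprise, self.site, self.area, local_topic.strip("/")]
        )


prefix = NamespacePrefix("FreshFarmDairy", "Munich", "Packaging")
enterprise_topic = prefix.apply("Line1/FillingMachine")
```

Keeping the enterprise, site, and area levels in one place means the devices on the line never need to know where they sit in the wider organization.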
Kudzai Manditereza: 30:12
So, once I’ve specified the namespace prefix here, I’ll then go back to HiveMQ MQTT Bridges. So, the bridge is then what then determines to say what information do I forward to this namespace. So, I’ve already connected to HiveMQ Cloud where I’m pushing all of this information. And if I go here, you’ll see. So, I’ve got this information where I’m specifying the URL and port number and all the broker configuration details, security details here. But more importantly, here, I’m able to specify the filters to say, “These are the topics that I want to push to a high-level broker or to an enterprise namespace.” But in my case, I’m receiving information from line one and line two for temperature and filling machine, and I want to push everything up. So, I’m not filtering through this information. But you might be in a situation whereby you’re receiving information from many different devices, and it doesn’t make sense for you to push everything up to the namespace. So, this is where you use this filter. So here, I’m using the hash sign to filter everything from incoming and then push everything to the destination. So, I’m filtering everything that is coming and pushing it under the prefix of this unified namespace that I’ve specified here. So, which means now we have got data coming from our different production areas, from different lines, going into a HiveMQ Edge broker, and then it is now being pushed to HiveMQ Cloud, which is a serverless broker that you can set up.
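The filter step can be sketched with the standard MQTT wildcard rules: `#` matches all remaining levels and must come last, and `+` matches exactly one level. The functions `topic_matches` and `bridge_forward` are hypothetical helpers, the topic and prefix spellings are assumptions, and special handling of topics beginning with `$` is left out.

```python
from typing import List, Optional


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, and it
            # matches everything that remains (including nothing at all).
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


def bridge_forward(local_topic: str, filters: List[str], prefix: str) -> Optional[str]:
    """Return the remote topic for a local message, or None if it is filtered out."""
    if any(topic_matches(f, local_topic) for f in filters):
        return f"{prefix}/{local_topic}"
    return None


PREFIX = "FreshFarmDairy/Munich/Packaging"  # assumed spelling of the demo prefix

forward_all = bridge_forward("Line1/FillingMachine", ["#"], PREFIX)
line_two_only = bridge_forward("Line1/FillingMachine", ["Line2/#"], PREFIX)
```

With the single filter `#`, every local topic is forwarded under the prefix, which is the setup used in the demo. Swapping in a narrower filter such as `Line2/#` would keep line one's traffic local.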
Kudzai Manditereza: 31:54
So, let me quickly show you HiveMQ Edge here. HiveMQ Cloud, which is where all this information is going. So basically, if you need to sign up for HiveMQ Cloud, you go to the HiveMQ website. As you can see here, you can sign up for free. And then you’ll be able to create a broker cluster. Two broker clusters, actually, that allow you to connect up to 100 devices. So, I’ve got two clusters already here, as you can see. So basically, you go in here, you’ll see all of this information that you need to connect to it. So, I’m currently connected to this. And then you can also specify your connections, your access management. You can specify your access credentials here. I’ve got my HiveMQ credentials specified here. And we have also got some integrations in case you want to integrate to some other platforms there. So basically, that’s all you need to do to sign up to HiveMQ Cloud. So, I’m already pushing information to HiveMQ Cloud. So here, we’re going to use this MQTT.FX MQTT client that is subscribed to the HiveMQ Cloud. And so, all of this information that we’re looking at is information that is being pushed directly to HiveMQ Cloud.
Kudzai Manditereza: 33:15
So, what you notice here is that information is coming on these two different namespaces. So, now we’ve got a unified namespace where we’ve got Fresh Farm Dairy, Munich, packaging, line two, and storage tank, and then you’ll see shortly we’ve got line one filling machine again. So, there you can see all of this information is being pushed here. And here, we’ve got our JSON object that is coming through with all the information that you specified and all the data and the timestamp that is going to be important when we then go into InfluxDB. So, now we’ve got our architecture that is using MQTT at the edge, pushing this information to the cloud. And already, you’ve got a unified namespace. If you want to consume this information from different applications, say you have got an MES system, or you’ve got a custom application that needs to consume information from this unified namespace, you’re already able to subscribe to HiveMQ Cloud and receive information from your entire enterprise just by subscribing to that one broker. Information is already filtered through and then you can get all the information that you need. If you just need information from line two, you subscribe to line two. And suppose we’re also publishing OEE information, let’s say we’re publishing the overall equipment effectiveness of a filling machine, that would also live under that particular line or that particular topic namespace. So, you’ll be able to subscribe to receive that information from it.
Kudzai Manditereza: 34:47
So now, we want to historize all of this information by putting it into a database. Our unified namespace needs to live in that time series database, which is our InfluxDB. So, we now need to connect HiveMQ Cloud to our InfluxDB instance, and then be able to persist that information and use it for analytics at a later stage. So again, if you need to sign up for InfluxDB Cloud, you get a free account that allows you to start collecting all of this MQTT data, persisting it, performing all sorts of operations that you need with that data, and integrating it with many other third-party systems if you wish. So, if you want to sign up for an InfluxDB free account, again, you just go on to click on InfluxDB. So, I’ve already signed up. So, when you do it for the first time, it allows you to register. So, you can use your email to create an account, and you then get your free credit, as you can see here. So now, since I’ve already signed up for my InfluxDB account, let me quickly go in here and show you what we need to do initially.
Kudzai Manditereza: 35:57
So basically, I mentioned at the start of this call that there are many ways, really, of ingesting data into InfluxDB. You could use Telegraf, or you could use some client libraries that send data directly, but in this case, we’re using the native subscriptions within the InfluxDB Cloud platform that allow you to subscribe to an MQTT broker. So, in this case, as you can see, I’ve got two subscriptions. What is really interesting with the unified namespace is that within InfluxDB now you’re able to subscribe to specific topics. So again, InfluxDB is for timestamped data, but you might find that in a unified namespace you have got some data that is not really timestamped, right? So, you want to be able to subscribe to the specific topics that are giving you all of this raw information that is timestamped. So, as you can see here, I’ve created two subscriptions. The first subscription allows me to subscribe to line one, so I’m subscribed to receive all the data from line one. So, what I need to do here is to first create a name for the subscription, provide the hostname or IP address of my HiveMQ Cloud here, the port details, the security credentials, and then here, specify the topic to which you are subscribing to receive information.
Kudzai Manditereza: 37:18
So, you can see here, for this subscription, I’m subscribing to receive all information that is coming from line one. So, if you remember, this is your filling machine data, which means we’re receiving all filling machine data. So, if you have got different PLCs, system sensors within your line one, you receive all of this data here. Now, my data is JSON. You could use InfluxDB Line Protocol here, and if you’re using InfluxDB Line Protocol, you could use some regular expressions, but here, I’m using JSON, so I need to define the parsing rules for JSON for the data that I’m receiving. So, this is what I’ve done here. So, because I’m sending timestamped data from my groov RIO and Raspberry Pi, I’m getting that time property and then using it as a timestamp within the InfluxDB database. And then here, I’m taking the asset type which I specified, and then I’m using it as a measurement within InfluxDB. And then I’m taking the asset ID and using it as a tag within InfluxDB. And then from there, I then create different fields. So, I’ve got temperature, which is the (inaudible), and then I reference it, which is the filling temperature here. So, this is the name within InfluxDB, and then this is the name within the JSON object. And then I’m also reading pressure, and then I’m also reading the speed at (inaudible) per minute. And then I’m also reading the inlet pressure.
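The mapping described above (time property to timestamp, asset type to measurement, asset ID to tag, selected properties to fields) can be sketched outside the InfluxDB UI as a small function that turns one JSON payload into a line of InfluxDB Line Protocol. This is an illustration, not the subscription feature's implementation. The JSON key names (`timestamp`, `asset_type`, `asset_id`, `filling_temperature`, and so on) and the assumption that the timestamp is in epoch milliseconds are hypothetical, since the demo payload is not reproduced in the transcript.

```python
import json

# InfluxDB field name -> JSON property name (hypothetical key names).
FIELD_MAP = {
    "temperature": "filling_temperature",
    "pressure": "pressure",
    "speed": "speed",
    "inlet_pressure": "inlet_pressure",
}


def escape_measurement(value: str) -> str:
    # Line Protocol: commas and spaces must be escaped in measurement names.
    return value.replace(",", "\\,").replace(" ", "\\ ")


def escape_tag(value: str) -> str:
    # Line Protocol: commas, equals signs and spaces must be escaped in tags.
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def json_to_line_protocol(payload: str) -> str:
    """Convert one JSON message into a single Line Protocol line."""
    data = json.loads(payload)

    measurement = escape_measurement(data["asset_type"])
    tag = f"asset_id={escape_tag(data['asset_id'])}"

    # Keep only the mapped properties that are present in this message.
    fields = ",".join(
        f"{name}={float(data[source])}"
        for name, source in FIELD_MAP.items()
        if source in data
    )
    if not fields:
        raise ValueError("payload contains none of the mapped fields")

    # Assumption: source timestamp is epoch milliseconds; Line Protocol
    # defaults to nanosecond precision.
    timestamp_ns = int(data["timestamp"]) * 1_000_000
    return f"{measurement},{tag} {fields} {timestamp_ns}"


sample = json.dumps({
    "timestamp": 1696950000000,
    "asset_type": "filling_machine",
    "asset_id": "FM-01",
    "filling_temperature": 72.5,
    "pressure": 2.1,
    "speed": 120,
    "inlet_pressure": 3.4,
})
print(json_to_line_protocol(sample))
```

The same function shape applies to the second production line; only the field map would change.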
Kudzai Manditereza: 38:47
So, if you have got all the other different data points that you want to measure, this is where you would specify them under the field property. And then you use your measurement and tag to represent your data. So, this is the same thing that I’ve done also for line two here to specify the topic that I’m subscribing to and then the parsing rules for data that is coming from there. And then, all of this information is actually going to a bucket that I’ve assigned to it. So, now I go to my Data Explorer to see all of this information that we’re actually publishing from the edge. So, as you can see here, this is the Demo Bucket. This is the bucket where I’m specifying that all this information that is coming from MQTT subscriptions, I need it to be persisted on this Demo Bucket. So, as you can see here, I’ve got my filling machine and storage tank data that is already showing here, and then, here, I can then select which data points I want from there, and then the asset IDs that are coming from my devices, and then these are the actual topics, and then I can then see all of this information that is currently being generated from the edge.
Kudzai Manditereza: 40:08
So already, you can see if you’re a process engineer, you’ve got all this data sitting here that you can then go on to do all different kinds of filtering, and then you can export this data, and then you can also perform all kinds of operation on this data. You can view the data in different ways. So, I could look at it from a table point of view. So, as you can see, I’ve got all the timestamps of the data, I’ve got the values, I’ve got the field, and then more importantly, I’ve also got the topic from which this data is coming from. So, you’ve got all your data that is sitting within InfluxDB database. It’s got all the topics, which means there’s context around where this piece of data is coming from, which you can then use later on if you’re connecting this to an external system or within InfluxDB itself, to then be able to query through all of your data and be able to get the information that will make you perform some informed decisions as far as your operations is concerned. So basically, this is the edge-to-cloud architecture that I wanted to show you today, but there’s more that you could be able to do here on InfluxDB as far as machine learning, the AI module, also, that they introduced that allows you to dig through your data and make all the different kind of insights that you can come up with here today. So that brings us to the end of this demonstration. I’m happy to take your questions here.
Caitlin Croft: 41:42
Awesome. Thank you, Kudzai. Really appreciate it. So, there are lots of questions for you, so let’s jump right into it. So, the first question I actually will answer. So, someone asked, could you please tell us the reasoning behind the deprecation of the native MQTT subscription in InfluxDB Cloud 3.0? And yeah, I’m happy to answer it. So, I’ll just be honest. The adoption of this feature was lower than anticipated and most customers are still using Telegraf, which is our native data collection agent, which provides more features for MQTT ingestion than this feature did. So, after the team looked at it, we determined that our customers would have a similar and better experience using the MQTT Consumer Telegraf plugin. So, I hope that answers it. I know it’s always tricky when organizations deprecate certain features that some are using, so I hope you understand. Let’s see. Kudzai, first question for you is, is SCADA the only supported protocol? What about Modbus, etc.?
Kudzai Manditereza: 43:10
Okay. So, I suppose you mean on the HiveMQ Edge platform. So, basically, the adapters that are supported currently, these are the adapters; we’ve got Modbus, OPC UA, or HTTP, and also, a simulated one. Now, as I mentioned, this is open source, and we actually have Siemens S7 protocol adapters and Allen-Bradley protocol adapters coming very soon. Those, I think, will be published—well, I can’t say for sure when, but generally, in Q4, I think we should see those being released, so. But just keep your eye on the GitHub repo, and then you’ll find out more about the other ones that we’ll be pushing through.
Caitlin Croft: 43:56
Right. Is there a way to use additional connectors for HiveMQ Edge for protocols like Ethernet IP, CIP, or Siemens S7 Plus optimized block access?
Kudzai Manditereza: 44:13
Yeah. So, basically, I think this question really kind of speaks to the previous one, where I mentioned that we are releasing S7 protocol adapters. That is, kind of, really top of priority right now because there’s been a lot of requests coming through for that. So, definitely, we are going to be supporting as many protocols as possible.
Caitlin Croft: 44:34
Right. All right, let’s see. A few people were asking if the recording was going to be available. Yes, the recording as well as the slides will be made available by tomorrow morning. And we make it super easy for you guys. You just have to go to the page that you registered for this webinar tomorrow morning, and it’ll actually be converted over to the recording and the slides. Also, you guys all should have my email address. So, if you guys can’t find it or need any help, feel free to email me. And if you are trying to connect with Kudzai, feel free to email me and I’m happy to connect you with him. All right. Can you have a different time stamp between the sensor’s measurement and InfluxDB?
Kudzai Manditereza: 45:27
I’m not sure I understand the question. So, basically, the question is, can you have a different time stamp for the sensor measurement? So basically, you choose: either you use the time stamp from the InfluxDB platform, or you use the time stamp from when the data was generated. So architecturally, you decide whether you want to use that time stamp from when that data was generated or the time stamp from when that data landed on InfluxDB. But ideally, you want to be able to use the time stamp from when that data was generated, because in some cases, you find that some PLCs buffer information, maybe when there’s a network outage. So, you want to be able to know exactly when that particular event occurred, right? So ideally, you want to use the time stamp from the source.
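The choice between source and arrival time stamps can be sketched as follows. The function name and the assumption that the source time stamp arrives as epoch milliseconds are hypothetical. The point of the sketch is that buffered readings delivered together after an outage keep their original event times only when the source time stamp is used.

```python
import time
from typing import Optional


def choose_timestamp_ns(source_ts_ms: Optional[int],
                        arrival_ns: Optional[int] = None) -> int:
    """Prefer the time stamp recorded at the source, else the arrival time.

    source_ts_ms: epoch milliseconds from the device (assumed unit), or None.
    arrival_ns:   epoch nanoseconds when the message reached the database
                  side; defaults to the current time.
    """
    if source_ts_ms is not None:
        return source_ts_ms * 1_000_000
    return arrival_ns if arrival_ns is not None else time.time_ns()


# Three readings buffered during an outage and delivered at the same moment.
arrival = 1_696_950_060_000_000_000
buffered_ms = [1_696_950_000_000, 1_696_950_010_000, 1_696_950_020_000]

with_source = [choose_timestamp_ns(ts, arrival) for ts in buffered_ms]
without_source = [choose_timestamp_ns(None, arrival) for _ in buffered_ms]

print(len(set(with_source)))     # distinct event times are preserved
print(len(set(without_source)))  # all readings collapse onto the arrival time
```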
Caitlin Croft: 46:20
Right. Herman—and I apologize if I mispronounced your name—if you have any further questions, I’m happy to unmute you if you want to expand on your question. How do you guarantee observability and monitoring for HiveMQ Edge?
Kudzai Manditereza: 46:43
So, we’ve got a log. So HiveMQ Edge does actually have an API endpoint that allows you to receive all the information, configuration information, and also a log file for you to get all of this health or status-related information that you need. So maybe if you’re talking about observability from a standpoint of tracking the movement of the actual metrics or the data throughout its journey, that is a capability that is available with the Enterprise version of HiveMQ, which connects with OpenTelemetry. It’s got an OpenTelemetry interface that allows you to monitor the movements of your information right from the source up until it leaves the MQTT broker. So, that is possible within the HiveMQ Enterprise edition.
Caitlin Croft: 47:41
Let’s see. What are the pros and cons of MQTT in HiveMQ as the transport and edge servant versus InfluxDB?
Kudzai Manditereza: 47:56
So, I mean, these are two different technologies altogether. So, InfluxDB is a time series database. HiveMQ is a messaging platform, right? One is a messaging platform, and the other one is a time series database. So, there isn’t really a comparison to make there. So, I’m not sure. Maybe if you want to unmute them if they’re on the call to find out exactly what they meant.
Caitlin Croft: 48:27
Let’s see here. Apologize, I was scrolling down to find the next question. So okay, Hans, I’m going to allow you to talk if you want to expand on your question. You should be able to unmute yourself.
Attendee #1: 48:46
Yes. Okay. Can you hear?
Caitlin Croft: 48:48
Yes.
Kudzai Manditereza: 48:48
Yes.
Attendee #1: 48:49
Okay. I’m just curious to know the pros and cons of InfluxDB Edge and the synchronization of InfluxDB Edge up to Influx Cloud, and then the approach shown here where you use MQTT and HiveMQ as the edge and the transportation.
Kudzai Manditereza: 49:16
Oh, okay. All right. So, basically, these are two different approaches in the sense that the other one is technology-focused, is MQTT-focused, whereby you’re connecting everything that is MQTT, and then the other one is InfluxDB to InfluxDB. So, you’re syncing what’s already within your InfluxDB Edge into InfluxDB Cloud. And then if you’re using MQTT InfluxDB, you’re taking from whatever is MQTT, might not necessarily be InfluxDB, into InfluxDB Cloud. So, I wouldn’t maybe say there is a pro and con for each approach. This is all kind of use case specific to say, “Based on what you want to achieve, what would make sense?” If you want to collect data from multiple devices or sensors that support MQTT right into InfluxDB in the cloud, then, obviously, you want to take that approach. But if you already have a deployment of InfluxDB at the edge and you want to move that data to InfluxDB Cloud, then, obviously, that makes sense. So, it’s use-case specific. I hope that makes sense.
Attendee #1: 50:34
It makes sense, but it doesn’t give me any pros and cons.
Kudzai Manditereza: 50:39
So, do you have a use case that you have in mind?
Attendee #1: 50:45
You explained the use case as if I already had an InfluxDB deployment at the edge.
Kudzai Manditereza: 50:53
Yes.
Attendee #1: 50:54
Then it would be natural to use that replication. Yes, of course.
Kudzai Manditereza: 50:58
Yes.
Attendee #1: 50:59
But what if I’m greenfielding here? What if I need to select?
Kudzai Manditereza: 51:06
Yes. So again, this is where if you’re greenfielding—now, if you’re thinking about to say, “I want to add so many PLCs, I want to add so many MES systems, and I want to use an open communication protocol between these systems,”—because, remember, data storage is one aspect of your architecture. There’s also the communication that needs to happen within the PLCs themselves. So, if you are greenfielding and you want to establish a communication mechanism between your components outside of storing the data, so, obviously, here, you want to use MQTT protocol because it’s the communication protocol that gives you the openness of communicating between multiple systems. And, really, that’s why I was kind of saying this is more of an apples to oranges comparison as it were, because you could still use MQTT with InfluxDB Edge and do the replication in combination while you kind of keep the messaging restricted to MQTT and the syncing to the cloud. So, I’m not sure if it’s giving you the pros and cons as it were, but, obviously, InfluxDB is not a messaging platform, right? So, you wouldn’t use it to communicate between different PLCs. MQTT is not a database platform, so you wouldn’t use it to store and replicate data.
Attendee #1: 52:43
Fair enough. Let’s park it there and allow some other questions.
Caitlin Croft: 52:48
And Hans, I’m happy to connect you with Kudzai over email afterwards if you have more follow-up questions. All right, so someone’s asked if you can share the JSON file for this demo. Kudzai, if you’re open to it, we can share that.
Kudzai Manditereza: 53:06
Yes, absolutely.
Caitlin Croft: 53:08
Okay. Cool. So, the answer is yes, we will. So, someone asked, “Do I have to launch Telegraf externally in order to push MQTT data from a broker, or can I launch Telegraf within my InfluxDB Cloud 3.0 instance?” And the answer to that is you can launch Telegraf internally if you are using the 3.0 cloud version. So, if you’re using InfluxDB 3.0 cloud, there shouldn’t be any issues with that. All right, let’s see. With this approach, HiveMQ Edge to HiveMQ Cloud, is the data stored locally in the HiveMQ broker during the connection outage periods, or is there a separate strategy or software that is needed?
Kudzai Manditereza: 54:12
So, HiveMQ Edge is MQTT 3.1.1 compliant, so it follows all of the specifications. And what I mean by that is that it allows you to, kind of like, use persistent sessions whereby you’re specifying that if that data is not consumed, it will be stored within HiveMQ Edge, on whatever platform you’re using to host it, right? But it’s also important to mention that this is dependent on the amount of resources that you’ve got, like how much data you can store until your network is restored. So, if you’ve got massive amounts of disk space, then you’re able to store a whole lot of messages using that persistent session. But if you’ve got a small amount of resources, then again, you won’t be able to store much. So, it’s all dependent on that, but definitely, it allows you to do that, as does any MQTT platform that is 3.1.1 or 5 compliant.
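The point that buffering during an outage is bounded by available resources can be illustrated with a small store-and-forward queue. This is a conceptual sketch only, not how HiveMQ Edge implements persistent sessions. The class name, the message-count limit, and the drop-oldest policy are all assumptions made for illustration; a real broker may instead persist to disk or reject new messages when full.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[str, str]  # (topic, payload)


class OfflineQueue:
    """Hold messages while the uplink is down, up to a fixed capacity."""

    def __init__(self, max_messages: int) -> None:
        # A deque with maxlen silently evicts the oldest entry when full.
        self._queue: Deque[Message] = deque(maxlen=max_messages)
        self.dropped = 0

    def enqueue(self, topic: str, payload: str) -> None:
        if len(self._queue) == self._queue.maxlen:
            # Capacity reached: the oldest message is about to be lost.
            self.dropped += 1
        self._queue.append((topic, payload))

    def drain(self) -> List[Message]:
        """Return buffered messages in order and empty the queue."""
        messages = list(self._queue)
        self._queue.clear()
        return messages


queue = OfflineQueue(max_messages=2)
for reading in ("20.1", "20.4", "20.9"):
    queue.enqueue("line1/filling_machine/temperature", reading)

print(queue.drain())
print(queue.dropped)
```

With more capacity the same outage loses nothing, which is the trade-off described above between available storage and how long an outage can be bridged.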
Caitlin Croft: 55:29
Kudzai, someone is asking you to point to the docs with the simulated edge devices. Is there a URL for that, or?
Kudzai Manditereza: 55:39
So, we’ll also provide those flows for you to download.
Caitlin Croft: 55:45
Perfect. Perfect. Yeah, people are asking if there’s a GitHub repo for trying HiveMQ to send data to InfluxDB. So, we’ll try to include as many links as possible for you guys afterwards to help you guys out. All right. Just going through a bunch of the questions here. I think we’ve answered most of them.
Caitlin Croft: 56:20
Someone just wanted to confirm, is the data from Node-RED, is it being pushed as a JSON object, and is this required?
Kudzai Manditereza: 56:30
So, the data from Node-RED is pushed as a JSON object, but this is not required. As I mentioned, InfluxDB is capable of consuming JSON but is also capable of consuming InfluxDB Line Protocol. So that could be pushed using InfluxDB Line Protocol into InfluxDB platform, so it’s not required.
Caitlin Croft: 56:54
All right. I think we’ve gone through most of the questions, Kudzai. I know there were a ton of them. I think there’s a few that we might have not gotten to, but please feel free to email me if you guys have any other follow-up questions. Happy to connect you with Kudzai. Really appreciate everyone joining today’s webinar. I know HiveMQ and InfluxDB is always a really popular topic and there’s always tons of questions. So, thank you everyone for joining today’s webinar. It will be available by tomorrow morning as well as the slides, and we’ll also make sure to include links to the resources that Kudzai mentioned. And I hope everyone has a great day. Thank you, everyone, for joining. And thank you so much Kudzai.
Kudzai Manditereza: 57:41
Thank you. Thank you so much for having me, Caitlin, and thanks everyone for joining.
Caitlin Croft: 57:45
Thank you. Bye. [/et_pb_toggle]
Kudzai Manditereza
Developer Advocate at HiveMQ, Founder at Industry40.tv
Kudzai is an experienced Technology Communicator and Electronic Engineer based in Germany. As a Developer Advocate, his goals include creating compelling content to help developers and architects adopt MQTT and HiveMQ for their IIoT projects. In addition to his primary job functions, Kudzai runs a popular YouTube channel and Podcast where he teaches and talks about IIoT and Smart Manufacturing technologies. He has since been recognized as one of the Top 100 global influential personas talking about Industry 4.0 online.