Improving Industrial Machine Support Using InfluxDB, Web SCADA, and AWS
Session date: Sep 27, 2022 08:00am (Pacific Time)
LBBC Technologies are the world’s leading designers and manufacturers of industrial autoclave technology. Aerospace customers use this equipment in the manufacture of high performance castings, like turbine blades. With hundreds of machines all over the world, LBBC are pushing the boundaries of the support they can offer customers. All LBBC equipment comes fitted with industrial gateways which simplify the data connections between industrial PLC controllers and web services — like AWS. This enables LBBC to offer their customers “Connected Support” and Web SCADA. Through their Connected Support software solution, LBBC are providing customers with advanced diagnosis tools used for troubleshooting and process optimization. Discover how they are using a time series platform to enable faster remote anomaly detection and quicker time to resolution.
Join this webinar as Andrew Smith dives into:
- The architecture that LBBC have chosen
- The role that InfluxDB plays [alongside other elements of LBBC’s IIoT infrastructure]
- The way in which industrial customers are using InfluxDB [to monitor equipment condition and provide advanced support services]
- An example of how the infrastructure is delivering valuable insights that are leading to competitive advantage
- InfluxDB tips and best practices (including the MQTT Native Collector)
Watch the Webinar
Watch the webinar “Improving Industrial Machine Support Using InfluxDB, Web SCADA, and AWS” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
Here is an unedited transcript of the webinar “Improving Industrial Machine Support Using InfluxDB, Web SCADA, and AWS”. This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcription errors.
Speakers:
- Caitlin Croft: Sr. Manager, Customer and Community Marketing, InfluxData
- Andrew Smith: Process Control Engineer
Caitlin Croft: 00:00:00.670 Hello everyone, and welcome to today’s webinar. My name is Caitlin Croft. I’m joined today by Bria on the InfluxData side and also Andrew from LBBC who will be talking about how he is using InfluxDB. So the session is being recorded, and it will be made available later today or tomorrow. So you can check out the recording as well as the slides. And if you have any questions for Andrew, don’t be shy, please post them in the Q&A in the chat and we will answer them at the end. And without further ado, I’m going to hand things off to Andrew.
Andrew Smith: 00:00:38.914 Excellent. Thanks Caitlin, and thanks to you all for turning up. This is new to me, and I’m really pleased that people have dropped in. Straight off the bat, I’d really like to plug — I’d love to connect with people who are trying to do the same kind of thing. I don’t think this should be a one-way thing. I would love to hear feedback of, “Hey, why did you do that?” Or, “We think that that was a really dumb idea. Check out how we’ve been doing that.” So yeah. I’d really like to connect with people in the space. Okay. I’ll make a start. So making the most of process data, and I’m really framing this around connected support. So a quick bio about me, which Caitlin’s asked me to do. My background is in control systems. I’m a chartered engineer. I started in process control in the mid ’90s, at a glass manufacturer. Then, after roughly a decade in that, water and wastewater treatment. Basically SCADA/PLC control, data mining, modeling, model-based control has been my thing. So I’ve been in and around data and manufacturing for a good while. 12 years overseas, involved with a tech startup, a bit more manufacturing and a bit more consultancy on how innovation is done in small companies. And then the last three years here. So I’ve focused, for a machine manufacturer, on the IoT use case and what we do with the process data, where we collect it, why we collect it, what we do with it. I have now transitioned from the role where I built that infrastructure, and I’m now the lead innovation engineer for LBBC. I live in Leeds in the UK, married with two kids.
Andrew Smith: 00:02:36.903 Okay. So connected support. I’m just going to quickly frame what I’m going to be talking through. I’ll talk a little bit about our machines and the type of processes they represent; why we wanted to have connected support and the value in that; where the data comes from, because for this, maybe there’ll be some commonalities across industrial IoT and some things that are different; typical users, because before we went and looked for systems, we had a use case in mind; typical scenarios, so what goes on between our customers and us that needs these kinds of infrastructures; listing what kind of requirements we had before we even went shopping for solutions; the architecture we ended up with; some geeky stuff on the actual nitty-gritty of data and some Flux code; and then looking at the role that Influx plays in actual use. Then I’m going to give some examples of troubleshooting and a neat example on data mining and processing. So we’ll get started. Okay. Investment casting, that’s what LBBC do. And we produce the equipment globally that does that. We supply to big players, so Rolls-Royce, Pratt & Whitney, GE. And investment casting is basically the process of forming some kind of casting. We’ll use the example of turbine blades, and that’s from a wax pattern, a ceramic core and a ceramic shell. So basically, you’re going to end up with a turbine blade that has an outer shape but also has an inner shape. And so we produce machines that remove wax from the pattern, and we also have machines that actually remove the core from within, for example, a turbine blade after it’s been cast.
Andrew Smith: 00:04:44.592 Okay. And this was taken this morning. This is at our workshop. This is the window behind me. So all these machines are going to places like — what do we have here? Russia, China, and the US, I think. Okay. So what were we trying to do with connected support? You see, because we supply these machines all over the world, and typically, as a machine supplier, once they’ve been commissioned and tested here, they’re shipped out and they’re commissioned on a customer’s site. And the less we hear about it, the better. But we provide warranties on all these machines. So for a year, at least, it’s definitely our problem. And during that period, we really want to hear what customers hear. So if an alarm goes off, we want to know about it. We don’t want that to be translated through email. So we want to take those real-time alerts beyond what we call the foundry fence. That’s the foundry of our customers. We want to see what they’re seeing. So if a valve is not opening, we want to have that represented pictorially. We want to see what the machine has been doing, how it’s been performing, for the entire life of the machine. Was it doing this two years ago? We want some kind of way of doing advanced data processing. So not just basic limits on is the temperature too high or too low, but have we ever seen these combinations of 10 variables in this 10-dimensional space before? Has the machine ever done this before? And then, in a similar line, because they’re not exactly the same, data-driven condition monitoring, so spotting imperceptible changes. For example, if a motor in step 46 is consuming 5% more power than it ever has done before, then we are looking at a bearing degradation. So we want to be comparing data now with data, say, two or three years ago.
Andrew Smith: 00:07:00.733 Okay. So just a bit of detail about some things that we can’t change, which is where the data comes from. This is one model that we have, an LC 450 core leacher. And key to a lot of this, which will be the same across a lot of industrial settings, is that these things are controlled by industrial PLCs from the same three, four, five manufacturers of PLC. And in those PLCs, we have some specific parameters that we’re interested in. So in this case, there’s some heaters, there’s a bunch of valves, some pumps. There’s some states of the machine, what cycles it’s in, what phases it’s in, some safety stuff about what condition the lid is in, some pressures, some temperatures, and some levels in tanks. And any process that is controlled by an industrial PLC will have similar parameters, some of them Booleans, floats, and integers. And we’re stuck with that. So in terms of our needs and requirements, the first thing is cost: to be able to ingest, query and store data for the life of a machine. If these things are spitting out, say, between 20 and 50 variables a second, and we’re wanting to store that for the entire lifetime of the machine and maybe compress some of it, we need to know that we can do that. We need to have standard hardware between PLC and cloud. A lot of manufacturers will be very concerned that we’re connecting these things to the web at all. And so we need to be able to use standard hardware that works across different PLC types. We produce a lot of equipment with two or three different brands of PLC. We don’t want to keep changing hardware, so we want to have standard components. Security is a big one because we are now holding our customers’ data. They want to know that their competitors can’t tap into that or steal it.
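The data-volume requirement above can be sanity-checked with some back-of-envelope arithmetic. The per-second rate, the 10-year machine life, and the 20-machine fleet size are assumptions drawn loosely from figures mentioned in the talk, not LBBC's actual numbers:

```python
# Rough point-count estimate for lifetime storage of PLC data
# (all figures are illustrative assumptions, not LBBC's real numbers).
vars_per_sec = 50                       # upper end of "20 to 50 variables a second"
seconds_per_year = 60 * 60 * 24 * 365   # ignoring leap years for a rough estimate
machine_life_years = 10
fleet_size = 20                         # "20 or so live machines"

points_per_machine = vars_per_sec * seconds_per_year * machine_life_years
fleet_points = points_per_machine * fleet_size

print(f"{points_per_machine:,} points per machine")   # 15,768,000,000
print(f"{fleet_points:,} points across the fleet")    # 315,360,000,000
```

Even at the low end, the database has to cost out sensibly at tens of billions of points, which is why ingest and storage pricing was the first filter applied to candidate products.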
Andrew Smith: 00:09:23.224 Not losing data. So if various components of our cloud infrastructure are down, we need to be able to store data here so that we can forward it to there when systems are back up. And we need to be able to guarantee that data did actually arrive and have receipts for all of our data. Serverless. This is maybe a bit of a thing of mine. I don’t really want to own anything that’s physical in the building. Basically, there shouldn’t be a rack with a PC in it and a fan that we have to take care of here. The entire thing should be serverless, and I don’t even want an EC2 somewhere on an Amazon server. We need to be able to record data for the lifetime of the machine, visualize it, and explore it, even if you’re not an expert; have live dashboards that basically tick along as valves open and close. Our service guys can see those live and get alerts to our mobile app and also SMS. Okay. And then be able to process data in complex ways at will. Okay. So if I talk about some typical users — I haven’t got to the Influx bit, but we need to understand who is using this. So maintenance engineers: they will typically be the people that customers will contact first of all. Maintenance engineers may not have any software experience and have limited interest in PCs or software or web apps at all, but they are responsible for finding and fixing issues. Customers don’t know what’s wrong. They just want you to fix it quick. We might have process experts. They might be specialists on how the machine runs. Again, they’re not software people. They don’t love software. They love the process, and they’re obsessed about improving it. And they often get called on for particularly challenging issues that are represented by some kind of data. Managers: just summaries, performance metrics. And your process data people. This varies from industry to industry. So in the glass industry, you might have people who were just obsessed about how this entire factory was running.
The kind of people that talk about Six Sigma. Again, not necessarily programmers. And they’re wanting to create statistical queries from data. And then you’ve got your database and software people, who may have specialist experience.
Andrew Smith: 00:12:10.059 So a typical scenario would be a customer rings us up and says, “The machine is behaving really strange. It’s not working. Fix it.” Of course, this happens so very rarely. And then it’s our maintenance engineers who either have to fly out somewhere or, now, thanks to the infrastructure we have, have the tools and capabilities to solve a lot of problems. So a maintenance engineer will want to visualize various parameters together. So it might be some temperatures, some pressures, some valves, some pumps, and then troubleshoot what happened and when, and then draw some conclusions and then be able to communicate that back to the customer. Because sometimes it’s the customer who has done something wrong, not maintaining something, or the customer is misunderstanding a use. In some cases, that may go to a process expert, where a maintenance engineer is seeing something on a machine and having to call on a process expert and say, “I’m seeing this,” and they need to be able to communicate what this is in terms of data and then ask questions like: has it always happened? Does it happen on other machines? Which other machines does it happen on? And so a process expert will then want to be able to look at multiple machines for the same parameter and run queries and design some kind of analysis. Okay. So I’ll skip through to another case, which I will talk about later, which is where we discover hidden patterns in data that represent a fantastic opportunity — that we can tell a customer something that they cannot know in any other way, and we didn’t even between us know we had the data to do that. So that’s a neat use case that I’ll talk about later on.
Andrew Smith: 00:14:10.314 I would say that this is the requirements slide again, but with the things set in purple that we think Influx fulfills, and the other things are fulfilled by the way in which Influx sits in the entire infrastructure. And of course, most of you are thinking, “Okay. What does this infrastructure look like?” Okay. So now we get round to what we ended up with. Even though this is a summary of what it actually is, we’ll consider it piece by piece. So the first is that we discovered, after looking at a lot of different gateways that connect reliably to industrial equipment, that far and away, everything speaks MQTT and OPC UA. [inaudible] the UA. And that’s the way in which our data actually originates from the customer’s plant. And before I go too much further, this same infrastructure handles all of our machines that are out there. I think currently we have 20 or so live machines. So in this, Amazon is the MQTT broker, and yes, we’re aware that Influx also has this functionality now. We’ll come to that later, but we use Amazon as our MQTT broker, with good reason. MQTT messages are then handled — they’re sliced and diced — by Lambda functions, if you’re familiar with Amazon Lambda. These are little pieces of code that wake up when a message arrives. And Amazon will scale that. The faster messages arrive, the more Lambdas it will spin up. And it’s the job of those Lambda functions to post data into our Influx database and to update our live dashboard. So at this point, data is channeled into two different tools for different purposes: Influx for recording history and a dashboard for recording the live update of what’s happening now. Any failed data within this infrastructure is queued in DynamoDB, because that’s what Amazon provide within their infrastructure that’s available on a serverless basis. And when connections become available, then that data will be emptied out from DynamoDB and sent to where it was supposed to be headed.
We built this at a later stage, after we’d had some connectivity issues. And then meanwhile, this whole code is monitored and we’re alerted, and it’s also logged in Influx if the Amazon infrastructure is having any problems.
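The handler shape described above — a Lambda that posts to the InfluxDB write API and falls back to a queue when the write fails — can be sketched as follows. The URL, token, and fallback callable are placeholders, not LBBC's actual setup:

```python
import time
import urllib.request

def post_line_protocol(lines, queue_failed,
                       write_url="https://example.invalid/api/v2/write",
                       token="REPLACE_ME"):
    """Try to write line-protocol records to the InfluxDB v2 write API.
    On any failure, hand the payload to a fallback callable (in the talk's
    architecture, something that queues it in DynamoDB for later replay).
    write_url and token here are placeholders."""
    body = "\n".join(lines).encode("utf-8")
    req = urllib.request.Request(
        write_url,
        data=body,
        headers={"Authorization": f"Token {token}"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 204  # the v2 write API returns 204 No Content
    except Exception:
        # Store-and-forward: keep the data so nothing is lost while the
        # write endpoint is down; a separate job empties the queue later.
        queue_failed({"ts": time.time(), "lines": lines})
        return False
```

In a real Lambda the fallback would be a `put_item` on a DynamoDB table, and a scheduled function would drain that table once the endpoint responds again, which is what gives the "receipts for all of our data" guarantee mentioned earlier.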
Andrew Smith: 00:17:17.344 Now, we do actually use a third-party software called Servant for our live alerts and live dashboarding, because we can provide the customer with their own view of their own machine. So we don’t use Influx for everything. We use a dashboarding software for what that is good at and Influx for what it’s good at. So I hope that gives an idea about the chosen infrastructure. We tried a lot of different things and considered a lot of different things. Okay. So what do the messages look like? And I introduce this partly because, when you’re using a standard industrial gateway, you will typically not have complete control over what format these messages arrive in. All these manufacturers have their own opinions about what shape messages should be and what format timestamps should be in, for example, and whether they come with time zones, which for us is a big issue because our machines are all over the world. So initially, it comes as a very cumbersome list of PLC tags. So that’s a PLC tag name, a value, which is sometimes a float and sometimes a Boolean, and a very long-winded timestamp. But what you notice here is that a lot of these timestamps are identical. So in this second, a lot of things happened at the same time. In fact, all this stuff happens in the same second. Okay. So after it gets to the broker — and this connection to Amazon, these messages are free. You don’t pay for ingested messages, which is a neat way of doing it. So at this stage, the first processing is to attach a client ID to the message.
Andrew Smith: 00:19:26.515 And next, these messages are basically sorted into a single timestamp in the — let me just — Amazon [inaudible] site. Okay. So it’s a single timestamp that is in the format that you’ll be familiar with for InfluxDB, and then a list of all the things that happened in that timestamp, which is much more readable. And the reason it ends up as this first is because the next step is the line items that get posted into Influx. And so there’s a very easy translation between the two. And so basically, that very long, cumbersome message gets collapsed down into basically five lines that get posted into Influx and return within 250 to 350 milliseconds. Okay. As a point of interest, the dashboarding system wanted data in a completely different format, which is why we had to have the flexibility of Amazon to be able to interface and tie these things together. And I actually think that will probably be the case for pretty much anybody designing an industrial infrastructure for IoT. I think you will need that flexibility. Okay. So with this infrastructure, what is the connected support procedure? We have a dashboard with an app. A customer is looking at a dashboard that represents just their machine, and we have the ability to look at all machines worldwide and see all errors that are happening worldwide and prioritize those. So LBBC and customers will be notified of the same alert at the same time. I don’t know if I have any on my phone right now. No. All good right now. But we would be able to then bring up the dashboard for that machine, have a brief look at what’s been happening with the alerts, and then somebody might jump on that dashboard initially to see if there’s an obvious solution. And I would say that most issues can be resolved at that stage.
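The collapsing step described above — many per-tag gateway readings sharing a timestamp, folded into one line-protocol record per timestamp — might look roughly like this. The tag names and the use of the client ID as the measurement are illustrative assumptions, not LBBC's actual schema:

```python
from collections import defaultdict

def collapse_to_line_protocol(client_id, raw_points):
    """Group per-tag PLC readings that share a timestamp and emit one
    InfluxDB line-protocol record per timestamp.

    raw_points: iterable of (tag_name, value, epoch_ms) tuples, as an
    industrial gateway might deliver them (names are illustrative)."""
    by_ts = defaultdict(dict)
    for tag, value, ts_ms in raw_points:
        by_ts[ts_ms][tag] = value

    lines = []
    for ts_ms in sorted(by_ts):
        # Booleans become line-protocol true/false; numbers pass through.
        fields = ",".join(
            f"{tag}={str(val).lower() if isinstance(val, bool) else val}"
            for tag, val in sorted(by_ts[ts_ms].items())
        )
        # measurement = machine/client id, one record per timestamp
        lines.append(f"{client_id} {fields} {ts_ms}")
    return lines
```

So a burst of readings that all landed in the same second collapses into a handful of lines, which is why the cumbersome gateway message ends up as "basically five lines" at the write step.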
If it’s something really unknown, then we start digging in the Influx database, and that’s when it comes into its own, because we can use Influx to construct any detailed analysis of what’s happening and what caused a particular event or whether we can see patterns or whether it’s happened before, whether it happens across machines. So that’s when we start looking into the Influx data.
Andrew Smith: 00:22:28.297 So what’s the value of Influx to us, and why did we particularly choose Influx? What other things did we have a look at? So I’ve decided to frame this as things that InfluxDB delivers. So the vertical axis is how well it delivers it. And the horizontal axis is value for the IoT use case, because we kind of realized halfway through this process that Influx hasn’t been designed specifically for industrial IoT. So the things that we value for the IoT use case: low cost. I would say it’s very, very cheap to store a lot of data and ingest a lot of data and to process a lot of data. Visualizing data was one of the first things that also, I think, blew us away — a database that comes with visualizations already built in. So we don’t use an Influx database somewhere with Grafana looking at it. We are using the cloud infrastructure, which knits it together as one thing, and our engineers use it as one thing. It is installed as one thing. It ingests as one thing. Processing data and the ability to write complex queries is something that Influx does very, very well. It’s possibly less of a thing that we use day to day. And yet, when you’re trying to pick out some particular issues, being able to do that is very, very valuable. Same with automation. I think we, as in the IoT use case — clearly a performant UI is really, really valuable. I think Influx delivers this pretty well, as it does the ability to share insights. So it’s critical to our use case for an engineer to be able to chase down some data and then send a single link to another colleague, and when they get that link, they’re looking at the same data. That’s really critical. It’s critical between us within the company. It’s also critical between us and customers. I’ll come to some things that we feel Influx doesn’t do so well, but it does that moderately well. Security. I’ll come onto this later. It’s something that we value very highly.
Yeah. We’ll say more about that later.
Andrew Smith: 00:25:22.155 Okay. What other products did we consider? The Influx team really encouraged me to get into the why. Why did we choose Influx? So I can remember, probably about three years ago, two and a half years ago, I tried to forecast, if we were going to have all of our machines going out the door online, month on month, what kind of data size would this database grow to, and what kind of write rate in terms of gigabytes per month would we be performing? And on the basis of that, we looked at what it would cost to ingest that data and store that data over a period of time. And these were very, very rough and ready. There were some off-the-shelf products which basically take away all your industrial IoT worries and charge you an arm and a leg. There were a selection of build-it-yourself options. And in there, right at the bottom here, is InfluxDB. I would say at the time InfluxDB was the second cheapest way to do this, the cheapest being AWS Timestream, which we discovered, after poking around, wasn’t actually even available. It was shouted about everywhere, but you couldn’t actually buy an instance. And so at the time, we knew that this was going to be way cheaper than any other solution. So we started prototyping and testing and found that the benefit was not only price but also performance and visualization. Okay. So I want to talk about a data exploration example, because I’ve tried to — it’s very difficult in a quick webinar to talk about what our data looks like and what we use it for. Everyone in industry for the last 20 years has been collecting data in massive SQL databases, and we all know that it’s not what you collect, it’s what you do with it that’s really important. So I want to really spend a good chunk of time talking about a new use case.
Andrew Smith: 00:27:40.169 Okay. So I’ve already introduced a core leacher that removes a ceramic core from within a turbine blade after it has been cast. And if I give some background to this machine: the machine uses hydroxide, potassium hydroxide. It’s a dangerous and costly chemical, and they use the same tank of hydroxide month on month. And as it dissolves more and more core, which is a silica material, it creates silicates, and the hydroxide gets depleted. Now, both the silicates that are remaining and the hydroxide are high pH. If you actually measure pH, it’s always 14 or 15. But measuring how much hydroxide is left has a real customer value. I’ll come to that in a minute, because if you knew when your hydroxide was going to run out, or you knew you were running with 50% hydroxide, you would massively be able to reduce quality defects. And so it’s a value to customers, and no one can do this. So a little bit of insight that we combined with that background is that hydroxide has a very low vapor pressure, and that if you’ve not got much hydroxide, when you are pressurizing potassium hydroxide, you get a slightly higher vapor pressure, because the more hydroxide you have, the lower the vapor pressure is. And the other thing we had was loads of data, because this was one of the first machines that we connected up. We had about 18 months of data, during which they had been just running this machine daily and exhausting the potassium hydroxide. We had pressures, temperatures, valves, pumps, everything. And it was a very rich data set. It included operational conditions with fresh hydroxide, spent hydroxide, and everything in between.
Andrew Smith: 00:29:52.225 Okay. So the challenge was: can we estimate hydroxide strength purely from data, and can we predict when the hydroxide for this customer is going to be exhausted? Now, bear in mind, this is a challenge that’s difficult even if we were to fit the machine with lots of instruments to measure different things. We discovered the challenge after we had just been collecting data from this thing for a year, and we were going to try and do it without any instruments at all, purely from analyzing data. Okay. So to start getting into the nitty-gritty of what Influx can do and what we use it for in terms of processing, I’ll introduce you to cycle number 1247, which it looks like was done back at the end of 2021. And so along the bottom here, we have a valve opening and closing, or a couple of valves that [inaudible] things. Yeah. We have a pressure that’s ramping up to, as it happens, nine bar, and then it relaxes down to about two bar, and that is the core leaching cycle. Now, I’m only interested in the place where potassium hydroxide generates its own pressure, so I’m only interested in certain data points in this data. So we have to isolate those. And at the beginning, this data is very noisy. So we now have to draw a best-fit line through all this data and extrapolate back to the start of the cycle and then decide, “Right. We’re going to give the vapor pressure for that cycle a value of 2.4 or something.”
Andrew Smith: 00:31:44.148 Okay. So if I just talk briefly about the Flux code. So remember, we’re trying to isolate data points, draw a best-fit line, and come up with a single value. So first, I’m selecting parameters of interest; pivot and fill, which is probably one of the most common things we use for working out what things happened together — I think pivot and fill should be a single function; the computation of some auxiliary filter variables to help us work out whether we need this data or not; and then a filtering process; and then, on this page, setting up the regression of all the points, computing some terms to compute this regression analysis, calculation of the slope and intercept of the line, and then, I think, computing a final value and preparing it for some kind of display. Okay, now we’re going to do that with a lot of data, because this was a year’s worth of data. And actually, we had 12 months of data, 150 cycles, 1.3 million points, and that’s only pressure points. I mean, the database had stored some 200 variables, so we haven’t even started to dig into that stuff yet. But we have dug into the single variable pressure and a couple of valves opening and closing. Now, that actually is done by Influx in 10 seconds. And I don’t know if this will work if I just jump out here. So that’s performing that analysis. Again, there’s no need, because it’s actually stored somewhere anyway. There you go. I think it got to 10.5 seconds.
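The per-cycle computation described above — fit a least-squares line to the isolated pressure points and extrapolate back to the start of the cycle — can be sketched outside Flux as plain arithmetic. The function name and the idea that the points arrive pre-filtered are assumptions for illustration; the actual pipeline does the filtering in Flux:

```python
def cycle_vapor_pressure(points, t_start):
    """Estimate a single vapor-pressure value for one leaching cycle by
    fitting a least-squares line to (time, pressure) points and
    extrapolating back to the start of the cycle.

    points: list of (t_seconds, pressure_bar) pairs, already filtered
    down to the window where the hydroxide generates its own pressure."""
    n = len(points)
    sum_t = sum(t for t, _ in points)
    sum_p = sum(p for _, p in points)
    sum_tt = sum(t * t for t, _ in points)
    sum_tp = sum(t * p for t, p in points)

    # Standard least-squares slope and intercept.
    slope = (n * sum_tp - sum_t * sum_p) / (n * sum_tt - sum_t ** 2)
    intercept = (sum_p - slope * sum_t) / n

    # Extrapolate the fitted line back to the cycle start time.
    return intercept + slope * t_start
```

The Flux version does the same running sums with `reduce()` over each cycle's table, which is why 150 cycles and 1.3 million points can still come back as one number per cycle in about 10 seconds.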
Andrew Smith: 00:34:03.208 So this is the computation of a vapor pressure, one vapor pressure for every cycle. And stuck in here somewhere is the point that I originally showed you. And we can start to see this pattern emerging, which no one had ever seen before. So here’s that same data. That is cycle number 1247, but we now see that it exists as part of what is a bit of a bigger pattern. And here’s the way the pattern went. Vapor pressure increases over months and then suddenly drops to almost nothing, and then it increases over another couple of months and then drops. So here’s how we interpret that data. We contacted the customer around about the time we saw this first drop, and we asked them, “So what’s happening with your hydroxide?” And they said, “Oh, we replaced it.” So we were comfortable that we were now making some sense of some data that had been hidden. We then let them run for the next six months or so and watched the same pattern happening. And then round about the time that I kind of thought, “Well, last time it got this high, they changed it, so maybe I’ll notify them,” I emailed the customer and I said, “I think your hydroxide needs changing, because we’ve constructed some data analysis that seems to indicate that your hydroxide is spent.” And they were very surprised, quite impressed, and they came back with, “Well, yeah. Your timing seems right on cue. We did actually think it was close to spent, and we’ve seen some performance issues. We trust your data.” And so they did actually change their hydroxide. And this parameter dropped down to next to nothing again. And we now will be able to tell them — actually, this slide was done some time ago, but we’re now close to being able to tell them when they’ll need to order this potassium hydroxide ahead of time and also what strength they’re running with right now. So I hope that gives a little bit of a flavor of the power, what we’re using it for, and the value that we’re getting out of that.
That is now becoming an R&D project to deliver a metric to a customer that none of our competitors can do right now.
Andrew Smith: 00:36:47.397 Okay. Some learning. And this comes with a caveat. This is how we see it only. It’s an opinion, really. Very early on, I think I was reading documentation about how to get data in, and I kept hearing about this Telegraf, Telegraf, Telegraf. And I would suggest against it for industrial IoT. I would say that Amazon Lambda and the Influx API is a better solution. And I’ll qualify that. Telegraf needs a host. I don’t know if it’s just me that’s allergic to having any host that I have to maintain, but again, I didn’t want an EC2 to be able to host Telegraf. And the hardware that sits on the machine couldn’t host Telegraf either. Amazon is serverless, and the Amazon infrastructure that we have costs about £10 a month for millions of messages from all of our machines. In fact, we haven’t actually even broken the £10 a month barrier. We’ve only barely broken $10 a month. Okay. Amazon scales with data rates. So the more data we were to throw at it, it would just start up parallel instances as necessary. And the Influx database seems to be able to cope with everything that we can throw at it. I think there’s possibly a lot of people who will be thinking, “Well, all that Amazon stuff was unnecessary, because now Influx have their own MQTT ingest and that’s designed for IoT. Why aren’t you using that?” We have thought about whether we should port our solution to that, and we’ve decided that actually Amazon Lambda and Influx is still a better solution. And I’ll qualify that again. This will need a lot more — maybe this will be a topic of discussion afterwards, but this happened to us. If Influx is down, there’s no one answering those MQTT requests, and you’ll need more advanced store-and-forward capacity than MQTT offers. MQTT as a protocol offers some store-and-forward functionality. But if something is down for seven hours, MQTT is not going to store that data magically for you. And then when it wakes up, you will lose data.
And we have lost data. We’ve proved that.
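For readers who want to picture the Lambda-plus-API pattern being described, here is a minimal, hypothetical sketch of a forwarder: an AWS IoT rule invokes the Lambda with the MQTT payload, and the handler writes it to InfluxDB’s v2 write endpoint as line protocol. All names here (bucket, org, field names, environment variables) are invented for illustration; this is not LBBC’s actual code.

```python
import os
import urllib.request

# Hypothetical configuration; in practice these would come from Lambda
# environment variables or a secrets store.
INFLUX_URL = os.environ.get("INFLUX_URL", "https://example.cloud2.influxdata.com")
INFLUX_TOKEN = os.environ.get("INFLUX_TOKEN", "example-token")

def to_line_protocol(machine, fields, ts_ns):
    """Build one line-protocol record with the machine name as the measurement."""
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{machine} {field_str} {ts_ns}"

def handler(event, context):
    # The IoT rule passes the decoded MQTT payload through as the event,
    # e.g. {"serial": "AC450-0021", "fields": {...}, "ts_ns": ...}.
    line = to_line_protocol(event["serial"], event["fields"], event["ts_ns"])
    req = urllib.request.Request(
        f"{INFLUX_URL}/api/v2/write?org=example-org&bucket=machines&precision=ns",
        data=line.encode(),
        headers={"Authorization": f"Token {INFLUX_TOKEN}"},
        method="POST",
    )
    # A real forwarder would retry or dead-letter on failure so that an
    # Influx outage does not lose data, which is the point made above.
    with urllib.request.urlopen(req) as resp:
        return {"status": resp.status}
```

Because the Lambda sits between the broker and the database, the same event can also be fanned out to other endpoints, which is the second argument made below.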
Andrew Smith: 00:39:32.003 And the second reason is that I am fairly sure that most industrial IoT solutions will need data to go to multiple endpoints, not just Influx, but to other things as well. It may be a factory MES system. It may be to dashboarding. It may be to apps. It may be to integrate with other systems. And for all those things, you need to verify it got there. So in those cases, a direct MQTT connection from an industrial gateway straight to your Influx database won’t help, because you need it to go to multiple places, not just one. The second thing, which has always confused us, and it took me a while to pick my way around this: for the industrial IoT use case, I believe that measurement, the measurement tag, should be used as your host name or the equipment name or the equipment type. I’ll qualify that. I don’t believe that measurement as a data type makes any sense for IoT. I’ll qualify that as well. Equipment name is a much more natural primary thing that you start looking for. If you get a call from your customer and you’re in, say, a sewage treatment works and they say, “Aeration lane seven is having a problem,” you will want to look at aeration lane seven. You won’t primarily decide, “Okay, I want some temperatures or pressures or valves.” So you’ll decide on area first, and Influx forces you to, in the way that the dashboard is designed: it forces you to pick a measurement as the first thing that you choose. So if measurement is your host name or equipment name, it works far better.
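To illustrate that schema advice, here is a hypothetical line-protocol comparison (the names are invented for the example). With measurement-as-data-type, one piece of equipment is scattered across measurements; with measurement-as-equipment-name, the equipment is the first thing you select, as it would be on a support call:

```
# measurement = data type: "aeration_lane_7" is buried in a tag
temperature,equipment=aeration_lane_7 value=18.2
pressure,equipment=aeration_lane_7 value=1.4

# measurement = equipment name: pick the equipment first,
# then its signals arrive together as fields
aeration_lane_7 temperature=18.2,pressure=1.4
```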
Andrew Smith: 00:41:37.788 Yeah. And the data explorer doesn’t allow you to explore multiple measurements. So for example, a control engineer will naturally think in terms of control loops, PID loops, or whatever. You’ve typically got, for example, some temperatures or a pressure or an on/off thing, and you want to see them together. That is what a loop is. They’ve got all different dimensions, whereas if you have filed all your data with measurement as the data type, you won’t be able to actually explore those together. So I’m very clear on this: for the IoT use case, measurement should never be used as the unit or the measurement type. Okay. So InfluxDB has been a game changer for us in terms of being able to visualize, process, and explore our data. There are some things that I believe caused friction as we tried to use it for the IoT use case, and I think they may cause the same friction for other people as well, so be aware of them. It took a lot of effort to put together an Amazon infrastructure of Lambda and the API. And as I was doing that, I was thinking, “I want to use this product, and I want to get my data there. I have to design this thing, and I wish that Influx could just sell me one of these things.” So if only Influx gave users a pre-prepared, fully functional cloud integration, I think that would have been great. Influx doesn’t allow us to restrict a login’s access to only a subset of data. So that means we can’t respond if one of our customers says, “Hey, we really love what you’re doing with data processing. We actually have our own process experts that want to learn about that machine and optimize it. We’d like our own login to our own data on Influx.” Well, the problem with the way Influx currently works is that users are all owners. They can all view everything and they can all delete everything.
And it would be great if there was a view-only role, as there is in almost any other type of software, where you have an admin and somebody who can just view. That would be great.
Andrew Smith: 00:44:16.135 We love that you can share views into data with links. In these links, the timeframe start and stop, and any variables, are actually encoded in the URL. That means with a single link, I can email a colleague, and they can go to a dashboard and see what I’m looking at, and we can explore it in that way. It doesn’t work for notebooks, though, which are otherwise a great feature. And neither do variables, actually. For the IoT use case, I think those are things that people will trip over. Yeah. I think the UI delivers most of what the industrial IoT or process control use case demands. There are some notable exceptions that we repeatedly keep bumping against, and I think as other people use it, they’re probably [finding?] the same things. If you’re using Influx for troubleshooting and exploring data, the way it displays Boolean values is an issue that, I think, particularly affects industrial IoT. Synchronizing of multiple charts: clearly, if you’ve got a dashboard with four charts on it, some are pressures, some are temperatures, some are tank levels, and some are Booleans. If you zoom in on one of them and say, “Hey, what’s going on here?”, it’d be really neat if they all zoomed, because clearly, you’re interested in something, and so you’re interested in that timeframe for all the charts you have. So being able to synchronize the timeframe when you zoom or pan would help. And XY plots need a bit of Flux code to actually implement. You can’t click, click, click, click and end up with a plot of temperature against pressure, for example, or pumping efficiency against level. And these are the typical things that a process control engineer or a systems engineer or a process engineer will be trying to use Influx for.
Andrew Smith: 00:46:21.156 So I think there are some mismatches, but in general, it has been a game changer for us. The final thing is about security. Username and password is pretty basic. I think industrial IoT users are probably going to need to protect their data with multi-factor authentication. We would feel far more comfortable being able to tell a customer, “Your data is safe with us,” if we know that no one can actually view their data without having a code on their phone. But I do want to leave on a positive, just remembering that impressive ability of InfluxDB to run that 100 lines of code over 150 cycles, 1.3 million data points, in less than 10 seconds. Incidentally, it used to take something like a minute, and I’ve seen that rapidly come down over the period that we’ve been looking at this use case. It literally used to time out because it was bordering on more than Influx could take. It then came down to 30 seconds, and then probably about four or six months ago, it dropped again. So something great is going on in the background to make Influx leaner and meaner in terms of its number-crunching ability. And we’re very pleased to have chosen a product that is being actively developed. So with that, that’s all I’ve got. I’m looking forward to what happens next.
Caitlin Croft: 00:48:08.940 Awesome, Andrew. Well, you’re in luck. There’s tons of questions, so we’ll jump right into them.
Andrew Smith: 00:48:15.999 Okay.
Caitlin Croft: 00:48:17.209 So the first one is your photo at the very beginning, and you kind of talked about working with Russia. So one of the questions is, how has the war in Ukraine impacted that?
Andrew Smith: 00:48:30.232 Well, I don’t know whether the question is aimed at people buying machines from us or at our ability to sell Connected Support. I’ll answer both. We sell very high-priced capital equipment, so these things are planned years and years in advance. I think COVID had an impact on us, but it was very moderate, and probably what’s happening now is that all the orders that were not pushed through during COVID are coming through, so we’re catching up with ourselves. The Russia question, if it’s about our ability to sell Connected Support in Russia: yeah, there are some places where it’s difficult. For Russia and China, all of our kit goes out with the capability, but some customers ask for the actual hardware to be removed. Some of them just say, “There’s no way I’m plugging that in.” Does that answer the question?
Caitlin Croft: 00:49:49.311 I think so. What are some acceptable compression techniques for storing data during the whole life cycle of a machine? Does this assume keeping only summary data for some periods of time, like a day?
Andrew Smith: 00:50:09.389 This is a great, great, great question. So initially, when we were trying to cost the whole thing, I think we were looking at how much data we would need to compress. I can’t find it in my presentation. Here we go. Yeah. We were forecasting data volumes, and at the very early stages, we were thinking: “Wow, if it’s going to be that expensive, then we really need to focus on compression.” And so we were looking at all kinds of different algorithms, some of which really interest us still. And I have a ticket out for developing one of them, which is RDP. I’ll come back to that. What we found is that I’ve got 601 things to try and focus on in terms of improving the infrastructure as we have it. Because of the bills that we’re getting, I actually don’t care about compression, because it’s not currently causing us a problem. It’s not currently a significant cost, and so, yeah, store everything. However, that won’t go on forever, and eventually, we’re going to be paying much bigger bills to store all this data. At that time, I’ll be looking for compression algorithms. Now, the compression algorithms that are offered natively by Influx don’t do it for us at all. For example, imagine a compression algorithm looking at a valve that opened and closed repeatedly. You don’t really want it averaged, because that loses all the useful information you had about whether a fault was coincident with the valve opening and closing. The same goes for things like pressures or temperatures, because typically, from a process control point of view, when things are flatlining, I don’t care. I am not interested in places where process state variables are smooth. I’m interested in spikes and rapid points of change.
Andrew Smith: 00:52:46.131 And so we looked at various different algorithms, and the one that we really like is RDP, the Ramer-Douglas-Peucker algorithm. It’s basically an algorithm that captures all the peaks and troughs in your data and yet compresses it. So it doesn’t do anything nasty like averaging or windowing. We don’t like windowing. Unless it’s being used for a statistical purpose, we don’t believe it’s a useful technique for compressing process data. There are other algorithms that we’ve considered that aren’t as good and are generally suited to compressing data as it arrives, such as the swinging door algorithm, but in general, I’m very much looking forward to the release, if it’s going to happen, of the RDP algorithm. You could maybe have a look at that on GitHub. I think if you search for RDP Influx on GitHub, you will probably hit on that idea. I believe it’s in the works.
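As a rough illustration of why RDP appeals for process data, here is a minimal sketch of the textbook Ramer-Douglas-Peucker recursion (not the implementation being waited on): points further than a tolerance from the chord between the endpoints are kept, so spikes survive while flat stretches collapse to their two endpoints.

```python
import math

def _perp_distance(pt, start, end):
    """Perpendicular distance from pt to the line through start and end."""
    (x0, y0), (x1, y1), (x2, y2) = pt, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x0 - x1, y0 - y1)
    return abs(dy * x0 - dx * y0 + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def rdp(points, epsilon):
    """Compress a list of (t, value) points, keeping peaks and troughs."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_distance(points[i], start, end)
        if d > dmax:
            idx, dmax = i, d
    if dmax <= epsilon:
        # Everything between the endpoints is "flat enough" to drop.
        return [start, end]
    # Keep the farthest point (a peak or trough) and recurse on both halves.
    left = rdp(points[: idx + 1], epsilon)
    right = rdp(points[idx:], epsilon)
    return left[:-1] + right
```

Note there is no averaging or windowing anywhere: a flat run becomes two points, while a one-sample valve spike is always retained.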
Caitlin Croft: 00:54:06.331 Yeah. I will say that was a very good overview of what you guys are doing and of best practices around IoT data. We do see that as a common trend when people start out using InfluxDB: you often start out collecting a lot more data, and then as you understand that data better, you don’t have to collect it quite as often, because you know where those valleys are. And then, of course, eventually you want to downsample or whatever else. But just to add to what Alice was asking and what you answered: when you start out, you’re going to want to collect at much higher granularity than you might a year down the road on your project.
Andrew Smith: 00:54:59.674 Yeah. I think for us, probably what we might do is change hardware settings for how data is logged. With the IoT use case, data is typically logged on change, which means that a Boolean is logged every time it changes, and other variables, like floats or integers, are logged when they have changed by a certain amount. For that reason, we don’t think that subsampling is a very useful algorithm for us. When there is an algorithm that we like, I think we probably will start downsampling our data, but currently, we choose to keep it because we don’t think it is [inaudible].
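Log-on-change for an analog signal can be sketched as a simple deadband filter: a sample is only recorded when it differs from the last recorded value by more than a threshold, and a Boolean is just the degenerate case with a threshold of zero. The function and parameter names here are illustrative, not LBBC’s gateway configuration.

```python
def log_on_change(samples, deadband):
    """Keep only samples that moved more than `deadband` since the
    last recorded value. samples: iterable of (t, value) pairs."""
    logged = []
    last = None
    for t, v in samples:
        if last is None or abs(v - last) > deadband:
            logged.append((t, v))
            last = v  # compare against the last *recorded* value, not the last sample
    return logged
```

Comparing against the last recorded value (rather than the previous raw sample) means a slow drift still gets logged once it accumulates past the deadband.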
Caitlin Croft: 00:55:45.013 Do you have any virtual machine instances on AWS or is everything in Lambda?
Andrew Smith: 00:55:50.803 Everything’s in Lambda. Don’t like virtual machines. I don’t know why. [laughter] Yeah. We’ve been able to do it all with Lambda code.
Caitlin Croft: 00:56:01.657 How can you be sure that the pub topic is not duplicated? Do you have any kind of catalog implemented for the system where all topics are registered?
Andrew Smith: 00:56:15.893 The pub topic for our MQTT is actually the machine serial number, so ergo it’s unique.
Caitlin Croft: 00:56:29.489 Perfect. What is the moment in time when you usually start collecting telemetry from a turbine? When is it started: in production mode, on the customer site, or earlier?
Andrew Smith: 00:56:45.651 So maybe to clarify, we manufacture the machines which are involved in the casting process. We don’t monitor engines. People do, but we don’t. We monitor our machines all across the world. So to answer the question: as a machine is being built here, at some point it becomes runnable, and at that point it is connected to the Influx database. So any machine that leaves our site here has data from all of our factory acceptance tests, from when it was tested in our workshop. Then it will be on a ship somewhere, and the next time we hear from it is when a customer plugs it in. As soon as they plug it in, the whole thing kicks into life. We start getting alerts for it and data for it. So yeah, we have data from the tests when it was first built, and then for the rest of its life.
Caitlin Croft: 00:57:56.909 Okay. I realize that we’re over time. We still have a bunch of questions. Andrew, are you open to answering a few more?
Andrew Smith: 00:58:03.812 Yes. Yes.
Caitlin Croft: 00:58:04.864 Okay. Cool. And for those of you who have to drop, it is being recorded, so all the answers will be, of course, collected on the recording and will be made available. So if you have to drop, no worries. But if you want to stick on, totally cool as well. Do you decommission all turbine telemetry data when the turbine is retired?
Andrew Smith: 00:58:28.020 Well, we’ve only had this infrastructure for two years, and the life of a machine is 20 years, so we haven’t got there yet. Hypothetically, what would we do with our machine data when we decommissioned a machine and the machine no longer existed? Would we delete the data? I think probably we would seriously compress it, but keep some kind of representation of it. I imagine that would be useful. I doubt we’d want to delete it, particularly, again, coming back to what we’re paying to keep it: we’re paying hardly anything. So if at some future point we were able to compare advances in performance, for example advances in power consumption or rates of pressurization, it would be useful to have data where we could say, “Our machines now pressurize four times as fast as they did back in 2020.” So currently, for us, I don’t think there’d be a use case for deleting data when a machine ceases to exist.
Caitlin Croft: 00:59:52.384 Well, I got you thinking. [laughter] Have you tried any machine learning algorithms, like random forest, to check incoming data and identify potential problems, or is everything done by a person?
Andrew Smith: 01:00:08.244 The question about AI and ML is an interesting one. Much earlier in my career, I was involved with machine learning that used a lot of plant data for control. Now, when we’ve looked into it, a lot of products basically, first of all, want masses of data to learn from. And the second thing they need is indications and records of when errors happened, so that they can learn, “Okay, this combination of data means this happened.” So at the moment, we’re just storing data in a format that we know we will be able to plug AI and ML into at such time in the future as we need it. Being Amazon-centric, we would probably just pick their AI or ML products, point them towards our Influx data, and let them loose. I personally think that we’re getting so much value from condition monitoring and from being present for a customer in a connected-support kind of way, without AI and ML, that we currently have no reason to pursue it. I think there’s a tendency to think that AI and ML can deliver magical things, when in reality, it’s possibly less magical. But either way, the one thing we need is data. So maybe in a couple of years’ time, we’ll take on some AI guru who will tell us some fantastic things that are buried in our data. But we’d have to have the data, so we’re collecting.
Caitlin Croft: 01:02:07.365 How much roughly is your AWS infrastructure per month per turbine?
Andrew Smith: 01:02:15.458 Per turbine, I have no idea. We have 20 machines online and we’re paying next to nothing for it. I think we get billed, yeah, we get Amazon bills of about $10 a month. We pay pretty much nothing for it.
Caitlin Croft: 01:02:30.478 That’s kind of incredible considering it’s Amazon. [laughter]
Andrew Smith: 01:02:34.709 So again, another tip, which I didn’t include there because this is very Influx-focused, but take a look at basic ingest. Basically, you don’t pay for any MQTT messages that contain a topic that target a particular lambda. So if you can target a particular rule, you get it for free. So basic ingest messages are ones that have a topic that begins with a dollar sign. You don’t get charged anything for them. So we get millions of those free per month. So that’s probably one reason why we’re paying next to nothing for it, because Amazon provide a great deal free.
Caitlin Croft: 01:03:22.896 Do you have an intermediate MQTT Edge broker or are all MQTT messages directly sent to AWS?
Andrew Smith: 01:03:34.954 Great question. There’s no intermediate broker. The gateway that is the MQTT client has the ability to store a bunch of messages, I think something like 10,000 messages. So if it was temporarily unplugged, it would carry on storing, and then it would catch up with itself. So we’ve never really needed an intermediate broker. The main issue that we’re banging our heads against is that customers are paranoid about anything that sounds to them like remote access. So our big headache is making clear to a factory that our IoT is only listening to data: we cannot access your factory. Most people who use the machines have been used to support in a remote-connection kind of way, where you are using, I don’t know, TeamViewer or some remote-viewing software. It takes a lot of convincing to be able to say, “We don’t do anything that isn’t delivered through MQTT, and MQTT can only send and receive messages of a certain format. It doesn’t represent a hack route.” So yeah, we haven’t needed a broker. Our main headache is getting customers to connect it at all.
Caitlin Croft: 01:05:24.012 Yep. And I will say there was another question about whether InfluxDB can act as an MQTT broker. We did recently launch the Native Collectors, and we started with MQTT. We worked with HiveMQ on that, and it’s out there; you can use it. Just to let everyone know, you must be using InfluxDB Cloud to use the MQTT Native Collector.
Andrew Smith: 01:05:52.602 Yeah. I think there would be great value in Influx providing an Amazon-hosted MQTT broker, simply because of all the free, amazing stuff that comes with that. And that speaks to my first point: I wish there had been an Amazon-centric integration, because I think we all, in industrial IoT, would find that very enabling, performant, and cheap.
Caitlin Croft: 01:06:35.031 Did you experiment with CSV export of data? InfluxDB version 1 exported a clean table of data, useful in most use cases. InfluxDB 2 exports a list of JSON, which is not spreadsheet-friendly. So do you have any experience with that?
Andrew Smith: 01:06:57.922 No. I mean, we have never actually needed to export anything from Influx, because generally, if I’m doing processing or constructing an ad hoc query, I find it far cheaper and more efficient to keep the visualization all there in Influx. Remember, we’re using InfluxDB Cloud, so we’re not running Influx somewhere on a server and having to connect to it. So I don’t think we’ve ever had a use for exporting data from Influx. The reason the data is in Influx is because we want to visualize it, store it, and process it, and exporting it fulfills none of those. If we wanted to process it in a different way, we would probably connect to it directly. So no, we’ve never really got into exporting.
Caitlin Croft: 01:08:00.409 During your design and development phase, did you ever feel like you needed to have a relational database for contextualizing the data?
Andrew Smith: 01:08:11.789 Yes. Yeah. Early on, I can remember it being very baffling: we could store time series data, but then I was thinking, “Well, what do I do with all the relational stuff?” For example, every cycle has a summary, which is like a report on how that cycle went. The cycle will have a maximum pressure or a minimum temperature or a cycle time. And I think it would have been great to have, in parallel with Influx, a way of storing that SQL-type stuff in with Influx and being able to combine them: for example, to combine queries on the time series with SQL. Yes, I distinctly remember being bugged that you have to choose between relational or time series. You can’t have both. The way we solve that currently is that the report [inaudible] is basically fired into the time series database with a different tag or something. It’s not truly relational, but we have somewhere to put the data. But yeah, a very interesting question. I think that would be a great addition. We would find that useful.
Caitlin Croft: 01:09:42.969 Awesome. Well, thank you, everyone. I think we got through everyone’s questions, so really appreciate it. Andrew, there’s been a bunch of messages saying that this is the best session that people have joined, the best InfluxDB use case session, so congrats. Thank you, everyone, for sticking around. Please be sure to check out the recording. I also just want to let you all know that we have InfluxDays coming up in November. It is our annual user conference, and it’s going to be a hybrid event this year: the conference itself is completely virtual, but we’re going to have some watch parties. So if you’re in the Greater London area or the Greater San Francisco area, you can come hang out with us and with InfluxDB community members. We also have an advanced Flux training course in London. So if you’re interested in attending, please let me know; I would love to send someone a free ticket to the Advanced Flux Training, so be sure to email me if you want that. And Bria has already threaded in a link to InfluxDays, so be sure to register for free. The conference is free, and there’s also a free Telegraf training, so we’re really excited to see everyone back for that. And thank you again, Andrew, for a fantastic presentation.
Andrew Smith: 01:11:13.027 And I’m particularly interested in connecting with people who are in the same space, industrial IoT. I don’t know if Influx has some way of bringing that interest together into an interest group where people can discuss and share specific issues around the use of Influx in an industrial setting for process data. I think that would be really neat, and it would certainly help me, because I’m sure there’s some great cross-learning we can absorb. If there’s something that Influx can do to make that happen, I’d love to be there.
Caitlin Croft: 01:11:55.035 Yeah. So I just threw my email address into the chat, so everyone should have my email. If you are in London and you want a free ticket to the advanced Flux course, please be sure to email me. And if you want to connect with Andrew, I’m more than happy to make that happen. So feel free to email me, and I’m happy to connect you with Andrew, offer you free tickets, help you register for InfluxDays, all those good things. So thank you, everyone, for attending today. I think it was a great discussion.
Andrew Smith: 01:12:26.704 Yeah. Thanks for attending. Thanks for setting that up, Caitlin.
Caitlin Croft: 01:12:30.345 Awesome. Thank you, everyone. And I hope you have a good day.
Andrew Smith: 01:12:34.938 Yeah. Thank you. Bye-bye.
Caitlin Croft: 01:12:36.739 Bye.
[/et_pb_toggle]
Andrew Smith
Andrew Smith is a process control engineer with over 25 years of experience in R&D, process control, innovation and product development. Working in the UK, Europe and East Africa, he has held a variety of roles, including entrepreneur, employee and consultant, in a variety of sectors including glass manufacturing, water purification, wastewater treatment and equipment manufacturing. In partnership with Leeds University and "Innovate UK", he is developing LBBC's capability to provide customers with a new breed of process equipment that is cloud-connected, enabling valuable insights into process and equipment performance for both LBBC and their customers. Andrew is a chartered engineer and holds a B.Eng in Electronic Control and Systems, an MSc in Water and Wastewater Treatment, and a postgraduate diploma in business from Lancaster University.