How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Reduce Carbon Footprint
Session date: Aug 01, 2023 08:00am (Pacific Time)
Bevi is the creator of smart water dispensers that empower people to choose their desired beverage — flat or sparkling — along with their preferred flavor and temperature. Since 2014, Bevi users have saved more than 350 million bottles and cans. Their “smart” water coolers have prevented the extraction of 1.4 trillion oz of oil from Earth and have kept 21.7 billion grams of CO2 out of the atmosphere.
Discover how Bevi uses a time series database to enable better predictive maintenance and alerting across their entire ecosystem — including the hardware and software. They use InfluxDB to remotely collect sensor data in real time from their internet-connected machines about their status and activity — e.g., flavor and CO2 levels, water temperature, filter status, etc. They use these metrics to improve their customer experience and continuously improve their sustainability practices. Gain tips and tricks on how to best utilize InfluxDB’s schema-less design.
Join this webinar as Spencer Gagnon dives into:
- Bevi’s approach to reducing organizations’ carbon footprint — they are saving 50K+ bottles and cans annually
- Their entire system architecture — including InfluxDB Cloud, Grafana, Kafka, and DigitalOcean
- The importance of using time-stamped data to extend the life of their machines
Watch the Webinar
Watch the webinar “How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Reduce Carbon Footprint” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
Here is an unedited transcript of the webinar “How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Reduce Carbon Footprint”. This is provided for those who prefer reading to watching the webinar. Please note that the transcript is raw. We apologize for any transcription errors.
Speakers:
- Caitlin Croft: Director of Marketing, InfluxData
- Spencer Gagnon: Lead Software Engineer, Bevi
- Sean Grundy: Co-founder and CEO, Bevi
Caitlin Croft: 00:00:03.692 Hello everyone and welcome to today’s webinar. My name is Caitlin, and I’m joined today by the Bevi team. So we’ve got Spencer and Sean. And Spencer, I think someone else has joined under your name. So we’ve got lots of people from Bevi here.
Spencer Gagnon: 00:00:19.484 Yeah, [inaudible].
Caitlin Croft: 00:00:21.982 Really excited to have you all here. Please feel free to ask questions, post them in the Q&A. This webinar is being recorded, and the recording and the slides will be made available tomorrow morning. And without further ado, I’m going to hand things off to Spencer and Sean.
Spencer Gagnon: 00:00:41.564 Yeah, real quick, I guess we’ll just do introductions. My name is Spencer. I’m the lead software engineer here at Bevi. I’ve been here for three years and some change. And I’m really excited to be here and talk about how we use Influx.
Sean Grundy: 00:00:58.629 I’m Sean Grundy. I’m co-founder and CEO of Bevi. And I’ll kick off today with a quick overview of Bevi as a company. I’ll share a little bit about our product and our history, and then hand it over to Spencer for the bulk of the conversation today to get into interesting architecture conversations. No, I know how to move the slides. Bevi’s mission is to make the beverage industry environmentally sustainable by eliminating single-use bottles and cans. We picked this mission when starting the company 10 years ago because the beverage industry is literally the most wasteful industry in terms of plastic pollution in the world. There’s a nonprofit group called Break Free From Plastic that every year puts out an audit of the top plastic polluters globally. And about half of the top 10, including the number one and number two spots, are the major beverage companies. So this is an incredibly wasteful industry that produces over 100 billion plastic bottles each year. Recycling rates are actually going down because the rate of bottle consumption is increasing faster than the rate of our recycling infrastructure growth. And in addition to just being very wasteful in terms of plastic itself, this industry is also very wasteful in terms of fuel. A lot of fuel gets used to transport water from a bottling plant where it’s first manufactured to wherever it’s ultimately consumed.
Sean Grundy: 00:02:50.896 In addition to being wasteful, it’s really entirely unnecessary. So most bottled beverages out there are either all or nearly all comprised of filtered tap water. But instead of being filtered in the taps all around us where they’re efficiently transported via water pressure, they’re instead filtered in a bottling plant, packaged in tiny plastic or sometimes aluminum containers, and then trucked off along a series of stops. It’s actually the trucking and the associated labor and warehousing costs that end up making up the bulk of the cost of bottled water. So typically today, bottled water will retail at around $1.20. It’ll wholesale in bulk at around $0.55 per bottle or can. And the majority of that is actually distribution costs. After that, it’s packaging costs. The actual water itself costs almost nothing. And you may be thinking about this just in relation to water, but it applies to the vast majority of non-alcoholic and non-dairy beverages. So whether it’s energy drinks or flavored water or a lot of the packaged juices or items like that, they’re essentially a few percent flavoring, CO2, and/or preservatives and then almost 100% filtered tap water.
Sean Grundy: 00:04:22.285 The concept of Bevi is to cut out all of those distribution steps and cut out the bulk of the packaging associated with beverages and only ship the non-water ingredients. So we only ship that one to three percent of other stuff, the flavorings and the CO2 and the filters that mix in with water to create some other beverage. And we always source our tap water locally and filter it in our machines. This has a very positive environmental impact, but also importantly for Bevi, it created our business opportunity. By cutting out all of these wasteful steps, we also significantly reduced the cost of a beverage to closer to 20 cents per drink. And that enables us to really undercut the beverage market, which is how we have a product and how we have a business. In going to a market, we ultimately have this vision of wanting to be everywhere. Wherever people are consuming single-use bottles, we want to have some version of Bevi. So we need this highly distributed system. But to start, we decided to focus on the commercial sector. And in particular, we focused on commercial offices. And there were two reasons for that. One is that we just saw a clear gap in the market.
Sean Grundy: 00:05:59.233 In the home, brands like Brita and PUR had gotten people comfortable filtering their own tap water instead of always buying bottles. And then products like SodaStream got people comfortable flavoring and adding CO2 to their own water, so customizing it a little bit. And that came with a lot of benefits. It came with more personalization, lower cost, lower carbon footprint, more space savings. There was a lot of value that these products provided in consumers’ homes. When we looked at the office market, what we saw was there were very clear industrial-scale, high-volume parallels to products like Brita and PUR that connected to the tap, were powered by electricity, chilled the water, had some features that a simple Brita or PUR pitcher wouldn’t have, and that were designed to dispense, say, tens or hundreds of beverages a day instead of just a handful. But then looking at the flavored and sparkling water side, we didn’t find any machines or any equipment having success in the market doing what SodaStream did for the home.
Sean Grundy: 00:07:16.200 What we saw instead was companies were just burning through hundreds or thousands of single-use plastic bottles and single-use aluminum cans every month, and both spending a lot of money and actually spending a lot of time restocking refrigerators with these bottles every day. So Bevi set out to displace that. And we do that first and foremost with machines that — you can see our product lineup right here of the two machines we’re actively manufacturing. They’re machines that filter tap water, store different ingredients in them, which for us are filters, concentrates for flavored and enhanced water, and CO2 for sparkling water, and then let users customize the water as they wish. So customizations can include adding vitamins and supplements, mixing and matching flavors, changing the strength of a flavor, changing how sparkling the water should be, if at all, and in our latest model even setting water temperature. And what’s less visible here — when looking at our product, it’s pretty easy to get a sense of some of the mechanical challenges and some of the beverage creation challenges, but what’s much less visible but still incredibly important is the software side of Bevi. We have a unique software interface that really creates the look and feel of the brand that everyone interacts with to customize and dispense a beverage, but that’s both literally and figuratively, that’s just the surface level.
Sean Grundy: 00:09:02.616 Every touch to our touchscreen is captured so that we or whoever is servicing our machines know when any consumables are running out so that we know if anything is going wrong that needs to be addressed. We use data from the machines to continuously improve our service model to improve quality. We also use it to essentially help enforce brand standards as we go to market across large geographical areas with many distributors. So it’s this incredibly important part of Bevi’s experience and it’s something that’s out of sight to the vast majority of our users. And I won’t get into this too much because we have an expert talking about it, but just wanted to set the stage. The reason customers get our product, like why we really found success in offices in particular, was first and foremost really time savings. In companies that previously offered single-use bottles and cans, an office manager would actually have to stock the fridge every day. Taking that responsibility away is very compelling because it frees up valuable time. Personalization is really important as well as just a commitment to healthy offerings.
Sean Grundy: 00:10:31.027 Environmental responsibility, I’d say, is important to the vast majority of our customers, but not all. It’s interesting because to me it’s actually the biggest win when a customer doesn’t care at all about sustainability, but still gets Bevi because they just like the user experience so much or because their employees like the product so much. And affordability as mentioned earlier, by cutting out all the packaging waste and most of the trucking associated with beverages, we also cut the cost. We have a little over 5,000 customers today. Customers for us range from big tech companies to consulting firms. And one thing I’ll mention here is, as you can imagine, being a company that caters to in-person office spaces was not a good place to be when March 2020 hit. We did go through a pretty existential crisis where we went from true startup hypergrowth from our product launch in 2015 through to Q1 2020. And then when COVID hit, we really saw our sales pipeline evaporate almost overnight. And we had to get fairly creative about how to operate. And we very quickly started looking at who was still open.
Sean Grundy: 00:12:06.197 And some of the industries that really got us through the pandemic were actually the pharma and biotech companies who never shut down. Even in March and April, they kept going to the office, both the ones working on COVID vaccines and not. We learned to sell into hospitality. So companies like Hilton and Hyatt and W Hotels became customers. We learned to sell into property management companies. So typically communal areas of residential apartment buildings. We learned to sell into hospitals. So COVID really prompted us to seriously expand our sales efforts beyond offices. And then one thing worth mentioning that was also helpful in this time is a lot of the big tech companies were actually using the moment where corporate real estate was cheap to buy up office space and renovate it in advance of an ultimate return. And seeing them do that not only provided us with a meaningful source of revenue but also gave us the confidence that they were serious about ultimately coming back on a hybrid scale, as we’ve since seen.
Sean Grundy: 00:13:19.884 Our IoT data, which Spencer’s about to get into, was also very helpful during the pandemic because that was our insight into what pockets of the country were going to the office in different phases, what industries were open, what industries were starting to return to work. That source of data from the actual usage of the machines became very important in guiding our sales efforts in a time when we just didn’t have a lot of resources and couldn’t really afford to make mistakes. I’m happy to report that we did get back to high growth. And this trend toward hybrid office reopenings has been incredibly helpful to Bevi and works really well for us. So now we’re excited to have our core office market up and running quite well again, while also having developed some degree of expertise selling into these other markets too. That’s all I’ve got on the business overview and with that, I’ll turn it over to Spencer.
Spencer Gagnon: 00:14:24.612 All right, great. Yeah, so I’m going to get a little bit into our software stack and what IoT and software at Bevi are like. I wanted to start here with a brief history of how we use InfluxDB. Then I’ll go into our architecture at large, and then we’ll zoom in a little bit more on how we use Influx specifically and how that’s changed over time. And feel free to ask questions — I’m going to try to answer them as we go through, when they’re relevant, so please do not hesitate. I will get into the stuff you see before you in more detail a little bit later, but we started out in 2014. The person who is now my boss started using InfluxDB version 0.8 when you guys were still in super alpha, and surprisingly we actually used that version up until 2021. I started in late 2020, and my first large project was to upgrade us from InfluxDB 0.8 to, at the time, the most modern version of Influx, which was 2.0. And we are continuing to use 2.0, the OSS version, today. We’ve had a few changes, but by and large, since Q4 2021, we’ve been using the same implementation and the same version of Influx with essentially no issues. So yeah, I’ll get into a little bit of the Bevi backend architecture. I created these diagrams with some degree of detail, but with a focus on Influx here.
Spencer Gagnon: 00:16:17.379 So down at the bottom these Bevi machines, these little stacks represent thousands of Bevi machines across the US and Canada. And throughout the day as people dispense beverages, they’re all communicating to our main backend server. And then those communications get persisted in Influx as well as a multitude of other services that we use for various data storage and processing and that kind of thing which I will, of course, get into. The main thing to understand is that pretty much every communication happens between Bevi machines and our backend via what we call events. And events are just a fancy way of us saying basically a JSON key-value pair. The only thing that all events have in common is they have a timestamp of when the event happened and then a type which is just something that describes a little bit more about what the event is. So the thing that makes sort of the most sense for a Bevi machine to have is, of course, a dispense event that represents a time when a user pressed the dispense button, and it tells you what flavor they were dispensing if at all and what type of water they were dispensing: sparkling, light sparkling, hot, whatever. But we also have a bunch of other events that might not be as obvious. Like I put here CO2 pressure sensor. And on the newer machines, we have dozens of different sensors that we’re taking data from at all times and that allows us to better predict, as Sean was saying, if there’s an issue on the machine but also if, for example, the CO2 has run out on a machine and we can gray out that option on the Bevi.
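In code, an event like the ones Spencer describes is just a small JSON object whose only guaranteed keys are a timestamp and a type, with arbitrary extra keys per event kind. A minimal sketch — field names here are illustrative, not Bevi’s actual schema:

```python
import json
import time

def make_event(event_type, **fields):
    """Build a Bevi-style event: a JSON key-value payload whose only
    guaranteed keys are a timestamp and a type. Everything else is
    free-form (field names here are illustrative)."""
    event = {"timestamp": int(time.time() * 1000), "type": event_type}
    event.update(fields)
    return event

# A dispense event: what flavor and what kind of water was poured.
dispense = make_event("dispense", flavor="peach", water_type="sparkling")

# A sensor event: newer machines report dozens of sensor readings.
pressure = make_event("co2_pressure", psi=52.4)

print(json.dumps(dispense))
```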
Spencer Gagnon: 00:18:08.905 So like I said, these events are mainly what we persist to Influx, but we use them for many, many things. I’ve put some information here about some other things, other ways that Bevi communicates to our backend. The way that we indicate that a Bevi machine is online is via these heartbeats which happen every minute or so. They just say, “Hey, I’m online,” and that’s sort of a little ping just to — so that we can on our backend understand how many Bevi machines are online and some information about those machines because we don’t want to necessarily be sending events all the time. The way that it works from the Bevi machine’s perspective is it’ll wait a little bit, it’ll batch up 100 or 300 or 600 events and then send them all at one time, but the heartbeats are always happening minute by minute. And then we have a back channel to communicate commands to those Bevi machines. The classic one is, of course, upgrading software, but also we can set custom configuration properties and restart the machine if possible.
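The machine-side event flow described here — accumulate events, then send them in one request once a batch threshold (100, 300, 600, etc.) is reached — can be sketched with a tiny buffer class. Sizes and names are illustrative, not Bevi’s code:

```python
class EventBuffer:
    """Accumulate events and deliver them in batches, flushing whenever
    the configured threshold is reached (a sketch, not Bevi's code)."""

    def __init__(self, batch_size=100, send=None):
        self.batch_size = batch_size
        self.pending = []
        # `send` stands in for the HTTP POST to the backend's events endpoint
        self.send = send if send is not None else print

    def add(self, event):
        self.pending.append(event)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()

batches = []
buf = EventBuffer(batch_size=3, send=batches.append)
for i in range(7):
    buf.add({"type": "dispense", "n": i})
buf.flush()  # drain the remainder, e.g. before shutdown
# batches now holds groups of 3, 3, and 1 events
```

The heartbeats, by contrast, bypass this buffer entirely: a minute-by-minute ping stays useful only if it is sent immediately rather than batched.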
Caitlin Croft: 00:19:26.015 Oh, Spencer, we lost your sound.
Spencer Gagnon: 00:19:33.481 Hello. How are we doing?
Caitlin Croft: 00:19:34.741 Okay, now I can hear you but we did lose you a little while ago.
Spencer Gagnon: 00:19:38.833 What was the last thing you heard?
Caitlin Croft: 00:19:41.710 Let’s see, talking about, yeah, I think you were on this slide, and just you were just getting started.
Spencer Gagnon: 00:19:50.409 Okay, cool. Yeah. Rough. That’s okay. So yeah, I was just talking about heartbeats and how the Bevi machines communicate an online state every minute or so. That’s a bare-bones, really quick endpoint — they just say, “Hey, we’re checking in,” every minute or so. And that makes it so that we can better batch these events that I talked about in the last slide. It also allows us to use this back channel pretty quickly to send commands to upgrade software, or configure the machine depending on various bits of information, and to debug things — if a machine is having an issue, we can get more information about that machine via this back channel of commands. And then the last thing Bevi machines communicate with the backend via is these getters. There’s a lot of information that’s calculated on our backend server, like inventory, and alerts if something is going wrong or if the machine needs a retrofit. Usage data is also calculated and aggregated on the backend, along with more customer-facing information like who manages the machine and that kind of stuff. Those are all provided via REST API endpoints, and some of those things are displayed on the machine. So whenever a person services a machine, they might see some of this information that is calculated on, or originates from, our backend server. So that’s the basics of our zoomed-out architecture.
Spencer Gagnon: 00:21:29.420 And now — there’s a lot going on in this slide, but I’ll go through all of it and say a little bit more about how we use Influx. And of course, this image doesn’t even begin to scratch the surface of what everything looks like on the backend of Bevi, but I think for our purposes it’s a pretty good summary. So you can see in the bottom left here, the main server holds the important API data and handles inventory and usage calculations and the REST APIs. That’s the main sauce there. And then we have many clients of that, but the main two are the Bevi machines themselves, which send data to the REST API endpoints — sending events and that kind of thing — and something called The Well, which is our web portal where people who service Bevis and people who manage a large fleet of Bevis can get information. You can use it to get information about inventory and also alerts — if a machine has a leak, for example — or day-to-day stuff, like if it’s out of CO2 or flavors.
Spencer Gagnon: 00:22:38.979 And then, from the main server, we communicate in several ways to Influx. I’ll talk about the write path, which is displayed on the right side of the screen here. We have three servers, basically. This is, of course, an oversimplification — as everything is — but whenever the main server gets an event or a batch of events from a Bevi machine, it publishes those events to Redpanda, which is a Kafka-compatible data streaming tool that we use and that lives on its own server. And that is the only thing that happens in sync with the Bevi sending events to our backend server. So that means relatively quickly we can send those JSON events right to Redpanda and then kind of forget about them and assume they’ve been persisted correctly. Everything else happens asynchronously. Once those events exist in Redpanda, we use Telegraf, which is on the same boxes as our InfluxDB open-source implementation — that is the subscriber to our Kafka server there. It takes larger batches of events from various Bevi machines and persists them all to Influx in big chunks. That allows us to not overload Influx and to have a single point that batches up all these events — from even hundreds of Bevi machines at a time — into batches of thousands of events and persists them to Influx. I have some more specific numbers on later slides about the sort of rate we do there. So that’s the write path, where we have to get those events persisted.
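A Telegraf pipeline along the lines Spencer describes (main server → Redpanda → Telegraf → InfluxDB) could be configured roughly like this. This is a sketch only — broker address, topic, org, bucket, token, and batch sizes are placeholders, not Bevi’s actual config:

```toml
[agent]
  metric_batch_size = 5000       # write to Influx in batches of thousands
  flush_interval = "10s"

# Redpanda speaks the Kafka protocol, so the standard Kafka consumer works.
[[inputs.kafka_consumer]]
  brokers = ["redpanda:9092"]    # placeholder broker address
  topics = ["bevi-events"]       # placeholder topic name
  data_format = "json"
  json_time_key = "timestamp"    # events carry their own timestamps
  json_time_format = "unix_ms"

[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]  # Telegraf runs on the same box as InfluxDB
  token = "$INFLUX_TOKEN"
  organization = "bevi"             # placeholder org
  bucket = "events"                 # placeholder bucket
```

The design benefit Spencer calls out falls out of this shape: the synchronous path ends at the Kafka publish, and Telegraf turns many small per-machine writes into a few large, well-batched ones.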
Spencer Gagnon: 00:24:33.655 And then there are other kinds of events that we have as well, not just the ones that come from Bevi machines. Some events are generated on the backend based on actions that a user has taken on The Well, for example, or events that describe things about how a machine is sending events. So like over the last hour, this machine sent 100 events and they were from this IP address or this and that. Just sort of diagnostic information and all of that gets stored in InfluxDB. So InfluxDB is really our single quick source of getting the latest information about a Bevi machine. So now that we’ve covered the write path, I’ll talk about the read path a little bit. There are countless things that read from Influx. We use it primarily, I would say, from the software team’s perspective as a debugging tool, but we also use some data in Influx for inventory and usage calculations. I’ve described here a few queries we do. Inventory calculation, arbitrary debugging. Certain synchronous APIs get the latest set of customization properties that are on the machine. Those are all stored in Influx’s columns.
Spencer Gagnon: 00:25:52.305 But then also we use Grafana. The hardware team is very active on our Grafana dashboards that hook directly into Influx, and they can give you real-time information about a Bevi machine that’s in the field — there’s no person in front of it, but we can say what’s going on, for example, with the carbonator in that machine, or what the temperatures of all the different waters are, and that kind of thing. And I have some more examples of that that I’ll get into. Yeah, so just by the numbers: the fleet of thousands of Bevi machines sends about 150 to 200 requests per second, totaling roughly 2,500 events per second. All of those flow through our backend server, then Redpanda, then Telegraf picks them up, and then they end up in InfluxDB. That is the general flow, and that number is always increasing as we get more and more Bevi machines out in the field. So right now we have a single-box setup, but as we scale, using Redpanda and Telegraf will make it pretty easy for us to expand to multiple servers and think more about distributing that load. But right now we’re able to handle the load as it exists today pretty easily.
Spencer Gagnon: 00:27:22.130 Yeah, so I mentioned debugging within InfluxDB. This is one of the things that I personally do on the software team quite a lot, and I know a lot of other members of the software team do as well. We have this bare-bones UI we’ve built that gets used a lot. It allows people with permission to run specific InfluxDB queries to learn things that might not be displayed in some of our other UIs, or more complex queries that haven’t been transferred into a Grafana dashboard yet. So I have an example here — this is just one example of a machine reporting stats over time. It’ll tell us how long the app has been up and what application version it’s running. And then we can get lots of other information here. From the software side, we can see, for example, how much memory the machine is using and how much disk space it has left. And from the hardware perspective, we mostly use Grafana, and we can get very specific information about the CO2 pressure and temperature. And here we also have the heater depicted over the last hour. We can zoom in and out on this and explore that data if we see issues, or if we just want to improve some of our inventory or usage algorithms, we can look at this data. So that’s very helpful.
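A query behind one of those debugging views might look something like the following — a hypothetical InfluxQL sketch, since the actual measurement, field, and tag names aren’t shown in the talk:

```sql
-- Hypothetical: last hour of CO2 pressure for one machine, averaged
-- per minute. "co2_pressure", "psi", and "machine_id" are illustrative
-- names, not Bevi's real schema.
SELECT MEAN("psi") AS avg_psi
FROM "co2_pressure"
WHERE "machine_id" = 'BEVI-1234' AND time > now() - 1h
GROUP BY time(1m)
```

Grafana panels over InfluxDB are typically built from exactly this kind of time-bucketed aggregate, which is why query cost matters so much for dashboard responsiveness.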
Caitlin Croft: 00:28:51.533 Spencer, while you’re on this, I’m just curious — there are plateaus and spikes. What do those represent? Are they causes for concern, or just kind of standard peaks and valleys?
Spencer Gagnon: 00:29:07.495 Yeah, so that’s a very good question. You can see on the heater graph here, the green line, I think, might be the heater temperature — I’m not 100% sure. But the blue line, I think, represents when the heater is turned on versus when it’s not — when it’s getting a certain amount of power. Honestly, I am not the person to talk about what all the various sensors mean; that’s more of a hardware and mechanical expertise thing. But we have sensors showing that this is a healthy machine that I’ve picked here. So these dips and valleys will happen all the time as the machine goes through its normal life cycle of just making sure that the heater is at temperature and the CO2 pressure is high and that kind of thing.
Spencer Gagnon: 00:29:56.799 So moving forward, I wanted to talk about a few challenges that we have had over my experience with Influx, how we solved those challenges, and just tips and tricks. I’m going to try not to read off the slides directly here. But as we discussed earlier, the events that we send are really just schemaless — they are key-value pairs, and the dispense app can send whatever it wants. Oh, this might be the first time I’ve mentioned the dispense app: the dispense app is what we call the software that runs on the Bevi machine. So it sends truly schemaless key-value pairs. And our backend wants to persist this to InfluxDB, but we want it to be typed, because in order to use Grafana and other visualization tools and aggregations, we don’t want to be persisting a number as a string or as a Boolean or whatever — some of those things just don’t make sense. So that was a big change as we moved from InfluxDB 0.8 to 2.0: field types now have to be declared as they are persisted. So our eventual solution was basically that the schemas for all events that get sent by the dispense app are defined in the backend server. And whenever we receive an event that doesn’t match one of those defined schema fields or measurements or types, we still persist it, but as a raw key-value-pair string in this field called underscore overflow data — overflow data, rather.
Spencer Gagnon: 00:31:46.559 And so that, for us, was very helpful, because it prevents you from accidentally persisting some number value as a string, which would make it very hard to query. And it requires us to actually be more strict about what our schema looks like. And yeah, it allows us to be a little bit more strict about what we’re allowed to send from the dispense app, rather than having it be more of a Wild West situation, as was possible with Influx 0.8. But in this case, I think it’s just a better practice for us. Another problem that we had in adopting InfluxDB 2.0 — I mentioned this briefly in the timeline — was that InfluxDB 2.0 does not handle high-throughput, small-batch writes well, which we found out in this process. In our InfluxDB 0.8 implementation, every time a tablet communicated with the backend, it would send events, and those events would be synchronously persisted to InfluxDB. That’s not very fast, and it’s not very smart to do that, as we found out. And as I said, 200 requests a second at 10 rows per request — that’s a lot of threads and a lot of traffic to Influx, and it’s not very well batched.
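The schema-enforcement step described above can be sketched as follows. This is purely illustrative — the schema contents, field names, and the exact overflow field name are assumptions, not Bevi’s code:

```python
import json

# Illustrative declared schema: event type -> expected field types.
SCHEMA = {
    "dispense": {"flavor": str, "water_type": str, "volume_ml": float},
}

def apply_schema(event_type, fields):
    """Keep fields that match the declared type; stash everything else
    as a raw JSON string in an overflow field (name is an assumption)."""
    declared = SCHEMA.get(event_type, {})
    typed, overflow = {}, {}
    for key, value in fields.items():
        expected = declared.get(key)
        if expected is not None and isinstance(value, expected):
            typed[key] = value      # matches the declared type: keep as-is
        else:
            overflow[key] = value   # unknown field, or wrong type
    if overflow:
        typed["overflow_data"] = json.dumps(overflow)
    return typed

row = apply_schema("dispense", {"flavor": "peach", "volume_ml": 350.0,
                                "new_sensor": 42})
# "new_sensor" is not in the schema, so it lands in overflow_data
```

The payoff is the one Spencer names: nothing is ever silently written to Influx with a surprising type, so Grafana aggregations over numeric fields keep working.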
Spencer Gagnon: 00:33:23.313 So our solution to that was a lot of conversations with the InfluxDB support team about the proper way to use the InfluxDB 2.0 instance we’ve self-hosted, the adoption of Redpanda — which we’ll eventually use not only to publish events to be persisted in Influx, but also in other parts of our backend infrastructure — and the adoption of Telegraf as well, which subscribes to Redpanda and batches the events. All this resulted in reduced load and reduced thread usage on our Bevi backend, way faster response times — because persisting to Kafka, a temporary event stream, is much faster than persisting to a real database — and we no longer got crashes when too much traffic was being sent to Influx. And then as a nice bonus, we also gained a lot of trust and a strong relationship with the InfluxDB support team. And our relationship with them — now that we’re doing this sort of back-and-forth webinar thing — is great. And we’ve had essentially no issues since then. I’m going to check on the questions.
Spencer Gagnon: 00:34:44.608 Let’s see. Where is the data store that Grafana presents, and how is it organized? Yes — so our Grafana implementation is accessing InfluxDB via queries directly. That has its advantages and disadvantages. If you put a query in Grafana that is irresponsible, or takes a lot of time for Influx to process, then you’re going to have a bad time with Grafana, and it requires some expertise in InfluxQL, the InfluxDB query language, and that kind of thing. But the advantage is that it’s just direct — it’s super easy if you know InfluxQL. We do not use MQTT at the moment; that’s the answer to that question. How did you figure out how to batch requests? Did that magic come from the InfluxDB support team? Yeah, sort of. It was obviously a very collaborative relationship there. We told them about our infrastructure, as I have just done, but back then, of course, it was synchronous calls to Influx. And I think it was a learning opportunity for both of us, because it was shortly after the release of Influx 2.0. And because we were using OSS and hosting it ourselves, the Influx team didn’t have a lot of the same metrics and information that they would have had about the server if it had been InfluxDB Cloud, for example. So there was a lot of back and forth, just poring over our metrics and poring over the knowledge that the InfluxDB support team had, and it was a little bit frustrating.
Spencer Gagnon: 00:36:39.672 But the eventual outcome ended up being: you’re using InfluxDB in a way that is not wrong, but it wasn’t anticipated. And I think it resulted in an infrastructure that is a lot better for us and a lot more scalable, and that will allow us to expand our Influx usage and our Kafka usage in the future. Do you keep overflow data in the main server only, or do you push that later to InfluxDB after some pre-processing? So this is all in the direct flow. This is a really good question. When the Bevi machines send events to our backend, we cross-reference those events with our existing schema and then transform them into the form that we persist to Kafka, and that happens synchronously. So if we find an unmatching field in the JSON that we’ve parsed, we put it directly in the overflow data. The single source of truth for the schema exists in our backend Java server. I’m going to come back to the rest of these questions in a little bit because I just want to make sure I finish what I’ve got for the slides. All right, oh, that’s it, okay, cool.
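A minimal sketch of that cross-referencing step, assuming a flat JSON event and a hypothetical set of known field names (Bevi’s real schema lives in their backend Java server, not here):

```python
# Sketch: split an incoming parsed-JSON event into fields the schema
# recognizes and an "overflow" dict for everything unmatched, so no
# data is dropped. Field names are illustrative, not Bevi's schema.
KNOWN_FIELDS = {"machine_id", "water_temp", "co2_pressure", "flavor_level"}

def split_event(event: dict) -> tuple[dict, dict]:
    """Return (known, overflow) partitions of one event."""
    known = {k: v for k, v in event.items() if k in KNOWN_FIELDS}
    overflow = {k: v for k, v in event.items() if k not in KNOWN_FIELDS}
    return known, overflow
```

The key property is that the split is lossless: every field ends up on one side or the other, which is what lets a schemaless store like InfluxDB absorb fields the backend has never seen before.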
Spencer Gagnon: 00:38:13.457 So looking forward, we really have no complaints. I’ve mentioned this before: our usage of InfluxDB is exactly what we need. We’ve been running without issue for a few years now. Some things that I want to look into for future improvements: of course, InfluxDB 3 is out now and it’s fully managed, and that is attractive, because not having to manage your own service is huge. So that’s definitely a consideration we’ll have going forward as our relationship with Influx evolves and as our scale expands. Another big thing for us is data lifecycle. Currently, we have all historical data for all Bevis since 2014 in Influx, which is a lot of data, but storage is cheap. The main issue is that most of it is unused and will never be used, really. So I think going forward we can be a little bit smarter about deleting some of that data over time and have a better posture with respect to data intake and how long we’re storing it, because dispense data from 2014, which has since been persisted in our colder storage, is not very useful to have in InfluxDB. All right, that’s all I have for slides. I guess I’ll go back to the questions we’ve got here. Caitlin, do you have anything else that you want to —?
Caitlin Croft: 00:39:45.715 I had my own questions. You’re talking about the fact that you have all the data from 2014. Have you downsampled that data or has there been any other analysis looking at — I’m thinking long-term how you guys are doing in 2014 versus 2023. I’m sure operationally you guys have tweaked things, improved things, just kind of curious. Clearly, that was two questions, but.
Spencer Gagnon: 00:40:12.567 Yeah, no, I think that’s a good question, or a good set of questions. Basically, we haven’t really downsampled the data; one of the advantages of the event schema being true key-value pairs is that we didn’t have to do anything other than copy what we had in Influx 0.8 over to Influx 2.0. The one thing we did do is we didn’t copy data for Bevi machines that had since been retired and are not currently being used. We didn’t copy any of that data over because it’s had its life cycle, and there’s not really any reason we would ever use that data again, except perhaps from cold storage. As I said, we do use Snowflake and Looker, which also have copies of this data, stored more affordably and less easily accessible for us.
Caitlin Croft: 00:41:09.626 Cool.
Spencer Gagnon: 00:41:10.435 Yeah, a lot of questions about predictive maintenance and preventative maintenance. This is huge. I don’t have any slides on this, which was a mistake on my part, but we have a process, and there are two ways that we do this. The main data structure we call alerts. It’s a relatively new system for us, but we want people who service Bevis to be more aware when there’s a problem on a Bevi. In the past, a problem just resulted in us putting up the out-of-order screen on a Bevi and not really alerting anybody that that had happened. So it would get serviced the next time someone happened to go restock that machine, which obviously isn’t an ideal customer experience. So one thing we’ve implemented is a monitoring system that uses on-machine sensor data to predict, or understand, when something is wrong on the machine. The most drastic example is a leak: if a machine has a leak, then of course we want to turn all the water off on that machine and alert somebody right away so they can come and fix it. But there are other, smaller things, especially on the newer machines, where we have a lot more sensor data and a lot more information about what’s going on in those machines. We’ve implemented this monitoring system in such a way that we can notice when something is going wrong on the machine and surface that to the user, be it by making the machine out of order or in another way. For example, if a flavor is out or CO2 is out, those things get surfaced, but the machine doesn’t go out of order; you’re just unable to dispense a sparkling drink or a drink of the flavor that’s out.
Spencer Gagnon: 00:43:03.192 But whenever that happens, we now create what we call an alert, and that manifests on our backend and sends emails to the concerned people who service those Bevis, so they can go out right away to service a machine that has a leak. In addition, we have lower-priority alerts that we call [inaudible]. Those include, for example, six-month preventative maintenance: every machine gets a regular twice-yearly cleanout of all of its systems, and that is a set of [inaudible] alerts. It just pops up every six months, and people go and service those machines. As well, if a machine is offline, we don’t really have any information about that machine on the backend, so the question is: when can someone get there and either turn on the Wi-Fi or connect it to the internet in any of the other ways that we provide? And retrofits: something that is cool about modern-day Bevi is that we’re able to fix hardware problems we’ve noticed in our manufacturing process retroactively. We can surface that as an alert that gets emailed to people, and they’ll say, “Oh, I have to go replace this part on the Bevi, or add this part, because we found a new way of servicing that is safer and more efficient.” So yeah, I’m going to dismiss these preventative maintenance questions. What else? There are a lot of questions here, so I probably won’t get to answer all of them.
Spencer Gagnon: 00:44:57.430 This is a good one. I’m going to answer this one about reliability. What do we do for InfluxDB reliability, including uptime and upgrades? One thing about Bevi is we have this great advantage that people are only really servicing Bevi machines during waking hours in the US. So that means essentially 9:00 AM Eastern to 8:00 PM Eastern, or 5:00 PM Pacific. What that gives us is the ability, basically, on weekends and evenings, to disable or bring down our backend, which doesn’t affect the behavior of Bevi machines at all. They won’t be able to connect to the backend to send their events, but their events will be logged and then sent once the backend comes back up. So we can have prolonged downtimes during off-hours and on weekends that don’t impact the Bevi user experience at all, which allows us to take our time with upgrades and that kind of thing. With respect to uptime, we do have alerts for ourselves internally on Slack and such. If one of our servers goes down, of course, we get notified of that as an emergency, and we’re able to handle it. But we don’t have any sort of real-time update process, just because of the way that Bevi machines are serviced.
Spencer Gagnon: 00:46:29.362 I’m going to dismiss this one. All right. Let’s see: do you pre-process any data at the edge before sending? I notice you average the time series sensor data. So all of the averages that you’ve been seeing (I’ll bring it back, actually, on this Grafana slide) are done in real time by Influx in Grafana. All the data that you see here is raw and then aggregated via whatever settings you’re using in Grafana. We don’t do any fancy edge processing. The machines themselves, obviously, are getting sensor data much faster than we would want to send it. So we have a process where, if there’s a meaningful difference in a sensor value, we will send that immediately. But otherwise, like you can see here in the CO2 pressure graph, it looks like in some places it’s missing a data point. That’s just because the data hasn’t changed in a meaningful way, so we can wait another minute before sending it; that’s just a process we use to reduce the data that we have to store. All right, we can dismiss that one. Can you overlay temperature profiles for a machine every time the heater is turned on? For example, can you calculate batch properties by the average pressure when a CO2 beverage is selected?
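That "send immediately on a meaningful change, otherwise wait" pattern is often called deadband or report-by-exception reporting. A minimal sketch, with purely illustrative threshold and interval values rather than Bevi’s actual settings:

```python
class DeadbandReporter:
    """Decide whether a sensor reading is worth sending upstream.

    A reading is sent immediately if it differs from the last sent
    value by at least `threshold`; otherwise it waits until
    `min_interval` seconds have passed since the last send.
    """

    def __init__(self, threshold: float, min_interval: float = 60.0):
        self.threshold = threshold
        self.min_interval = min_interval
        self.last_value = None          # last value actually sent
        self.last_sent = float("-inf")  # timestamp of last send

    def should_send(self, value: float, now: float) -> bool:
        changed = (self.last_value is None
                   or abs(value - self.last_value) >= self.threshold)
        due = now - self.last_sent >= self.min_interval
        if changed or due:
            self.last_value = value
            self.last_sent = now
            return True
        return False
```

On a dashboard this shows up exactly as Spencer describes: gaps in the raw series where nothing changed, which aggregation in Grafana then smooths over.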
Spencer Gagnon: 00:48:06.999 Yeah, so I’ll say I’m not going to be able to answer this question very well. That is something that we do in Snowflake, very retroactively, when we’re doing investigations into how we can improve things with a heater or anything like that. I’m also not a data scientist; we have a whole team of data scientists who do this much better than I do. From my perspective, what we’re really doing here with Influx is providing the data for someone who is smarter than me in that field to analyze. How is communication secured? Just real quick: we use HTTPS on our backend, and we have some basic authentication that the tablet uses, but that’s certainly an area of improvement for us. That’s all I have to say about that. And that’s Jeff; I don’t know what you mean by retroactively or proactively. Do I store some event data on the edge as well? Yeah. Okay, so when I mentioned batching and offline machines earlier, one of the things that is great about the Bevi model is that a machine can be offline for years and it will just store event data locally in a file and keep logging to that file. In the normal operation of an online Bevi, those events get sent asynchronously every hour or so. But when a machine’s offline, that data is still being logged, and then it gets sent in what we call backloading.
Spencer Gagnon: 00:49:59.093 So we call InfluxDB the source of truth from any reasonable accessibility perspective, but the real source of truth is the data that exists on the Bevi machines themselves. Because if there is an Influx outage, or if we theoretically lose some amount of data, we can rebuild the entire history of all Bevi machines from the local Bevi machine data, which is something that’s really cool. It means we haven’t lost a bunch of history if a machine is offline for a very long time: once it gets back online, it spends a little while backloading the old data, and then we know what was happening on that machine the whole time.
Caitlin Croft: 00:50:41.899 Awesome, Spencer. Great job. There were a ton of questions for you. I have a question for you. So just kind of curious because you’re collecting all this data from the machines of people, are they drinking cold water, sparkling water, ambient water, which flavors? Has there been any correlation between — especially if you guys have gone into different industries, is there any correlation between maybe regions preferring, let’s say, cucumber water over another one or maybe seasonally like winter versus summer, or just any interesting stats on that?
Spencer Gagnon: 00:51:22.890 I’m happy to take that, if helpful. There are some, although it’s almost surprisingly stereotypical. We see sweetened drinks being more popular in the South. One thing that’s exciting is that more and more we’re selling into factories and distribution centers and other places where people are physically working with their hands. And we’re seeing extreme popularity of electrolytes and vitamins in those locations, which is really cool; we’re kind of helping people stay energized and healthy. Broadly, the main trend we’ve seen over the past few years is that sweetened beverages in general, across most of the country, have become less popular. We find much more demand for unsweetened beverages.
Caitlin Croft: 00:52:21.573 That makes sense. Yeah, I was just kind of curious. Because I’m sure you guys are just collecting so much data on what’s actually being consumed. So you guys know what you need to send out to people.
Spencer Gagnon: 00:52:32.983 We are. And the direction we’re moving in this perspective is also to give users more and more control over their beverage and over time let them get precisely what they want for them. So we’re trying to keep the specific flavors relatively basic and as well as the enhancements but give users more and more ability to mix and match those to create exactly what they’re looking for in terms of like the strength of the flavor, the sweetness, the actual taste, all of that.
Caitlin Croft: 00:53:11.057 Someone did make the valid point: is the data skewed depending on what’s available in the machine? So I’m sure there is that component. But the people ordering the stuff must know, and I’m sure they get feedback from their teams about what they like and don’t like. Because I know, working in an office, people are always quick to give feedback on the snacks and beverages available.
Spencer Gagnon: 00:53:35.857 Yeah, the quantitative analysis is actually pretty tricky, since our portfolio has, at any given time, about 14 flavors and currently three enhancements. So we have 17 concentrate options in general, but depending on the machine model, a machine can hold a maximum of either four or eight. So in assessing popularity, it’s always a relative assessment versus what else is in the machine, which makes it a little tricky. One thing we look at a lot is basically: what share of the total flavor volume does a particular flavor get relative to the others? But it’s a surprisingly tricky analysis.
Caitlin Croft: 00:54:21.978 Sure. Well, thank you to the amazing Bevi team. This was really great. I’m really excited to see what you guys do with InfluxDB 3.0. We’ve been busy working away at the latest products. So I’m really excited to see what you guys do with it and also what the community does. So thank you, Sean and Spencer, for joining today. I know there were a ton of questions. So if anyone has any further questions that they would really love to ask the Bevi team, everyone should have my email, I’m happy to put you in contact with them. And once again, this has been recorded. It will be available tomorrow morning. And I really appreciate everyone joining today.
Spencer Gagnon: 00:55:08.089 Thank you, appreciate it.
Sean Grundy: 00:55:09.664 Yeah thanks so much, Caitlin.
Caitlin Croft: 00:55:10.980 Thanks, Spencer. Thanks, Sean. Bye.
Sean Grundy: 00:55:14.043 Bye.
[/et_pb_toggle]
Spencer Gagnon
Lead Software Engineer, Bevi
Spencer has been engineering scalable solutions to Bevi's IoT problems for the past three years, with a specific focus on cloud infrastructure. He has a passion for explicit code, type safety, and live music of all genres.