Building Multi-Layer Data Pipelines for Autonomous Industrial Robots
Session date: Nov 19, 2024 08:00am (Pacific Time)
Discover how modern robotics systems can leverage multi-layered data architectures to serve diverse stakeholder needs, from real-time machine control to business intelligence. Using Urban Machine’s autonomous lumber reclamation robot as a case study, this technical session will demonstrate how to architect a comprehensive data pipeline that processes everything from low-level sensor data to high-level business metrics.
The webinar explores practical implementations of ROS (Robot Operating System) data handling, real-time machine learning inference monitoring, and business KPI tracking through live demonstrations of production systems. We’ll dive into strategies for managing different data velocities—from millisecond-level computer vision metrics to monthly business performance indicators—while maintaining system reliability and data accessibility.
This session covers:
- Architecting multi-stakeholder data pipelines for industrial robotics
- Integrating ROS data streams with business intelligence platforms
- Real-time monitoring of machine learning and computer vision systems
- Building effective dashboards for technical and non-technical stakeholders
- Strategies for cloud replication and data availability
Watch the Webinar
Watch the webinar “Building Multi-Layer Data Pipelines for Autonomous Industrial Robots” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
Here is an unedited transcript of the webinar “Building Multi-Layer Data Pipelines for Autonomous Industrial Robots.” This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors. Speakers:
- Anais Dotis-Georgiou: Developer Advocate, InfluxData
- Alex Thiele: Co-Founder & Chief Software Architect, Urban Machine
ANAIS DOTIS-GEORGIOU: 00:06
Hello, everybody, and welcome. The webinar will begin shortly. I’m just going to give everybody a few minutes to kind of trickle in and get settled. And while we wait for everyone to do that, I’m also going to go over some general housekeeping. So, a recording of this webinar will be made available to you at the end of the webinar. And if you have any questions, I want to encourage you to ask them in the Q&A or the chat. Either one, I’ll be monitoring both so we can make sure to answer all your questions. My name is Anais Dotis-Georgiou. I’m a developer advocate here at InfluxData, and I will be hosting this webinar for Alex Thiele, who I’ll introduce in a second. I also wanted to encourage you to ask any questions that you might have on the InfluxData Slack or the InfluxData Community Forums at community.influxdata.com. Ask any questions there that either you think of later or that might be about something related. If you don’t get a chance to ask them here, I highly encourage you to seek that out as an additional resource.
ANAIS DOTIS-GEORGIOU: 01:12
So today we will be talking about building multi-layer data pipelines for autonomous industrial robots. And this webinar will explore practical implementations of ROS (Robot Operating System) data handling, real-time machine learning inference monitoring, and business KPI tracking through live demonstrations of production systems. So, Alex Thiele is our speaker today, and he is the co-founder and chief software architect at Urban Machine. And he leads the development of groundbreaking robots that convert construction waste into high-quality reclaimed lumber. In fact, he was just telling me today that a lot of that lumber is from old-growth forests, which is cool because it gives us the opportunity to preserve those forests and preserve that wood, which is extremely valuable. So that’s really cool to learn about. And without further ado, I’ll hand it over to Alex Thiele to get going. So, thank you. You’re on mute.
ALEX THIELE: 02:15
Gotcha. Thank you so much, Anais. I really appreciate it and the opportunity to share the story and talk about our machine is always awesome. I’ll jump into InfluxDB in a bit, but I figured I’d give everybody the elevator pitch as to what our machine is. I think it’s an interesting idea. And any time that we talk about it, we get lots of questions about not just the tech, but people seem really interested in the wood or lumber market and how it works and why Urban Machine could even exist. I come from a software background. So, when I started working on this project, I knew next to nothing about lumber. Now I know more than I want. So, jumping into this— oh, actually, it looks like I’m on the wrong account for this. Let me just hop onto this. Perfect. And everybody can see the slideshow?
ANAIS DOTIS-GEORGIOU: 03:07
Yeah. It looks great. Thank you.
ALEX THIELE: 03:09
Awesome. I just got a notification that some of the videos weren’t visible. Perfect. So yeah, jumping onto this. Urban Machine is solving the lumber waste problem. And it’s a massive problem. We’re at home throwing our trash into carefully ordered bins. And at the same time, construction is just tearing down entire buildings and throwing them directly into the landfill. That’s not entirely true, but on the wood side, it basically is. Most of the lumber from large-scale demolition is just getting torn to shreds, thrown into an industrial-sized grinder. They use a big magnet to suck all the metal out. And then usually, that ground-up wood will go to the landfill as what’s called alternative daily cover. So, they just throw it on top of the landfill as a layer at the end of the day to keep the smell down.
ALEX THIELE: 04:03
It’s not a great use of resources. Lumber is super valuable. People love wood and love building with it. Architects love using it. So, we saw all that waste and thought, “Why is it not getting reused?” So, we want to do this mostly from the environmental perspective. That’s where my passion drives from. Once you grind up the wood, you’re basically turning it into methane a lot faster. So, there’s this huge environmental incentive not to do that. But I’ll jump into the economic incentives too because that’s kind of the main reason we’re here.
ALEX THIELE: 04:43
So, the goal here is basically, can we turn buildings into LEGO bricks, right? You have all these stick frame houses and the wood in them is— the houses aren’t getting torn down because the wood is old. They’re getting torn down because they want to build something new. There’s better use for that land. There’s something else wrong. Only occasionally is there actually an issue with the lumber. So, this is a very stable product. Think about when you buy lumber right now from Home Depot, it’s going to be this large squiggly line. When you get wood out of a house, it’s already dry. It’s been dried and straightened for many decades now. So that product isn’t going to change anymore. So, when you take stuff out of a house, you’re buying more stable, guaranteed, proven product. There’s a lot of caveats to everything I’m saying here, but that’s generally the trend that we found.
ALEX THIELE: 05:43
So, the goal, local lumber. Even if you don’t have forests, if you’re in Austin, Texas, you could have an Urban Machine shop that is selling lumber that’s locally sourced, saving on all kinds of transport costs and giving us this huge economic margin where we don’t have to pay the same transport costs as virgin lumber because we’re sourcing all of ours from demolitions in the area.
ALEX THIELE: 06:11
So, I’m here for the tech side of things, and I’m going to spend a lot of time today talking about what we do. So, let’s just start off with a little bit about Urban Machine’s current deployment. This is at All Bay Mill & Lumber in Napa Valley. We set up a bunch of tents. We’ve done several deployments there. We’ve built six iterations of our robotic system. Initially, it was just one big robot, and then we split it up into many. And the goal here is to take out the metal. So, I kind of skipped over the part where we grind up the wood and you throw it into the landfill.
ALEX THIELE: 06:55
But why are they grinding up the wood? Why can’t they just take this lumber and sell it back to the lumber marketplace? Well, the answer ends up just being the metal in the wood. Any metal in wood is absolutely terrifying for lumber mills. When we went down touring in Oregon, doing research on this project, all the lumber mills said that they were terrified of wood that comes from hunting grounds because they might hit bullets. And one guy running a lumber mill there, he had a wall of shame of pieces of wood that broke one of his machines. One of them was a piece of wood that had a power line going through it. So, I guess the tree grew around the power line. All of this is just to emphasize how terrifying the concept of reclaimed lumber might be if you’re a lumber mill.
ALEX THIELE: 07:49
And lumber needs to be reworked. If you take it out of the house, you’re going to need to plane it down. You’re going to need to cut it into shape. So reclaimed lumber today, it’s not that big of a market. People will take out the metal by hand, but you have to take out the metal. So that’s the core of what we’re doing here: take out metal from wood so that it can be reprocessed and resold as high-value usable material. Awesome. So, we’ve been taking steps to make that reclaimed lumber even higher value. So, we got FSC certified. That just means that we’re following all the right rules to guarantee it’s green lumber.
ALEX THIELE: 08:42
So, I want to jump into— what is happening here? Yeah. Well, this video is not too important because I’m going to do a live demo of this later. But I’m hoping I have a very technical audience here today because I want to show off how we built all the software stack that we’ve built. Because we’ve built six different iterations of robots, we’ve had to build a very modular, powerful stack that can handle lots of different robotics cases. And doing that is hard. And it requires a lot of decision-making. So, I wanted to walk through some of those decisions and why we made them.
ALEX THIELE: 09:27
So, at the very core of our stack is ROS. And if you’re not a roboticist, that’s fine. Basically, ROS is a message broker that has a robotics community around it. So, think RabbitMQ, or Kafka, or Protocol Buffers. It’s got a message generation system, but it just happens to be more in the robotics space and have more real-time related features. And we like it. We’ve built a lot of tooling around it to make it even better for our use case. And we chose Rviz for visualization. That’s part of the ROS ecosystem. It’s a nice 3D viewer. I’ll be showing off a demo of our entire robot stack running locally here. So, I look forward to that. And that’ll include Rviz as the kind of beautiful visualization. Firmware is C/C++. And then all our AI stack is powered by PyTorch models that we’ve created or trained in some way, shape, or form. And as far as training data, we use Hasty AI. I highly recommend them for labeling large data sets. Our robots must be able to detect small, tiny nails in the wood, and we use computer vision for that. And that requires gathering, and training, and labeling lots of data.
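For readers who haven’t used ROS, the topic-based publish/subscribe model it is built on can be sketched in a few lines of plain Python. This is only a toy illustration of the messaging pattern, not ROS itself (real nodes would use rclpy or roscpp) and not Urban Machine’s code:

```python
# Toy sketch of the topic-based publish/subscribe model ROS is built on.
# Not ROS itself (a real node would use rclpy); just the core idea.
from collections import defaultdict

class Broker:
    def __init__(self):
        self._subs = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, msg):
        # Deliver the message to every subscriber of this topic
        for cb in self._subs[topic]:
            cb(msg)

broker = Broker()
received = []
broker.subscribe("/arm/target_pose", received.append)
broker.publish("/arm/target_pose", {"x": 0.42, "y": 0.10, "z": 1.30})
# received now holds the one pose message
```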
ALEX THIELE: 11:00
Okay. So, I skipped over the most important part, which is our database. We chose InfluxDB to be our metrics and ingestion system for almost everything. And some background on why. About four years ago— I mean, about eight years ago, I co-founded aotu.ai, and we were working on building computer vision systems that could handle an arbitrary number of CCTV cameras. So, imagine you’re a system integrator that’s integrating, let’s say, 100 or 200 cameras into a supermarket, or a restaurant, or an oil rig, or a city with parking spots, you might want to run AI on those cameras. And basically, we were building BrainFrame, which was a system for deploying any kind of AI across many cameras and scaling that so that you can turn cameras into APIs, basically.
ALEX THIELE: 12:08
And that system, we used Postgres as our database. And we ended up finding that it just couldn’t handle the volume of reads. And then the queries that we were writing weren’t running nearly as performantly as we wanted, no matter how smart we got about it. Although, to be fair with databases, there’s always something you can do. And during that time, we were working with Intel, and Intel showed us this system called EIS. It was like Edge Insights Industrial or something. But it was basically Intel built their own little ROS stack for distributed computing and computer vision. And we noticed that it used InfluxDB. And so, we started looking into that. We were like, “That’s cool.”
ALEX THIELE: 12:56
And then we never could migrate our old database to InfluxDB, but I remembered it as something to note: if I have high-throughput data, maybe I should use InfluxDB on the next project that I do. So, starting Urban Machine, it was almost a no-brainer. We looked into a few other time series databases. We liked InfluxDB the most. One key thing that kind of won us over was that InfluxDB handles replicating writes to the cloud. And we knew from day zero that this robot would be deployed in areas with crappy or no internet for months at a time. And then it would have periods of internet where we would want to sync all that data to the cloud. And we couldn’t find that feature, at least in such an easily set up way, on any other time series database. So, we just decided, “Okay, that’s it. We’ll use InfluxDB. It seems to be robotics adjacent. It supports replicated writes. It’s designed for time series queries, which we’re going to be doing a lot of. And it’s high throughput.” So that’s kind of why we came into this.
ALEX THIELE: 14:09
So, talking about just building robots, we knew from day zero we’re going to make a lot of these. Our engineering team decided, “Let’s move fast, break things. Let’s build lots of robots quickly. Throw them away. Build more robots.” So, we needed a very modular system. It needed to be robust, and it needed to not be specific to any one type of hardware. We’re human-constrained. Every minute of debugging is almost wasted time. And if you’ve ever worked on a complicated system, debugging can take a lot more than a minute. It could take you a whole day or several days if you’re working on something that involves multiple layers of the stack. So having great logging, having good metrics is super important. And that’s just when you’re developing.
ALEX THIELE: 15:03
Once you actually deploy to the field and something weird happens and then it doesn’t happen again for the next few days, and then it happens again, you really want to have as much information about before, during, and after the anomaly so that you can start tracking down so that every time this rare occurrence happens, you have as much data as possible to give you hints as to what’s happening. So, we knew from day zero that we would want to have decent metrics, and we would be changing those metrics all the time. And so, suffice to say, it needs to be a schema-free database. It has to be time series. And we love Grafana. If you’re not using Grafana with InfluxDB, I highly recommend it. I know this isn’t the Grafana talk, but I could do another webinar for Grafana and spout all kinds of good things about it. I’ll be showing off dashboards today, and all of them are InfluxDB talking to Grafana and rendering those dashboards. So, I love that it’s compatible and it works so nicely.
ALEX THIELE: 16:09
So, here’s how we’re actually doing this. Here’s the communication flow across our stack. Everything is containerized, including our ROS stack. So just imagine all these boxes being containers. We start out at the ROS nodes. So, this is all the logic for, “Hey, robot arm, I need you to move to this X, Y, Z location. I need you to grab this nail.” I’m running computer vision, doing all this stuff. So, ROS nodes are dumping tons of runtime metrics towards InfluxDB. I mean a lot. So, it goes from ROS to InfluxDB, and then InfluxDB holds that and sends that off to Grafana when we want to query, or look at a dashboard, or understand what was happening. We also collect all the logs from Docker directly using Promtail, feed that to Loki. So, Loki is, think like InfluxDB, but for logs. It’s just a database for holding logs.
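As an illustration of the ROS-to-InfluxDB hop in that flow, each runtime metric ultimately lands in InfluxDB as a line-protocol record (measurement, tags, fields, timestamp). The sketch below hand-rolls that serialization; the measurement and tag names are invented, and in practice the client library builds these lines for you:

```python
# Hypothetical serialization of one robot metric into InfluxDB line protocol:
#   measurement,tag1=v1,tag2=v2 field1=v1 timestamp_ns
# Measurement/tag names are invented; real clients build these lines for you.
def to_line_protocol(measurement, tags, fields, ts_ns):
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = to_line_protocol(
    "arm_motion",                      # measurement
    {"robot": "picker", "axis": "x"},  # tags (indexed metadata)
    {"velocity": 0.25},                # fields (the actual values)
    1700000000000000000,               # nanosecond timestamp
)
```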
ALEX THIELE: 17:11
And these two together are very powerful because in Grafana, you can basically— let’s say you have a graph of anomalies over time. You could select a time range on that graph and then have your logs right underneath it also be time constrained. So, you can jump around at different time periods and see text logs changing. So, I highly recommend using InfluxDB alongside Loki for that experience.
ALEX THIELE: 17:42
So, let’s actually show you what this looks like. We’re always happy to show off our IP. It’s all patented and I want to hear people’s feedback. So, this is going to be kind of an intro to the whole robot stack. I’m going to start out with showing you what our dashboards look like. So, this is here in Napa Valley, an iPad on one of the machines. And I’m just showing that the operators can jump in and look at analytics. And it’s all nice and clickable and always up to date with the latest information for that day. So just to kind of ground things in reality, this is what all that tech ends up looking like as far as data. That data from all these deployments, all these ROS nodes, right, are basically— just to zoom in a little bit. Sorry, my cats are fighting.
ALEX THIELE: 18:39
Just to zoom in a little bit, the way we implemented this in ROS is we have different ROS nodes that are just publishing ROS messages. And a telemetry node that we created ourselves is basically just calling the InfluxDB Python library and saying, “Hey, write, write, write.” And that sends it over to the InfluxDB server. I will note, if you’re doing this, make sure to set a large batch size on the library, because what ends up happening is, if every call is its own write, you might run into an issue later when you do replication, because replication will then replicate every write to the cloud one by one, and it can take forever. So just make sure that from day zero, you’re writing things with a batch, so it only ever writes to the server after, let’s say, 1,000 metrics have been collected.
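The batching advice here can be illustrated with a toy buffer. The official influxdb-client Python library already does this when you configure its write API with WriteOptions(batch_size=...); the class below is not that library, it just makes the behavior visible:

```python
# Toy version of client-side batching: points accumulate in memory and only
# hit the server once a full batch is ready. The real influxdb-client library
# does this via write_api(write_options=WriteOptions(batch_size=1000)).
class BatchingWriter:
    def __init__(self, send, batch_size=1000):
        self._send = send            # callable performing the actual write
        self._batch_size = batch_size
        self._buffer = []

    def write(self, point):
        self._buffer.append(point)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        # Ship whatever is buffered as a single write
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()

sent_batches = []
writer = BatchingWriter(sent_batches.append, batch_size=3)
for i in range(7):
    writer.write(f"nail_picks count={i}")
writer.flush()  # don't forget the partial batch at shutdown
# The server saw 3 writes (of 3, 3, and 1 points) instead of 7
```

Because replication later replays each write, batching up front also means the sync to the cloud ships batches rather than single points, which is exactly the problem being warned about.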
ALEX THIELE: 19:36
This is how our deployment looks if you zoom all the way out. If we had more than one deployment, this would be trivial to implement. We already have replication working. Basically, each unique location has its own local InfluxDB instance running on the edge, and we replicate all our writes to InfluxDB Cloud. And then we have Grafana Cloud for viewing them. So, this is also a great setup. It’s worked out great for us. We don’t have to host anything, and we just pay a few subscription fees and get great data with great reliability.
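The store-and-forward behavior described here, writes landing locally and syncing to the cloud whenever a link is available, can be sketched as follows. InfluxDB’s built-in replication (set up with the influx replication create CLI command) handles the real, durable version of this; the toy class below only illustrates the idea:

```python
# Toy store-and-forward model of edge-to-cloud replication: writes always
# land in a local queue first and drain to the cloud, in order, once a link
# exists. InfluxDB's built-in replication does the real, durable version.
from collections import deque

class EdgeReplicator:
    def __init__(self, push_to_cloud):
        self._push = push_to_cloud
        self._queue = deque()   # stand-in for the durable local queue
        self.online = False

    def write(self, batch):
        self._queue.append(batch)  # the local write always succeeds
        if self.online:
            self.drain()

    def drain(self):
        # Replay queued batches oldest-first once connectivity returns
        while self._queue:
            self._push(self._queue.popleft())

cloud = []
rep = EdgeReplicator(cloud.append)
rep.write("batch-1")   # no internet yet: queued locally
rep.write("batch-2")
rep.online = True
rep.drain()            # link restored: everything syncs in order
```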
ALEX THIELE: 20:12
Okay. I talked a big game about all these robots but let me show you. These are the iterations that actually made it to the field. So, every robot here has cleaned lumber in some way, shape, or form. Awesome Ash was the first robot we ever built. It took us one month. I remember the CEO wearing overalls welding away for a whole day after we’d gotten all the steel frames. This robot started out as a napkin sketch, and we basically wanted to explore the problem space as quickly as possible. There were these shuttles for transporting the wood around. We had this robotic gantry which was supposed to grab onto screws and turn them around. That was all mistakes, but we learned very quickly that we shouldn’t do things that way.
ALEX THIELE: 21:04
We went with 80/20 for our next build. There are still these shuttles here that can transport wood, but they’re much smarter. Each shuttle had like seven degrees of freedom. They could clamp different wood dimensions very tightly and move them around with millimeter-level precision. We had a robotic gantry over there. We ended up doing two sides to learn how to pick nails from two sides. All our control panels were built onto plywood, so we were still in the fast iteration, don’t-worry-about-reliability stage of things.
ALEX THIELE: 21:40
Then about six months later, we created Charming Cypress. So, if you’re not catching on, each version is alphabetically ordered with an adjective and then a tree name. So Charming Cypress was our first real deployment. It went to another location to process wood. We were really excited about this. This was the first tent. That tent was too weak, and it blew away in the wind after about six or seven months. But you live and you learn. And this one was more reliable. It was definitely getting to the point where we’d learned everything we needed to know about how to move wood around. But we hadn’t quite learned how to build a system that doesn’t fail once every few hours, one that only fails once every few days. And so that was great. We got a lot of data from that, and that let us make a few key decisions for the current robot iteration.
ALEX THIELE: 22:41
And before I jump into that, I’m just going to point out that this iteration, basically you have an entry cell, an entry conveyor, a picking machine, and then a metal detector. For the final iteration, we made some huge sweeping design changes, and we decided to make three robots instead of one. So, this is the final iteration. We built one machine to cut the nails, another machine to cook the nails, and I’ll tell you what that means, but it’s not what you think. And then a final machine to grab the nails. So, we decided that our process needed to be even more complicated. But in order to achieve the throughput that we wanted and be able to pull out screws and nails as reliably as we wanted, we needed to add a few more parts to the process.
ALEX THIELE: 23:34
So, I’m going to show you each of these machines. This first one is the cutter. And basically, a conveyor feeds the wood into the system. And these two bandsaws move in and close around the wood, and they chop off all the nails that are sticking out of the sides of the wood. So that basically we wanted to transform the wood from this pokey porcupine state to a flat plane. Because once something is a flat plane, you can use other— basically, we wanted to get rid of 3D cameras. We wanted to make the wood easier to move. And this achieved both of those goals very, very easily. So, you pass this through this machine, and you get what we consider normalized wood. You might be wondering, “Oh, aren’t you cutting the heads off of the nails? How are you going to grab them?” Our system does not rely on the head being there for us to be able to pull it out. We can grip on the shaft of the nail with so much force that it doesn’t slip. So that’s the cutter. You can see how the nails end up being just shaved right off.
ALEX THIELE: 24:54
So, the next part, the cooker. That first prototype I showed you of Awesome Ash was this entire robot gantry that could grab onto a nail and twist it counterclockwise to be able to pull it out. We really thought that was going to work. Looking back on it, that idea, you really can’t turn a nail counterclockwise when it’s been through a landfill. It’s been thrown into a truck. Those nails are sheared. They’re bent. You don’t even know what angle it is inside of the wood without an X-ray. And don’t buy X-rays. They’re a safety nightmare. We made that mistake also.
ALEX THIELE: 25:33
So, we realized we needed a better way. We needed to find some way to turn screws into nails, because we were good at pulling nails. We were already able to find nails with millimeter precision, grab them, and pull them out. So, what this system does is it drives high-power induction coils right on top of the entry point of the nail, zaps it for four seconds, and that burns all the wood one millimeter around the metal. So basically, we’re turning the metal super hot, about 700 degrees Fahrenheit, in rapid succession, flash heating it and weakening the wood fibers around it. And when you do that to screws, you’re basically burning all the wood that holds onto the threads. And that reduces the force required to remove screws from 500 to 800 pounds down to about 20. It’s this crazy step change.
ALEX THIELE: 26:36
And for screws, a lot of the times, it doesn’t even matter how much force you put into it. Usually, the screw will just snap off, leaving metal inside of the wood, and that’s a non-option. So, you can see here how the nails come out. They come out just a little bit blackened and burnt. A little ashy. And at that point, you know it’s going to be really easy to pull. So, this second process was an innovation from us that we’re really proud of, and it made the whole system much more likely to succeed. And it turns screws into nails, and it turns nails into easy-to-pull nails. So overall, very happy to have it.
ALEX THIELE: 27:21
And last but not least is the picker. So, this is the machine we had the most experience building. And even still, for this final iteration, we made it a lot more complicated. This is the clamping mechanism. It can support any width of wood. Oh, sorry, I didn’t go to the next one. There we go. So this is the clamping mechanism, and it’s a little bit more complicated than the other systems. There are springs. And we basically need to have lots of control authority over the wood, but also fixture it so that you can manage all these large forces as you pull against the wood. And you can see that robot gantry is going in, pushing into the wood. There’s this beak that pushes a few millimeters into the wood, grips, and just pulls with all its might.
ALEX THIELE: 28:14
We’re using cameras on the in-feed and on the gantries to do machine learning on detecting all the nails and guessing what kind of nail they are, where they are, how we should pull them, and just constantly streaming that information back to ROS to make the best decisions. We’ll grab. We’ll take a look. We’ll see if we grabbed it. We’ll see if we went deep enough, see if we need to put more torque onto the next pick. And yeah, that’s the picker.
ALEX THIELE: 28:42
So, I think there’s a little time-lapse here just to give you an idea. But at the end of the day, what you end up with is these little, tiny bite marks and no metal. So, I think this one came out metal-free. People wonder, do the bite marks matter? Is that a problem for customers? Well, you’re forgetting that most of this wood is going to be planed down about one millimeter or two anyways. So, once you plane it, it looks clean. And for most structural applications, it doesn’t really matter. Also, for nails that are on the adjacent side, you can just run the piece of wood on its side and pick all the nails there as well.
ALEX THIELE: 29:24
And at the end, we have this metal detector. So, this metal detector is super, super sensitive. I think on the spec sheet, it said that it could be used for detecting iron filings in cereal boxes, like the nutritional iron. So, it’s very sensitive. Anything that passes through this can basically be considered 100% metal-free, and lumber mills don’t have to be afraid of machine-stamped lumber. It’s safe to run through their machines. They’re not going to break anything. No one’s going to get hurt. So that’s the process. It took a while to get here, but we’re happy with it.
ALEX THIELE: 30:01
And what do we do with all this information? We have three robots. They’re generating all kinds of data. We have several years’ worth of robot development. So now I’m going to shift over and talk about InfluxDB and how awesome it is and how much it’s helped us build cool, interesting dashboards that have impressed the operators, and investors, and folks around us. So, the very first thing is this is what the operators see. So, the people feeding the machines day in and day out.
ALEX THIELE: 30:34
Basically, during the next few slides, I’m just going to show you different dashboards and talk about who is looking at them, why they care about that data, how we’re capturing that data, and how we’re displaying it. So, this first screenshot is the iPad view. So, they’re constantly interfacing with this. And down here is the Influx portion, that working time and availability. That’s sourced from InfluxDB. And what it’s measuring is: how many minutes has the machine been active today? What percent different is that from yesterday? And what percent of the day has the machine been available for use? So basically, you want big numbers on both sides, and it’s just sort of meant to be there to help operators know if they’re on track with yesterday: better or worse, significantly, or about the same. We added this when we realized that our machine availability was dismally low, and we needed to work on the ground processes to make sure that wood is always being fed into the machine.
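The working-time and availability tiles described here boil down to simple arithmetic over machine-state samples. The sketch below uses invented inputs (per-minute True/False activity samples), not the actual dashboard query:

```python
# Rough sketch of the working-time / availability tiles, over hypothetical
# per-minute activity samples (True = machine actively running that minute).
def availability_stats(today, yesterday):
    active_today = sum(today)       # minutes active so far today
    active_yday = sum(yesterday)    # minutes active yesterday
    pct_of_day = 100.0 * active_today / len(today) if today else 0.0
    vs_yesterday = (100.0 * (active_today - active_yday) / active_yday
                    if active_yday else 0.0)
    return active_today, pct_of_day, vs_yesterday

today = [True] * 300 + [False] * 180      # 300 of 480 shift minutes active
yesterday = [True] * 240 + [False] * 240  # 240 of 480 yesterday
minutes, availability, delta = availability_stats(today, yesterday)
# 300 active minutes, 62.5% availability, +25% over yesterday
```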
ALEX THIELE: 31:41
Awesome. So, the next use case is kind of one level up on the type of information that they want to see. But imagine a field manager, someone that’s there on the floor, looking at the processes, looking at what lumber truck just dumped there, what kind of wood there is, looking at what’s coming out of the outfeed of the robot, making decisions about sales at some points, lumber sales, and doing inventory. So, these folks, they care that the robot is working, and we have this nice dashboard here to, at a glance, be able to take a look at that. Again, all of this is InfluxDB data being fed from ROS to InfluxDB to Grafana to our React-based dashboard.
ALEX THIELE: 32:27
So, at a glance, they can see: what percentage of the shift has the robot actually been running? How many faults are we having? Is it a better or worse day? Should I be contacting an engineer and saying, “We’re having a lot of issues,” or is this fine? It looks like not that many, but still some. And then, what are our current economics? So, how much would we have to sell today’s load of lumber for to make a buck on it? Keep in mind that the lumber that we get is this crazy long tail of variety. So, you have loads from landfills that will have almost no nails, and it’s pure profit. You just run it through the machine, validate it through the metal detector, and you’re good. And then you have other loads that have lots of nails and a small cross-section, or lots of nails but beautiful old growth. So, there’s a lot of things that this data doesn’t capture, but it does give you a quick look and an idea as to what’s going on.
ALEX THIELE: 33:31
And this fasteners per meter metric is what we look at pretty often to just get an idea of like, “Are we doing difficult wood today, or should we just be cranking through this stuff?” So, if you have low fasteners per meter, but your throughput is also low, that means that probably there’s some other issue, like the machines aren’t being fed enough. So, this data is super helpful. And being able to prototype with Grafana to get different data to people’s eyes is really powerful. And InfluxDB lets you write pretty complicated queries that reconfigure that data into something that can be useful. So, if you just save as much as possible, you can later take that and mold it into something that’s useful for your business case. That’s at least what I’ve found.
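The rule of thumb described here, low fasteners per meter combined with low throughput pointing to a feed problem rather than difficult wood, could be encoded roughly like this. The thresholds, units, and function name are all invented for illustration:

```python
# Invented encoding of the rule of thumb above: low fasteners/meter plus low
# throughput suggests a feed problem, not difficult wood. Thresholds made up.
def feed_diagnosis(fasteners_pulled, meters_processed, meters_per_hour,
                   easy_wood_fpm=2.0, expected_mph=50.0):
    fpm = fasteners_pulled / meters_processed if meters_processed else 0.0
    if fpm < easy_wood_fpm and meters_per_hour < expected_mph:
        return fpm, "easy wood but slow: check the infeed process"
    if fpm >= easy_wood_fpm:
        return fpm, "difficult wood: lower throughput is expected"
    return fpm, "running normally"

fpm, verdict = feed_diagnosis(90, 60.0, 30.0)
# fpm = 1.5, and the verdict points at the infeed
```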
ALEX THIELE: 34:20
And then kind of another layer up is what CEOs and investors might care about. So, this is our trends dashboard. It’s very simple, but it captures long periods of time worth of data as opposed to the other dashboards I’ve been showing off, which are like, “I want to know the last day worth of data.” So, this is kind of like how much wood have we moved for the last 30 days? How dense has that wood been? How active has the machine been? What’s the uptime? That kind of thing.
ALEX THIELE: 34:50
And then getting into kind of the nitty-gritty, I think this is the most exciting stuff. Field engineers, people who are debugging an issue right now. So, this is a dashboard that we built that shows pretty nitty-gritty data. So, this is showing the states of every ROS action happening at any point in time throughout the whole robot over a selected period. And ROS actions, they’re kind of like something that the robot’s doing. So it could be, for example, you can see— sorry, I’ve got cats here just running around. Okay. You could see, for example, one of these actions is toggle beak. So that is referring to the grasping, the actual closing of the vice grip that grabs the nail. And you can kind of walk over here and see that it happened at 10:39 and 55 seconds. Somewhere around there, we toggled the beak. Okay. Cool. That’s useful, I guess. You can see that we turned on the vision stream. This yellow bit shows that there were two concurrent ROS actions, so it was happening more than once at the same time.
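The “two concurrent ROS actions” band on a timeline like this can be computed with a standard sweep-line pass over the action intervals. This is a generic sketch of that technique, not the dashboard’s actual query.

```python
from typing import List, Tuple

def max_concurrency(intervals: List[Tuple[float, float]]) -> int:
    """Sweep-line count of how many actions overlap at the busiest moment.
    Each interval is (start_ts, end_ts) in seconds."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # action starts
        events.append((end, -1))    # action ends
    # Sort ends before starts at the same timestamp, so intervals that
    # merely touch are not counted as concurrent.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```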
ALEX THIELE: 36:01
This is really useful if there’s an anomaly. I can’t tell you how many times it’s like, “Okay, we had a crash,” or something bad happened at exactly this timestamp. And then I zoom in with this and I can see all the order of events that led up to that and usually get a pretty good idea. It’s like, “Oh, I see. The robot was moving along its X-axis.” And then we had a crash that said there was an X fault. And then I can jump into the logs and confirm that it was some high-speed movement that crashed into the wall of the robot. I don’t know. It could be anything.
ALEX THIELE: 36:35
And then for each of these actions, you have these execution times. So, you have a histogram of how long each of these actions takes on average. Imagine there was an action for just picking a nail: you could see a histogram of how long it takes on average to go grab a nail and pull it. Because that’s going to vary; some nails are harder, some nails are easier. So, we have one of these dropdowns for every single action. You can dig as deep as you want. The idea is this is meant more for field engineers. They’re debugging real-world problems that just happened, and they want to get to a solution quickly. All that data helps with that.
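Bucketing execution times into a histogram plus a couple of summary statistics is straightforward. A minimal sketch, with invented bucket widths (the real panels presumably let Grafana do this binning):

```python
import statistics
from collections import Counter

def duration_histogram(durations_ms, bucket_ms=50):
    """Bucket action execution times into fixed-width bins, keyed by the
    bin's lower edge in milliseconds."""
    buckets = Counter(int(d // bucket_ms) * bucket_ms for d in durations_ms)
    return dict(sorted(buckets.items()))

def summarize(durations_ms):
    """Median and an approximate 95th percentile for a dropdown summary."""
    ordered = sorted(durations_ms)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[int(0.95 * (len(ordered) - 1))],
    }
```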
ALEX THIELE: 37:23
You get one level more nitty-gritty and you have software engineers. They have different needs. So, they’re not debugging an issue that just happened. They are basically trying to optimize something that they’re running on their own machine, changing something, going back, restarting, running again. And so, this is our timing dashboard, and it’s, I think, one of the coolest dashboards that we’ve built. The way to read this is, let’s say you’re working on a computer vision pipeline, and it has lots of steps along it, and speed is very important. And also, lag can happen at different parts of that pipeline. Let’s say you have a step for grabbing a frame from a camera, putting that frame on the GPU, running inference on that frame, taking those results, and turning them into useful grasp point data on each fastener. So, each of those can take different amounts of time. And depending on how hot the day is, it might start taking more time on average. Or let’s say you had a lot of nails in one picture. Let’s say that final part of the pipeline takes a long time. So, you really want to be able to detect those lag spikes and see if you can work around them and develop better solutions around that.
ALEX THIELE: 38:42
So, the way to read this is, when you put one of these timers in your code, it’s easy to add, and it just times that section of code and sends that data off to InfluxDB. And when you create one, you tell it ahead of time, “Here’s the spec. I want this part of this process to run in under 130 milliseconds each time.” And when you graph that in InfluxDB, that’s what this dashed green line is. It’s saying, “Hey, your timing is good.” These blue lines are all the times that the process you wanted to measure ran, and it ran faster than the spec you said it needed to run at. So, you’re good. So, in Grafana, I basically highlighted in green anything under this line going up to the blue line.
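A spec-aware timer like the one described can be sketched as a small context manager. This is a hypothetical reconstruction, not Urban Machine’s actual library; the `sink` list stands in for whatever client would ship the point to InfluxDB.

```python
import time
from contextlib import contextmanager

class SpecTimer:
    """Times a section of code against a declared spec (max milliseconds)
    and records each run, including whether it met the spec."""
    def __init__(self, name: str, spec_ms: float, sink: list):
        self.name = name
        self.spec_ms = spec_ms
        self.sink = sink  # stand-in for an InfluxDB write client

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.sink.append({
                "measurement": "timing",
                "tags": {"timer": self.name},
                "fields": {
                    "elapsed_ms": elapsed_ms,
                    "spec_ms": self.spec_ms,
                    "in_spec": elapsed_ms <= self.spec_ms,
                },
            })
```

Usage would look like `with timer.measure(): run_inference(frame)`; the dashed green line on the dashboard is just each point’s `spec_ms` value graphed alongside `elapsed_ms`.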
ALEX THIELE: 39:29
So, here’s an example of— this is part of our process where we take an image and project it into 3D. So, it can be a little slow, and there’s some networking involved there to get information about the robot to know where the wood is. And that can be slow. It’s really important that it’s not. So, you can see here, there’s the solid blue line. That’s the median time of how long something took. And that’s kind of the key: this dashboard is meant to put lots of data in front of your eyes in a useful manner. So, the blue line is showing median times. Out of the last 100 frames, this one dot represents the median time, and the blue shading shows the full range. Like, “Oh, the slowest call took longer than your spec. So, you’re actually out of spec in these different areas.” And then here’s a really bad out-of-spec area where every median call was out of spec. It’s red. So, you’re having issues here. And in this case, it looks like our full vision pipeline was running too slowly. And that’s on my laptop. So, I’ll be showing a demo there today, and you’ll get to see how slow it runs.
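The median-plus-range band over the last 100 frames can be maintained with a simple fixed-size window. A minimal sketch, assuming the aggregation happens client-side (on the real system it would more likely be an InfluxDB window aggregate):

```python
from collections import deque
import statistics

class RollingTimingStats:
    """Keeps the last `window` frame times and reports the median plus the
    min/max band that the dashboard shades around the median line."""
    def __init__(self, window: int = 100):
        self.samples = deque(maxlen=window)

    def add(self, elapsed_ms: float):
        self.samples.append(elapsed_ms)

    def point(self):
        """One dashboard dot: median of the window, with its full range."""
        return {
            "median_ms": statistics.median(self.samples),
            "min_ms": min(self.samples),
            "max_ms": max(self.samples),
        }
```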
ALEX THIELE: 40:42
So, with that in mind, here’s a query. I just wanted to dump this and show you sometimes you need to take data from InfluxDB, and it’s not really in the format that you need. So, you need to do a lot of reconfiguring to get the data you want, but it’s totally possible. This is data that takes all the wood planks that we ran throughout a day and turns it into information about how available the machine has been throughout that day.
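The query itself isn’t reproduced in this transcript, but the reshaping idea, turning per-plank records into a machine-availability figure, can be expressed in plain Python. This is a hypothetical equivalent, with an invented bin width:

```python
def availability(plank_ts: list, shift_start: float, shift_end: float,
                 bin_s: float = 600.0) -> float:
    """Fraction of fixed-width time bins in a shift that saw at least one
    plank. Timestamps are seconds; 10-minute bins are an assumption."""
    n_bins = int((shift_end - shift_start) // bin_s) or 1
    active = set()
    for ts in plank_ts:
        if shift_start <= ts < shift_end:
            active.add(int((ts - shift_start) // bin_s))
    return len(active) / n_bins
```

A Flux query would do the same thing with `window()` and a count-per-window, but the transformation from raw planks to an availability percentage is the interesting part.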
ALEX THIELE: 41:11
Okay. I’m going to show you real robot stuff happening in simulation. Also, I can’t wait to answer questions. Just start thinking of things you want to ask. We’re super open here at Urban Machine. We have no competitors, for better or worse, so we’ll share any interesting tech details as long as someone can learn from it because robotics is hard enough as it is. So, jumping into it, what I’m going to show you today is a 3D— sorry, I’m letting the cats out because they’re just freaking out. This is a simulation of our robot. So, our robots, all three of them, all the ones we’ve ever built, have a full digital twin. When they’re running in production, they show the real-world position of the robot. And when they’re running here on my laptop, they’re showing a fake position, and everything is completely mocked out. And so, you can see the nail gripping components here. This is what actually grabs the nail. You can see the rollers. And these green arrows are showing fake data of how far away the wood is. We’ve got these camera feeds that are currently turned off.
ALEX THIELE: 42:27
So, I’m going to show you— I’m going to basically run some fake wood through this and simulate the whole process. And then we’re going to look at the data that came out of that simulation, jump around those graphs, and talk about that. So, on my browser here, I’m clicking start. And basically, this is going to close in on the wood, and we’re going to see simulated wood images from a past scan. So, this is running and detecting wood, and it found some nails, and now you can see this robot going in and picking them. And every once in a while, it’ll update the position of the nail and try to see it. There’s a lot going on here, and I think I’m probably very jaded; I’m used to seeing this view, so my brain is parsing it all. But you just have to take my word for it that it’s picking fake nails. Each of these little arrows is a nail that the machine learning detection has picked up. This one’s marked as cut off; that’s how it was detected. Hits means how many times it saw it. You’ll also see this big gray rectangle. That’s the approximate width and height of the wood. You can see that it’s extending off into infinity. That’s because we haven’t seen the end yet. And there, we just saw the end, and it gets fed out of the machine.
ALEX THIELE: 43:42
And eventually, this one will disappear. And when it disappears, it gets published to InfluxDB. So, this piece of wood, all the robot motions that just happened, everything about that piece of wood just got published to InfluxDB. So why not show you? So, I’ll just jump onto InfluxDB, and we’ll get to see all the data from the piece of what I just ran. So, this is the dashboard for the fake robot here. And if I go into analytics, you’ll see down here at 8:44:08 AM, we had a piece of wood run through. So, I’m going to select this time range, and it’s going to update all the time ranges across the board here.
ALEX THIELE: 44:31
The first thing you’ll see here is that the simulated piece of wood we ran through was 1.8 meters. It didn’t trigger any fake metal detectors. That’s good. It ran at a throughput of about 1,000 meters of wood per eight-hour shift, which translates to about 2,100 board feet of wood. If we could sell simulated wood, we’d be making a decent profit. And it had 3.3 fasteners per meter. 5 of them were cut off. The nails were heavier on the left side than the right. So, it’s kind of interesting data. 5 fasteners seen, 12 pick attempts. You’ll notice that it attempted to pick more times because in our simulation, when we grab a fastener and pull it out, it doesn’t disappear from the camera view. So, the robot always thinks it missed it.
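The meters-to-board-feet conversion is the standard board-foot formula, which uses nominal lumber dimensions. A small sketch; assuming nominal 2x4 stock, 1,000 meters works out close to the roughly 2,100 board feet quoted (the actual cross-section of the wood being run isn’t stated).

```python
FEET_PER_METER = 3.28084

def board_feet(length_m: float, nominal_width_in: float,
               nominal_thickness_in: float) -> float:
    """Standard board-foot formula:
    board feet = (thickness_in * width_in * length_ft) / 12."""
    length_ft = length_m * FEET_PER_METER
    return nominal_thickness_in * nominal_width_in * length_ft / 12.0
```

For example, `board_feet(1000, 4, 2)` gives roughly 2,187 board feet for 1,000 meters of nominal 2x4.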
ALEX THIELE: 45:26
And if I go back here and I say, let’s view the data for today, you’ll see that I was running— I was already running some wood earlier today. So, each of these dots is a plank of wood that I ran through the simulation. So, this is kind of what it looks like when you’ve run a few pieces of wood. You’ve got a little bit more data. You’ll see I managed to get a fault to happen. And then, of course, if you had more robots, for example, the cooker, it would show up in this dropdown. And you could take a look at data from that machine.
ALEX THIELE: 46:05
And at this point, I thought I’d actually show you real-world data from our production machine today. So, this is data from— here, let’s see if they’ve been running it today. We’ll analyze performance. Awesome. So, this is today. They’ve run 24 planks at 50 meters length total. It looks like it’s low-ish fastener density. We’re doing all right. Lots of cut-off nails, lots of nail heads. Okay. We’ve got some staples. Sometimes you’ll go for weeks without seeing any staples. Nothing too crazy here. Looks like they were operating in the morning, and they’re still running up to right now. And if we jump into here, we can actually see the vision stream is maybe not as healthy as it could be. So that’s basically a measure of how many times we’re seeing each fastener. You don’t want it to be too high because that means you could run the wood faster, but you don’t want it to be too low because you want to be able to see the fastener and be sure that it’s there. And our pick success rate in the last 15 minutes was 100%. So, let’s see here. 91%, 100%, 77, 85. So the performance has been all right. With that, I think I’m going to wrap up the demo portion. And I’d love to take questions and jump into the weeds with anybody here who liked what we showed off.
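The trailing-window pick success rate shown on the dashboard is easy to reconstruct as a sketch. Everything here is hypothetical: the event shape is invented, and the real figure would come from an InfluxDB aggregate over the last 15 minutes.

```python
import math

def pick_success_rate(events, window_s: float, now: float) -> float:
    """Percentage of successful picks in the trailing window.
    `events` is a list of (timestamp_s, succeeded) pairs; returns NaN
    when no picks happened in the window."""
    recent = [ok for ts, ok in events if now - window_s <= ts <= now]
    if not recent:
        return math.nan
    return 100.0 * sum(recent) / len(recent)
```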
ANAIS DOTIS-GEORGIOU: 47:39
Thank you so much, Alex. That was super interesting. All the demos and visualizations made it so real, and it was great getting to see all the videos of the actual robot.
ALEX THIELE: 47:48
I’m glad. I nerd out about this every day, so it’s awesome to share.
ANAIS DOTIS-GEORGIOU: 47:51
Yeah. Super cool. We do have a couple of questions, and I encourage anyone else to ask any questions that you have as well. But the first question is, why did you create your own writer instead of using Telegraf?
ALEX THIELE: 48:03
So basically, we wanted to publish ROS messages, and we just needed something that could grab that message and send it to InfluxDB. So, I guess to answer, I haven’t used Telegraf, so that might be the first reason. Maybe if I had used Telegraf and was familiar with it, it would be a no-brainer. But we just wanted the nodes and the metrics they’re publishing to be decoupled from InfluxDB. Sorry, InfluxDB. But if we ever switched to some other provider, this makes it really easy to do. It also made it easier to unit test our code. So, we’re just publishing metric messages, and we can unit test by subscribing to those metrics using ROS and checking, “Yeah, that looks correct,” or “No, that’s not right.” I hope that answers your question.
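The decoupling described, generic metric messages on the ROS side, with one writer node that knows about InfluxDB, boils down to rendering a metric into InfluxDB line protocol at the boundary. A hypothetical sketch of that writer-side rendering (field and tag names invented):

```python
def to_line_protocol(measurement: str, tags: dict, fields: dict,
                     ts_ns: int) -> str:
    """Render a generic metric message as an InfluxDB line protocol line.
    A single writer node can subscribe to metric messages and do the
    actual write, keeping every other node ignorant of the backend."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_parts = []
    for k, v in sorted(fields.items()):
        if isinstance(v, bool):          # bool check must precede int
            field_parts.append(f"{k}={str(v).lower()}")
        elif isinstance(v, int):
            field_parts.append(f"{k}={v}i")
        elif isinstance(v, float):
            field_parts.append(f"{k}={v}")
        else:
            field_parts.append(f'{k}="{v}"')
    head = measurement if not tag_str else f"{measurement},{tag_str}"
    return f"{head} {','.join(field_parts)} {ts_ns}"
```

Swapping providers then means swapping only this rendering plus the write call, while the rest of the system keeps publishing the same ROS metric messages.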
ANAIS DOTIS-GEORGIOU: 48:52
Yeah. It answers it for me, at least. I don’t know if the author has more follow-ups. But we also have a couple other questions coming in from the Q&A. And the first one is, at the edge in the shop, are you using the open-source version of InfluxDB? And is historical data stored on the machine or sent to a central or office server?
ALEX THIELE: 49:12
Yes, and yes. So, we’re using the open-source version of InfluxDB. That holds all the data for that machine. But it also replicates the writes to InfluxDB Cloud. So, in InfluxDB Cloud, we’re ingesting all the data from all our robot fleets. We kind of jumped the gun on that because we only have one deployment right now. But it’s set up. So, if we had two, three, or four deployments across the country, all that data, including a tag about which deployment it is, would end up in InfluxDB Cloud.
ANAIS DOTIS-GEORGIOU: 49:50
Very cool. And then we have another question that’s a little off topic. So, I’ll make sure to— it’s just about data pipelining using Telegraf, Kafka, and InfluxDB. And I’ll make sure to share some resources in the chat for that specifically. But I also had some questions around just magnetic sensors and looking at machine learning as a tool versus that. You mentioned that X-ray isn’t an option because of the safety concerns. Are magnetic sensors just not specific enough, or [crosstalk]?
ALEX THIELE: 50:21
No, this is a great question. Lots of Urban Machine lunch meetings have been spent thinking, “Are we missing anything?” So, we bought an X-ray. We had an X-ray created and shipped to us from London for way too much money. And we started using it and we realized we’re going to be putting this on the field, driving it through trucks. If anything in that shielding moves around, oh, God. And then you’re running through this X-ray. You need to be able to create this map of the wood. And then you need to be able to connect that map to something you just saw more localized. Because at the end of the day, you need millimeter-level precision to be able to grab small objects. And we realized that that would be really hard to do.
ALEX THIELE: 51:09
And once we thought about it even more, our machines aren’t even capable of grabbing metal that’s inside of the wood. And that was one of the benefits of the X-rays. If there’s metal deep inside the wood, you can see it. But we can’t even pull it out, so we don’t need to see it. At some point, you’re going to find out through the metal detector that there’s metal in there you didn’t get, and you’re going to take a brief look at that piece of wood. And if you can’t see it, it gets scrapped because there’s just infinite planks at the infeed. You should think of it like chip-making where there’s some yield percentage where you’re throwing away 5, 10 percent of the wood because it’s not even worth the trouble.
ALEX THIELE: 51:50
But we did stop and do a rev on whether we could use small magnetic sensors, create an array of them, and get some more information at the surface level of the wood. We really tried. We had an engineer work on that for about a month and a half, doing research and building out little prototypes. The hard thing is, when you need to handle such a wide range of wood widths and heights, it’s really tough to build. And we were already upgrading our camera systems and getting really good results from the vision side, so we decided it just wasn’t worth building a whole other section of the cell just for detection, because we would need cameras anyway to get really good precision. So, we just decided to go 100% on cameras. I can’t say that in the future we wouldn’t add something like that. It’s a possibility to augment the camera detections with some metal detector signal. But it’s not straightforward, unfortunately.
ANAIS DOTIS-GEORGIOU: 52:57
And then the heat flashing of all the hardware, does that help with the imaging and the processing of that?
ALEX THIELE: 53:06
I wish. So, the problem is that the coil, we have to position it on top of the nail so that we get good performance. And to position it on top of a nail, you need computer vision. So yeah. When we started working on this induction system, we thought, “Oh, here’s hoping we can just feed the piece of wood through a gigantic coil, and it just cooks everything. We don’t have to localize anything.” Unfortunately, the physics don’t work out there. You have to get really close. You don’t have to touch it, but you do have to get really close to the piece of metal. So alas, we have vision on both systems, the cooker and the picker.
ANAIS DOTIS-GEORGIOU: 53:46
Okay. Makes sense. Yeah. That makes sense. And let me see. I think we have a couple more questions coming into the chat. And for the person asking about Kafka and InfluxDB, I just included some example repos from Influx Community. Influx Community has a bunch of projects that a lot of the developer advocates make, and sometimes other people as well, showing how to use InfluxDB with a variety of tech stacks. So, there are three examples there for using Kafka with Telegraf, with InfluxDB, and with Faust. And one for replicating a digital twin of [ACSTR?] as well. So hopefully, that’s a little helpful for you for data pipelining. And if you have any more questions, please ask in the forums or Slack. But back to questions that we have about this presentation specifically. Do you push camera images or any other large media payloads up through the same pipeline?
ALEX THIELE: 54:48
No, that goes through a slightly different pipeline. We collect images of every 10-centimeter scan of wood, so we have a variety of images. We have a ROS node that collects those, buffers them into the file system, and pushes them to Google Cloud Storage buckets when there’s internet. And that’s part of our machine learning training pipeline, where we push it all to the cloud, analyze it, automatically select some for labeling, get those labeled, and train a new model. That whole pipeline’s been automated. I’ve never actually used InfluxDB for large binaries, so I can’t attest to whether it’s great at that or not. I don’t know.
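The buffer-to-disk-then-upload-when-online pattern described here is a classic store-and-forward design. A minimal sketch under stated assumptions: the class name and `upload` callable are invented, and a real GCS client would replace the callable.

```python
import os

class StoreAndForwardBuffer:
    """Buffers payloads on disk and drains them to an uploader when a
    connection is available. `upload` is any callable returning True on
    success (in practice, a Google Cloud Storage client wrapper)."""
    def __init__(self, directory: str):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def save(self, name: str, payload: bytes):
        """Persist a payload locally; survives process restarts."""
        with open(os.path.join(self.directory, name), "wb") as f:
            f.write(payload)

    def flush(self, upload) -> int:
        """Try to upload every buffered file; delete each on success so
        failed uploads are retried on the next flush."""
        sent = 0
        for name in sorted(os.listdir(self.directory)):
            path = os.path.join(self.directory, name)
            with open(path, "rb") as f:
                data = f.read()
            if upload(name, data):
                os.remove(path)
                sent += 1
        return sent
```

Deleting only after a confirmed upload is what makes the pipeline tolerant of intermittent connectivity in the shop.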
ANAIS DOTIS-GEORGIOU: 55:37
Yeah. I wouldn’t recommend using InfluxDB for that. And then someone else— a lot of people are asking about ROS 2. Upgrading to ROS 2, why or why not? And what sort of pipeline changes do you envision?
ALEX THIELE: 55:53
I have a bad habit of saying ROS when I mean ROS 2. Let me just clarify. We are running on ROS 2. We always have been.
ANAIS DOTIS-GEORGIOU: 56:03
Nice.
ALEX THIELE: 56:05
Oh, and does it give me any issues? No. But I will say— I’m going to take this opportunity to promote an open-source library that we’re going to be releasing hopefully this week, but definitely next week. Basically, Urban Machine is going to drop two big libraries. One of them is our ROS helpers: a suite of wrappers around ROS that make writing reliable, Pythonic ROS code very easy. So, follow Urban Machine. We’re going to drop a ton of IP there that has nothing to do with wood robotics and everything to do with writing clean, reliable ROS code. I would say with that library, ROS has given us no issues. It’s been a wonderful thing to use. I think people are a little too harsh on ROS when really, it’s just a message queue. With any message queue system, you need to cater it to your own needs. We wrote an opinionated wrapper around it, and I think they’re good opinions. So, take a look at it. I’d love to get some feedback on there. And any collaborators who are interested, jump on in.
ANAIS DOTIS-GEORGIOU: 57:14
Very cool. Thank you. And let me see if— does anyone else have any other questions? I’ll give a couple of seconds for someone to ask. But if not, Alex, thank you so much for this presentation. It was super interesting to learn about this use case. It makes my job a lot more interesting, and I like learning or doing cool things with InfluxDB like this. Makes my day. So, thank you so much for this presentation and for the tech that you’re creating.
ALEX THIELE: 57:40
Thank you. I appreciate it. Yeah. Thanks for all the interest, guys. I really appreciate it.
ANAIS DOTIS-GEORGIOU: 57:44
Yeah. And one final reminder, a recording of this webinar will be made available to you. You should get that in your inbox shortly. Thank you so much, everyone, for joining, and I hope you all have a good day.
ALEX THIELE: 57:56
Awesome. All right.
ANAIS DOTIS-GEORGIOU: 57:57
Thank you.
[/et_pb_toggle]
Alex Thiele
Co-Founder & Chief Software Architect, Urban Machine
Alex Thiele is the Co-founder and Chief Software Architect at Urban Machine, where he leads the development of groundbreaking robots that convert construction waste into high-quality reclaimed lumber. Previously a co-founder of Aotu.ai, he developed a smart vision platform capable of scaling to process hundreds of AI video streams, pioneering in the field ahead of the recent AI boom. With a background in robotics and AI, Alex is dedicated to leveraging technology for a greener future.