Webinar Recap: Building an AI Anomaly Detection Pipeline with InfluxDB
By Charles Mahler / Developer / Dec 18, 2023
In this webinar hosted by InfluxDB and HiveMQ, we focus on how you can create value for your business using new tools in the AI and database ecosystem to quickly deploy AI models to perform tasks like anomaly detection.
The webinar starts with a high-level overview of how MQTT and time series data can be valuable in an industrial IoT environment. You will learn how the demo application uses HiveMQ and InfluxDB to collect sensor data from machines and analyzes that data with a machine learning model that monitors for anomalies.
Figure: Demo application architecture
Tools used during the webinar
InfluxDB
InfluxDB is purpose-built for time series data workloads. In InfluxData's benchmarks against InfluxDB Open Source, InfluxDB 3.0 has 45x better write throughput and faster queries for recent data. It also supports unlimited cardinality and is 100x faster when querying high cardinality data. InfluxDB 3.0 persists data as Apache Parquet, which can lead to 90%+ savings in storage costs. For more performance benchmarks, check out this benchmarking paper. Parquet is part of the Apache open data ecosystem, which enables InfluxDB to connect to some of the best tools for statistical analysis, artificial intelligence (AI), and machine learning (ML).
Figure: InfluxDB 3.0 architecture
HiveMQ
HiveMQ provides a platform that aids in the development of MQTT applications. The HiveMQ team recently introduced HiveMQ Edge, an embedded lightweight MQTT broker designed for edge deployment, which is ideal for IoT use cases. HiveMQ has a suite of MQTT products with both HiveMQ-managed and self-managed options.
Quix
Quix is a platform designed to simplify and streamline the development of streaming applications. It provides a robust and flexible environment for working with real-time data, enabling developers to build, deploy, and manage event-driven applications more efficiently. Quix is particularly useful in scenarios where handling large volumes of streaming data is essential, such as in IoT, financial services, or real-time analytics.
Figure: Quix dashboard showing event logs
Hugging Face
Hugging Face is a platform that provides tools and services for building and deploying machine learning models. The Hugging Face Hub allows developers to try out models uploaded by other developers, access datasets, and create API endpoints for others to use their models programmatically.
Anomaly detection demo
The anomaly detection demo takes in simulated data modeled on output created by devices on a factory floor in a production environment. This data is then collected and analyzed to detect anomalies. Here is a high-level diagram of the architecture:
Figure: Industrial IoT application architecture
Now let’s walk through how the demo works step by step. The first part of the application is the simulation script that generates data meant to represent output by machine sensors on a factory floor. Each data point contains metadata about the machine sending the information and the specific data payload including fields like temperature, power, and vibration.
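As a rough illustration of what such a simulation script might look like, here is a minimal sketch using only the Python standard library. The field names temperature, power, and vibration come from the description above; the machine IDs, the "line" metadata key, the value ranges, and the make_reading helper are all assumptions for illustration and are not taken from the demo repo.

```python
import json
import random
import time


def make_reading(machine_id: str, rng: random.Random) -> dict:
    """Build one simulated sensor reading.

    The structure and value ranges are illustrative assumptions,
    not the demo's actual payload format.
    """
    return {
        # Metadata about the machine sending the information.
        "metadata": {"machine_id": machine_id, "line": "line_1"},
        "timestamp_ns": time.time_ns(),
        # The data payload with the three fields mentioned in the webinar.
        "data": {
            "temperature": round(rng.gauss(40.0, 2.0), 2),
            "power": round(rng.gauss(200.0, 10.0), 2),
            "vibration": round(rng.gauss(90.0, 5.0), 2),
        },
    }


# Emit one reading per hypothetical machine as a JSON string.
rng = random.Random(42)
for machine in ("machine1", "machine2", "machine3"):
    print(json.dumps(make_reading(machine, rng)))
```

Seeding the random generator keeps runs repeatable, which is handy when you want to compare how a detection model behaves on the same simulated data.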
This data is transferred over the network using the MQTT protocol via a hosted HiveMQ broker. Telegraf is an alternative to HiveMQ, especially in situations where you need support for protocols other than MQTT.
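The publishing step can be sketched without committing to a particular MQTT client library. In the example below, publish_reading takes any function that accepts a topic and a payload, so an in-memory stand-in is used in place of a real broker connection. The topic layout factory/<machine_id>/sensors and the function name are hypothetical and are not taken from the demo.

```python
import json
from typing import Callable


def publish_reading(publish: Callable[[str, bytes], None], reading: dict) -> str:
    """Serialize a reading to JSON and hand it to an MQTT publish function.

    The topic layout is a hypothetical example, not the demo's.
    """
    topic = f"factory/{reading['metadata']['machine_id']}/sensors"
    payload = json.dumps(reading).encode("utf-8")
    publish(topic, payload)
    return topic


# Stand-in for a real MQTT client: collects messages in memory.
sent = []
topic = publish_reading(
    lambda t, p: sent.append((t, p)),
    {"metadata": {"machine_id": "machine1"}, "data": {"temperature": 41.2}},
)
print(topic, sent[0][1].decode("utf-8"))
```

In a real deployment, a thin wrapper around your MQTT client's publish call, connected to the hosted HiveMQ broker, would be passed in instead of the lambda.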
The MQTT broker sends the data to Quix, which runs several Python scripts as part of a data pipeline. One script first writes the data into InfluxDB. Another script then queries that data and sends it to a machine learning model hosted on Hugging Face, which analyzes it for anomalies.
Figure: Data format and pipeline
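To make the write step more concrete, here is a small sketch that formats one reading as InfluxDB line protocol, the text format InfluxDB accepts for writes. The measurement name machine_data and the tag name machine_id are assumptions for illustration. The sketch also assumes numeric fields and a measurement name without commas or spaces, so it is not a complete implementation of the format.

```python
def _escape(value: str) -> str:
    # Tag keys, tag values, and field keys escape commas, equals signs, and spaces.
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Format one point as: measurement,tag_set field_set timestamp.

    Assumes all field values are numeric and writes them as floats.
    """
    tag_part = "".join(
        f",{_escape(k)}={_escape(str(v))}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(k)}={float(v)}" for k, v in sorted(fields.items())
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "machine_data",  # hypothetical measurement name
    {"machine_id": "machine1"},
    {"temperature": 41.2, "power": 200, "vibration": 88.5},
    1700000000000000000,
)
print(line)
# machine_data,machine_id=machine1 power=200.0,temperature=41.2,vibration=88.5 1700000000000000000
```

Machine metadata maps naturally to tags, which are indexed and used for filtering, while the sensor values map to fields. In practice an InfluxDB client library would normally handle this formatting for you.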
The result from the machine learning model is added as an additional field to each data point and then written to InfluxDB as a new measurement. The demo application’s last step is creating a dashboard using Grafana. This dynamic dashboard shows raw data and highlights any anomalies by displaying anomalous data in a different color.
Figure: Grafana dashboard showing raw data and highlighted anomalous data
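The enrichment step, where the model's result becomes an extra field on each data point, can be sketched as follows. Note that the detector here is a simple z-score check used purely as a stand-in. The demo calls a machine learning model hosted on Hugging Face instead. The field name is_anomalous, the measurement name machine_data_scored, and the threshold are assumptions for illustration.

```python
from statistics import mean, pstdev


def flag_anomalies(points: list[dict], field: str, z_threshold: float = 2.0) -> list[dict]:
    """Return copies of the points with an added boolean 'is_anomalous' field.

    Stand-in detector: a z-score on a single field. This only illustrates
    the enrichment step, not the model used in the demo.
    """
    values = [p["fields"][field] for p in points]
    mu, sigma = mean(values), pstdev(values)
    enriched = []
    for p in points:
        z = abs(p["fields"][field] - mu) / sigma if sigma > 0 else 0.0
        enriched.append({
            **p,
            "measurement": "machine_data_scored",  # hypothetical new measurement
            "fields": {**p["fields"], "is_anomalous": z > z_threshold},
        })
    return enriched


readings = [
    {"measurement": "machine_data", "fields": {"vibration": v}}
    for v in (90, 91, 89, 90, 92, 88, 90, 150)
]
for point in flag_anomalies(readings, "vibration"):
    print(point["fields"])
```

Writing the scored points to a separate measurement leaves the raw data untouched, and a dashboard can then color points based on the added field.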
Key takeaways
Use AI to improve operational efficiency for IoT
The main takeaway from this webinar is that you can take advantage of existing off-the-shelf tools to make your business more efficient without requiring experts to implement them.
Ensure you are collecting your data for long-term use, even if you aren’t ready to use advanced AI solutions yet. One way to do this is by using MQTT and HiveMQ as the “plumbing” for integrating all your systems. This data can then be stored efficiently using InfluxDB.
AI will only continue to be integrated further for industrial IoT use cases
As LLMs and other AI models continue to improve, many new use cases become possible. One suggested use case would be allowing workers to “talk” to a factory using natural language to query the database and get answers to their questions. This would be like having a custom ChatGPT that knows everything that happens in the factory from the data collected. Another potential use case is using AI to create dynamic dashboards and user interfaces on the fly.
Watch the webinar
Check out the full webinar here. If you want to run the demo yourself, check out the code in the GitHub repo.