Unlocking the Power of Real-Time Analytics with InfluxDB
By Jason Myers / Developer
Dec 22, 2023
About the session
In this webinar hosted by TechCrunch, Paul Dix, InfluxData Founder and CTO, Andrew Lamb, InfluxData Staff Engineer, and Jay Clifford, InfluxData Developer Advocate, discussed using InfluxDB to unlock time series data for real-time analytics. The session focused on the new InfluxDB 3.0 and covered the challenges of using traditional databases for time series data, the evolution of InfluxDB, and the use of open source projects in its development.
Run of show
- Understanding time series data and its use cases
- Challenges of using traditional databases for time series data
- Evolution of InfluxDB and the development of InfluxDB 3.0
- Use of open source projects (Apache Arrow Flight, DataFusion, Apache Arrow, Parquet, aka the FDAP stack) in InfluxDB 3.0
- Discussion of edge-to-cloud solutions and InfluxDB’s support for such scenarios
- Open source versions of InfluxDB 3.0
- Comparison of InfluxDB 3.0 with other time series databases
Key takeaways
Takeaway #1: InfluxDB 3.0 offers high-performance and advanced capabilities in time series and observational data management.
InfluxDB 3.0 has a fundamentally different design from previous versions, offering high-efficiency data ingestion and advanced capabilities for handling all kinds of observational data. The system optimizes query performance and uses far less memory and fewer CPU cores to ingest more data than previous versions.
One factor contributing to this performance boost is that v3.0 doesn’t rely on heavy indexing. Instead, it organizes the data into Parquet files. “Most time series databases…are really designed around the metrics use case. They’re based on the idea that you have a measurement name and you have labels or tags… that describe the data and then you know a value,” Dix explained. By contrast, InfluxDB 3.0 doesn’t just handle metric data; it handles observational data of all kinds, both structured and semi-structured.
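To make that distinction concrete, here is a minimal sketch in plain Python. It is not how InfluxDB 3.0 is implemented; the `to_columns` helper and the sample records are hypothetical and only illustrate the idea behind a column-oriented layout such as Parquet's, where a classic metric (name, tags, one value) and a wider, semi-structured event can sit side by side.

```python
def to_columns(records):
    """Pivot row-oriented records into a column-oriented layout.

    Hypothetical helper for illustration only. Keys missing from a
    record become None, so every column has one entry per record.
    """
    names = []
    for rec in records:
        for key in rec:
            if key not in names:
                names.append(key)  # keep first-seen column order
    return {name: [rec.get(name) for rec in records] for name in names}


records = [
    # A classic metric: measurement name, a tag, and a single value.
    {"time": 1, "measurement": "cpu", "host": "a", "value": 0.42},
    # A semi-structured event with extra fields and no single "value".
    {"time": 2, "measurement": "http_request", "host": "a",
     "path": "/login", "status": 500},
]

columns = to_columns(records)
for name, values in columns.items():
    print(name, values)
```

Each field becomes its own column, so a query that only needs `time` and `status` can skip the others entirely.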
Takeaway #2: The FDAP stack offers a powerful system for data management, storage, and processing.
The panelists also discussed the FDAP stack, a system that includes Flight, DataFusion, Arrow, and Parquet. The FDAP stack represents a more specialized system that can provide better performance for specific workloads, including time series data.
“The FDAP stack is like the LAMP stack. FDAP stands for Flight, DataFusion, Arrow, and Parquet. The idea is that those underlying technologies that you would use to build the whole next generation of data applications are similar to those used to build early web applications. [The LAMP stack] basically ushered in a whole new world of building applications because it was so inexpensive and rock solid … you didn’t have to reinvent the whole thing,” Lamb noted.
Takeaway #3: InfluxDB 3.0 enables users to leverage data in real time.
InfluxDB 3.0 works with data in real time. It ingests data quickly and makes it available for query almost immediately. This capability is particularly beneficial for users building real-time monitoring and alerting systems, as well as those who need to constantly update observational dashboards.
“In our case, when we say time series, we are optimizing for real-time systems where people are building observational dashboards that update all the time and they’re building monitoring and alerting systems that essentially expect to query the data immediately and return a result potentially in hundreds of milliseconds, sub-second query latencies at this very, very large scale…” explained Dix.
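The monitoring and alerting pattern Dix describes can be sketched in a few lines of plain Python. This is an illustrative toy, not InfluxDB code: the `RecentWindow` class, the `should_alert` function, and the threshold are all hypothetical, and the sketch assumes points arrive in timestamp order.

```python
from collections import deque


class RecentWindow:
    """Keep only the points from the last `window_seconds` (hypothetical)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.points = deque()  # (timestamp, value) pairs, oldest first

    def write(self, timestamp, value):
        # A new point is queryable as soon as it is appended.
        self.points.append((timestamp, value))
        # Evict points that fell out of the window (assumes ordered writes).
        cutoff = timestamp - self.window_seconds
        while self.points and self.points[0][0] < cutoff:
            self.points.popleft()

    def mean(self):
        if not self.points:
            return None
        return sum(value for _, value in self.points) / len(self.points)


def should_alert(window, threshold):
    """Fire when the mean over the recent window exceeds the threshold."""
    avg = window.mean()
    return avg is not None and avg > threshold


window = RecentWindow(window_seconds=60)
window.write(0, 10.0)
window.write(30, 20.0)
window.write(90, 90.0)  # evicts the point written at t=0

print(window.mean(), should_alert(window, threshold=50.0))
```

A real deployment would run a query like this against the database on a schedule; the point here is only that an alert check needs the newest writes to be visible right away.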
Insights surfaced
- Time series data is any data with a timestamp. It tracks servers, applications, networks, sensor data, and more, providing insight into change over time.
- Traditional databases struggle with time series data due to issues with scale, real-time querying, data lifecycle management, and handling high-cardinality data.
- InfluxDB is designed to address these challenges, with version 3.0 built around open source projects like Apache Arrow, Arrow Flight, DataFusion, and Parquet.
- InfluxDB 3.0 is designed to handle observational data of all kinds, not just metric data, and is optimized for real-time systems.
- The forthcoming open source version of InfluxDB 3.0, InfluxDB Edge, will be a single-server, single-process system that can operate as an island or be seamlessly attached to a centralized clustered system.