Scaling Your Time Series Workloads with InfluxDB 3.0: New Tools, Improvements, and Products Now Generally Available
By
David Sprogis /
Developer
Sep 04, 2024
In the year since its initial release, the InfluxDB 3.0 product suite has gained numerous new features and performance improvements. These improvements reinforce InfluxDB 3.0’s position as the industry’s leading time series database, offering unparalleled performance with unlimited cardinality, high-speed, independently scalable ingest, real-time querying, and superior data compression using the Parquet format on cost-effective object storage. With these updates, InfluxDB 3.0 makes it easier for developers to manage time series workloads at any scale, whether in IoT, finance, aerospace, or any other environment that relies on high-resolution data.
With that in mind, we’re excited to announce the latest improvements and new features in InfluxDB 3.0 and the general availability of InfluxDB Clustered, our self-managed product for large-scale time series workloads.
New features across the InfluxDB 3.0 product suite
Operational Dashboards
InfluxDB Cloud Dedicated users can now access operational dashboards that offer comprehensive observability of their cluster’s performance. The dashboard used to monitor your InfluxDB Cloud Dedicated cluster is a Grafana dashboard managed by InfluxData. These dashboards provide insights into each component of InfluxDB 3.0 to identify potential bottlenecks and optimization opportunities by monitoring data ingestion, query performance, and compaction.
Single Sign-On
InfluxDB Cloud Dedicated now supports single sign-on (SSO) for enterprise-grade access control to your InfluxDB cluster. By connecting your identity provider to InfluxData’s managed Auth0 service, you can easily grant or revoke access to your InfluxDB cluster, just as you would any other system.
Management API
The management API for InfluxDB Cloud Dedicated simplifies the programmatic management of databases, API tokens, and tables. With this API, users can automate actions like spinning up new InfluxDB instances, creating new databases with customized partitions, and managing API tokens for developers accessing your InfluxDB instance.
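As a rough illustration of automating database creation, the sketch below builds (but does not send) an HTTP request using only Python's standard library. The base URL, path layout, and JSON field names (`name`, `partitionTemplate`) are assumptions recalled from the public documentation and should be checked against the Management API reference before use. `ACCOUNT_ID`, `CLUSTER_ID`, `MANAGEMENT_TOKEN`, the database name `sensors`, and the tag `device_id` are hypothetical placeholders.

```python
import json
import urllib.request

# Assumed base URL and path layout; confirm against the Management API reference.
BASE_URL = "https://console.influxdata.com/api/v0"


def build_create_database_request(account_id, cluster_id, token, name,
                                  partition_template=None):
    """Build, without sending, a POST request that would create a database."""
    url = f"{BASE_URL}/accounts/{account_id}/clusters/{cluster_id}/databases"
    body = {"name": name}
    if partition_template is not None:
        # Assumed field name for a custom partition template.
        body["partitionTemplate"] = partition_template
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Hypothetical identifiers and a hypothetical tag-plus-time partition template.
request = build_create_database_request(
    "ACCOUNT_ID",
    "CLUSTER_ID",
    "MANAGEMENT_TOKEN",
    "sensors",
    partition_template=[
        {"type": "tag", "value": "device_id"},
        {"type": "time", "value": "%Y-%m-%d"},
    ],
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Keeping request construction separate from sending makes the call easy to inspect or unit test before it touches a real cluster.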
Parameterized Queries
Support for parameterized queries in InfluxQL and SQL makes queries reusable and helps prevent SQL injection attacks. Additionally, parameterized queries can be used at the application level to define permissions for data manipulation and allow for more fine-grained control.
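To show why binding parameters is safer than building query strings, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in. It is not InfluxDB's client API, and InfluxDB's own placeholder syntax and client calls are not reproduced here. The table `home` and its columns are hypothetical.

```python
import sqlite3

# In-memory stand-in database; not InfluxDB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE home (room TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO home VALUES (?, ?)",
    [("Kitchen", 22.5), ("Living Room", 21.0), ("Kitchen", 19.0)],
)


def temps_for_room(room, min_temp):
    """Run one reusable query; values are bound, never spliced into the SQL text."""
    query = (
        "SELECT room, temp FROM home "
        "WHERE room = :room AND temp >= :min_temp ORDER BY temp"
    )
    return conn.execute(query, {"room": room, "min_temp": min_temp}).fetchall()


# The same query text is reused with different parameter values.
kitchen = temps_for_room("Kitchen", 20.0)

# A malicious value is treated as a literal string, so it matches no rows.
injected = temps_for_room("Kitchen' OR '1'='1", 0.0)
```

Because the query text never changes, the injected string is compared against the `room` column as data rather than being parsed as SQL.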
InfluxDB Clustered now generally available
InfluxDB Clustered, our self-managed product, is now generally available, allowing users to take advantage of InfluxDB 3.0 in private cloud or on-prem environments. Like the rest of the InfluxDB 3.0 product suite, InfluxDB Clustered delivers high throughput for data writes and reads, independently scalable ingest and query, support for unlimited data cardinality, real-time data analysis, and native SQL support for large time series workloads.
InfluxDB Clustered can be deployed to Kubernetes using a Helm chart and features fully decoupled ingest, query, and storage tiers. This architecture allows you to scale each component of your InfluxDB deployment independently as needed, so you can tailor your infrastructure to your specific workloads. Whether your workloads are write-intensive, read-intensive, or both, and whether or not you have strict security and data residency requirements, InfluxDB Clustered offers the flexibility to handle it all.
InfluxDB 3.0 performance improvements
InfluxDB 3.0 continues to see significant performance improvements across multiple vectors. Key enhancements include upstream contributions to the open source Apache DataFusion engine at the core of InfluxDB 3.0. These contributions benefit not only InfluxDB 3.0 but also the entire DataFusion community. Other performance improvements come from InfluxDB-specific features such as custom partitioning, which lets users optimize InfluxDB for their specific workload.
Here are just a few performance improvements made since the release of InfluxDB 3.0:
- Custom partitioning: InfluxDB Clustered and Cloud Dedicated now allow developers to define how their data is grouped within underlying Parquet files. By default, data is partitioned by day, with each day’s data stored in the same Parquet file. Custom partitioning enables you to adjust this time range or partition by specific tags based on your needs. For example, if you frequently query by specific tag values, you can partition by those tags so that data sharing a tag value is stored in the same file, which improves performance by reducing the number of files accessed during queries. Partitions can also be created using a combination of tags and time. Read the documentation to learn more about this feature.
- Improved aggregation and grouping performance: InfluxDB 3.0 has achieved a 2-3x performance improvement across various types of aggregation and grouping queries by enhancing parallel aggregation in DataFusion. This was accomplished by rewriting parts of DataFusion’s Group By implementation to reduce allocations and better utilize vectorization.
- Faster string-intensive query performance: Queries involving strings have seen performance gains of 20-200% following the addition of StringView support to DataFusion. The gains came from improvements to UTF-8 validation and from compiler optimizations.
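The effect of custom partitioning described in the list above can be sketched with a toy model in plain Python. This is not how InfluxDB stores data internally. Each distinct partition key stands in for one Parquet file, and the rows, the `device` tag, and the helper names are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample rows: two devices reporting on two days.
rows = [
    {"time": datetime(2024, 9, 1, 8, tzinfo=timezone.utc), "device": "a", "temp": 20.1},
    {"time": datetime(2024, 9, 1, 9, tzinfo=timezone.utc), "device": "b", "temp": 21.4},
    {"time": datetime(2024, 9, 2, 8, tzinfo=timezone.utc), "device": "a", "temp": 19.8},
    {"time": datetime(2024, 9, 2, 9, tzinfo=timezone.utc), "device": "b", "temp": 22.0},
]


def partition(rows, key_fn):
    """Group rows by partition key; each key models one file."""
    parts = defaultdict(list)
    for row in rows:
        parts[key_fn(row)].append(row)
    return dict(parts)


def by_day(row):
    """Default-style key: one partition per day."""
    return (row["time"].strftime("%Y-%m-%d"),)


def by_device_and_month(row):
    """Custom-style key: a tag value combined with a coarser time range."""
    return (row["device"], row["time"].strftime("%Y-%m"))


def partitions_touched(parts, device):
    """Count partitions that hold at least one row for the given device."""
    return sum(
        1 for part_rows in parts.values()
        if any(r["device"] == device for r in part_rows)
    )


daily = partitions_touched(partition(rows, by_day), "a")
custom = partitions_touched(partition(rows, by_device_and_month), "a")
```

In this toy data set, a query for device `a` touches both daily partitions but only one partition under the tag-plus-month key, which is the file-count reduction the bullet describes.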
These performance enhancements enable companies like ju:niz Energy to increase the volume of data they store by around 100x while maintaining query performance and reducing storage costs, thanks to InfluxDB 3.0’s use of low-cost object storage. Another great example is Joby Aviation, which uses InfluxDB Clustered to rapidly ingest, efficiently compress, and retain massive volumes of time series data generated by its electric vertical takeoff and landing (eVTOL) aircraft while keeping storage costs under control.
Get started with InfluxDB 3.0 today
InfluxDB 3.0 continues to evolve and improve, offering a robust suite of features and performance capabilities to support any time series data workload. Whether you’re managing IoT sensor data, monitoring applications, or launching rockets into space, InfluxDB 3.0 has the tools, performance, and flexibility you need to succeed.
To get started with InfluxDB 3.0, talk to our sales team or begin with a proof of concept.