InfluxDB 3.0 vs ADX
By Jason Myers / Developer, Product
May 10, 2023
Over the past few years, time series has been one of the fastest-growing database categories in the world. As more and more organizations realize how critical time series data is to their operations, more database options have entered the market. InfluxDB has been the leading time series database for years, and with the release of InfluxDB 3.0, it remains at the vanguard of the time series world. With this in mind, let’s take a look at how InfluxDB 3.0 measures up against one competitor in particular: Microsoft Azure Data Explorer (ADX).
Architecture
Both InfluxDB and ADX are columnar databases designed to handle large time series workloads and unlimited cardinality data. But that’s pretty much where the similarities end.
InfluxDB gives users and organizations a lot of flexibility for managing their time series data. First and foremost, InfluxDB is built specifically for time series data. It offers both fully managed cloud and self-managed on-prem solutions. InfluxDB Cloud offers an elastic, multi-tenant service and a dedicated, single-tenant service. Because InfluxDB Cloud is delivered on AWS, Azure, and GCP, organizations can use the provider best suited to their needs without compromising performance. InfluxDB also offers enterprise and open source versions for on-prem, private cloud, and edge deployment.
ADX is not a purpose-built time series database; rather, it is a relational database that supports time series workloads. ADX is only available as a managed cloud solution, so it won’t support organizations with data security or residency policies that require (or prefer) an on-prem solution. Along the same lines, ADX is only available on the Azure cloud. While this makes it an appealing solution for Azure shops, organizations tied to other cloud providers need to either invest in developing ways to export data to Azure or choose a different solution.
Features
Both InfluxDB and ADX offer a range of features to facilitate development. Both are designed for OLAP workloads and can process data in real time or in batches. Both offer users API access and client libraries to work with data, although they don’t necessarily offer comparable API options or libraries in the same languages. Both support schema-on-write and explicit data schemas.
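To make "schema-on-write" concrete, here is a minimal Python sketch. It is not taken from either product and does not reflect how InfluxDB or ADX implement schemas internally. The hypothetical `SchemaOnWriteTable` class fixes each column's type the first time that column is written and rejects later writes that conflict with it.

```python
class SchemaOnWriteTable:
    """Toy table that infers column types on first write and enforces them afterwards."""

    def __init__(self):
        self.schema = {}  # column name -> Python type
        self.rows = []

    def write(self, row):
        # Validate every field before mutating anything, so a bad row leaves no trace.
        new_columns = {}
        for column, value in row.items():
            expected = self.schema.get(column)
            if expected is None:
                new_columns[column] = type(value)
            elif not isinstance(value, expected):
                raise TypeError(
                    f"column {column!r} expects {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        self.schema.update(new_columns)
        self.rows.append(row)


# Hypothetical column names, for illustration only.
table = SchemaOnWriteTable()
table.write({"host": "a", "usage": 0.42})
table.write({"host": "b", "usage": 0.37, "region": "eu"})  # a new column is allowed
try:
    table.write({"host": "c", "usage": "high"})  # wrong type for "usage"
    rejected = False
except TypeError:
    rejected = True
```

The point of the pattern is that type conflicts surface at ingest time rather than later, at query time.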
Many InfluxDB features prioritize ease of use. It has a single API that developers can use across the entire product suite. This enhances scalability because users can spin up a new instance and use scripts they already have to get it production-ready quickly. InfluxDB supports SQL and InfluxQL for querying data. Because SQL is one of the most widely used query languages in the world, supporting it reduces friction and accelerates application development. InfluxQL is a SQL-like language with added time-based functions. InfluxDB 3.0 is built on the open source Apache Arrow ecosystem and persists data in the open source Apache Parquet file format. This increases the interoperability of InfluxDB, making it easy to integrate with other Arrow-based tools. InfluxDB 3.0 uses Apache Arrow Flight SQL to communicate with a large and growing pool of third-party tools, like Grafana and Apache Superset for visualization.
ADX’s features lean heavily on the Microsoft ecosystem. ADX provides RESTful APIs and integrates with MS-TDS, which allows users to query data using T-SQL and provides interoperability with SQL Server Management Studio. However, SQL is not the database’s native query language. Instead, ADX provides a proprietary query language, Kusto Query Language (KQL), for data processing and analysis. While ADX can ingest Apache Parquet files, it isn’t optimized for interoperability with other Parquet-friendly solutions, limiting the extensibility of your data. ADX does have built-in data visualization capabilities and can integrate with tools like Power BI.
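To give a feel for how the query languages differ, the sketch below writes the same question ("what was the average value in five-minute windows over the last hour?") in SQL and InfluxQL for InfluxDB 3.0 and in KQL for ADX. The table name `cpu` and the column names `usage`, `time`, and `Timestamp` are hypothetical, and the queries are held as plain Python strings that have not been run against a live instance of either database.

```python
# Hypothetical table "cpu" with a numeric "usage" column; all names are illustrative.

# InfluxDB 3.0, SQL: date_bin() groups timestamps into fixed-width windows.
INFLUX_SQL = """
SELECT date_bin(INTERVAL '5 minutes', time) AS window_start,
       avg(usage) AS avg_usage
FROM cpu
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY 1
ORDER BY 1
"""

# InfluxDB 3.0, InfluxQL: time-based grouping is built into GROUP BY.
INFLUXQL = "SELECT MEAN(usage) FROM cpu WHERE time >= now() - 1h GROUP BY time(5m)"

# ADX, KQL: a pipeline of operators applied to the table.
KQL = """
cpu
| where Timestamp >= ago(1h)
| summarize avg_usage = avg(usage) by bin(Timestamp, 5m)
| order by Timestamp asc
"""
```

The SQL and InfluxQL versions will look familiar to anyone who has written SQL, while the KQL version uses a pipe-based syntax that has to be learned separately.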
Performance
Both databases can handle large OLAP time series workloads.
InfluxDB 3.0 separates compute and storage to maximize performance and reduce costs. It uses a columnar, in-memory “hot” storage tier to cache recent data for real-time, sub-second queries, and it persists Parquet files to object storage, the “cold” storage tier. Parquet compresses data very efficiently, which allows users to store more high-fidelity data in less space and save on storage costs.
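The hot and cold tiers described above can be pictured with a small Python sketch. This is a toy model under stated assumptions, not InfluxDB's implementation: the hypothetical `TieredStore` class uses plain lists as stand-ins for the in-memory cache and for Parquet files in object storage, and it only reads the cold tier when a query reaches back past the hot window.

```python
from datetime import datetime, timedelta, timezone


class TieredStore:
    """Toy two-tier store: recent points stay in memory, older ones move to 'cold' storage."""

    def __init__(self, hot_window):
        self.hot_window = hot_window
        self.hot = []   # stand-in for the in-memory columnar cache
        self.cold = []  # stand-in for Parquet files in object storage

    def write(self, timestamp, value):
        self.hot.append((timestamp, value))

    def persist(self, now):
        """Move points older than the hot window into cold storage."""
        cutoff = now - self.hot_window
        self.cold.extend(p for p in self.hot if p[0] < cutoff)
        self.hot = [p for p in self.hot if p[0] >= cutoff]

    def query(self, start, now):
        """Return points with timestamp >= start, plus the tiers that were read."""
        cutoff = now - self.hot_window
        tiers = ["hot"]
        points = [p for p in self.hot if p[0] >= start]
        if start < cutoff:
            # Only queries that reach back past the hot window pay for a cold read.
            tiers.append("cold")
            points += [p for p in self.cold if p[0] >= start]
        return sorted(points), tiers


now = datetime(2023, 5, 10, 12, 0, tzinfo=timezone.utc)
store = TieredStore(hot_window=timedelta(hours=1))
store.write(now - timedelta(hours=3), 1.0)
store.write(now - timedelta(minutes=5), 2.0)
store.persist(now)

recent, recent_tiers = store.query(now - timedelta(minutes=30), now)
full, full_tiers = store.query(now - timedelta(hours=6), now)
```

In this model, the 30-minute query is served from memory alone, while the six-hour query also touches the cold tier.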
ADX combines compute and storage and doesn’t allow the two to scale independently. This means that users may end up paying for resources that they don’t need.
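A bit of arithmetic shows why coupled scaling can leave resources idle. The sketch below is a general illustration with made-up numbers, not ADX pricing or real node sizes: the hypothetical `coupled_nodes` function assumes every node bundles a fixed amount of storage and compute, so the node count is driven by whichever resource runs out first.

```python
import math


def coupled_nodes(storage_tb, compute_vcpu, node_storage_tb, node_vcpu):
    """Nodes needed when every node bundles a fixed amount of storage and compute."""
    return max(
        math.ceil(storage_tb / node_storage_tb),
        math.ceil(compute_vcpu / node_vcpu),
    )


# Hypothetical workload: a lot of data, but a modest query load.
storage_tb, compute_vcpu = 40, 16
node_storage_tb, node_vcpu = 4, 8  # hypothetical node size

nodes = coupled_nodes(storage_tb, compute_vcpu, node_storage_tb, node_vcpu)
provisioned_vcpu = nodes * node_vcpu
idle_vcpu = provisioned_vcpu - compute_vcpu
```

With these numbers, storage demands 10 nodes, which brings 80 vCPUs along with it even though the workload needs only 16. When compute and storage scale separately, each can be sized to its own demand.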
Conclusion
Because InfluxDB and ADX have several key similarities, organizations debating between the two solutions should focus on how much value they want to derive from their data and how easily each solution deploys alongside their current tool set and technical expertise. InfluxDB, through its deployment options, features, flexibility, and interoperability, enables organizations to maximize the value of their time series data. While ADX is a viable option for organizations dedicated to the Microsoft ecosystem, the fact that InfluxDB can integrate with the same Microsoft services, as well as many others beyond Microsoft, potentially tilts the scales toward InfluxDB as a future-proof solution.