InfluxDB 3 Core and Enterprise Architecture Highlights

Introduction

If you follow our work in time series data or the open source community, you will know that we recently released two new products: InfluxDB 3 Core and InfluxDB 3 Enterprise. InfluxDB 3 Core is a high-performance recent data engine optimized for real-time monitoring, data collection, and streaming analytics use cases. InfluxDB 3 Enterprise builds on Core’s foundation by integrating historical analysis and data compaction, enabling efficient querying over extended time ranges. It also adds enterprise-grade features like high availability, scalability, and enhanced security.

Both products are currently in alpha, and the response has been overwhelmingly positive. We have received valuable feedback from the community, which is guiding our path forward. In response to the excitement and requests for a deeper dive, this blog explores the architecture and operational capabilities of InfluxDB 3 Core and Enterprise. Let’s dig in!

Single Node/Single Process

InfluxDB 3 Core and InfluxDB 3 Enterprise feature a straightforward, single-node and single-process architecture. You can run them as a single executable or deploy them in a Docker container. This streamlined design allows you to download, install, and launch a fully operational database and Python-based Processing Engine in under a minute.

FDAP Stack

Next, it’s important to highlight that InfluxDB 3 Core and Enterprise are built on the FDAP stack, which is designed to optimize data storage, processing, and interoperability. FDAP stands for Flight, DataFusion, Arrow, and Parquet—four key technologies that power the InfluxDB 3 product line.

  • Flight enables efficient data transfer for large datasets.
  • DataFusion provides a powerful query engine with SQL and InfluxQL support.
  • Arrow ensures fast in-memory data processing.
  • Parquet serves as the columnar storage format, allowing integration with a wide range of analytics tools.

Leveraging the FDAP stack, InfluxDB 3 offers enhanced interoperability with external tools and libraries, SQL compatibility, and lightning-fast data transfer.

I recommend checking out my colleague Andrew Lamb’s blog for a deeper dive into this topic.

Diskless Architecture

Core and Enterprise offer flexible storage options, including memory, disk, and object storage. When using object storage, the system operates in a diskless architecture where the object store serves as the sole persistence layer, prioritizing simplicity and storage efficiency.

Here is how data moves through the system before reaching object storage:

  • Writes are buffered in memory and flushed to a write-ahead log (WAL) file every second.
  • Flushed data moves to a queryable buffer in Arrow format, enabling fast, in-memory analytics.
  • Every ten minutes, WAL snapshots are converted into Parquet files, ensuring efficient data storage, rapid retrieval, and interoperability with external analytics tools.
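The three steps above can be pictured with a small simulation. This is only a toy model under stated assumptions: the class and field names are made up for illustration, plain Python lists stand in for WAL files, Arrow buffers, and Parquet files, and clearing the buffer and WAL entries after a snapshot is my simplification rather than a description of the real implementation.

```python
from dataclasses import dataclass, field

WAL_FLUSH_INTERVAL_S = 1        # from the text: flushed every second
SNAPSHOT_INTERVAL_S = 10 * 60   # from the text: snapshotted every ten minutes


@dataclass
class WritePathModel:
    """Toy model of the write path; not InfluxDB's actual implementation."""
    write_buffer: list = field(default_factory=list)      # writes not yet flushed
    wal_files: list = field(default_factory=list)         # one entry per WAL flush
    queryable_buffer: list = field(default_factory=list)  # stands in for Arrow data
    parquet_files: list = field(default_factory=list)     # stands in for Parquet files

    def write(self, line: str) -> None:
        """Buffer an incoming write in memory."""
        self.write_buffer.append(line)

    def tick(self, now_s: int) -> None:
        """Advance the model to `now_s` seconds since start."""
        # Step 1 and 2: flush buffered writes to the WAL and make them queryable.
        if now_s % WAL_FLUSH_INTERVAL_S == 0 and self.write_buffer:
            batch, self.write_buffer = self.write_buffer, []
            self.wal_files.append(batch)
            self.queryable_buffer.extend(batch)
        # Step 3: snapshot the queryable data into a "Parquet file".
        # Assumption: the model drops the buffer and WAL entries afterwards.
        if now_s > 0 and now_s % SNAPSHOT_INTERVAL_S == 0 and self.queryable_buffer:
            self.parquet_files.append(list(self.queryable_buffer))
            self.queryable_buffer.clear()
            self.wal_files.clear()


model = WritePathModel()
for second in range(1, SNAPSHOT_INTERVAL_S + 1):
    model.write(f"cpu,host=a usage={second} {second}")
    model.tick(second)

print(len(model.parquet_files), len(model.queryable_buffer))  # prints "1 0"
```

After ten simulated minutes of one write per second, the model holds a single snapshot containing all 600 rows and an empty queryable buffer.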

To optimize write performance:

  • Parallel writes can be used to minimize the impact of the one-second delay.
  • Single-threaded clients are limited to one write request per second, which keeps the ingestion rate consistent.
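To make the trade-off concrete, here is a minimal sketch of the parallel-write idea using Python's standard thread pool. The `send_batch` function is a hypothetical stand-in for a real write call (it only counts lines), and `max_requests_per_second` is a back-of-the-envelope model that assumes each writer completes one request per flush interval.

```python
from concurrent.futures import ThreadPoolExecutor


def max_requests_per_second(parallel_writers: int, flush_interval_s: float = 1.0) -> float:
    """Rough upper bound, assuming each writer completes one request per flush interval."""
    return parallel_writers / flush_interval_s


def send_batch(batch: list[str]) -> int:
    """Hypothetical stand-in for a real write request; returns the lines 'written'."""
    return len(batch)


def write_in_parallel(batches: list[list[str]], writers: int) -> int:
    """Send batches concurrently and return the total number of lines sent."""
    with ThreadPoolExecutor(max_workers=writers) as pool:
        return sum(pool.map(send_batch, batches))


# Eight batches of 100 line-protocol rows each, sent by four concurrent writers.
batches = [[f"cpu,host=h{i} usage={i}"] * 100 for i in range(8)]
total_lines = write_in_parallel(batches, writers=4)

print(total_lines)                                             # prints 800
print(max_requests_per_second(1), max_requests_per_second(4))  # prints "1.0 4.0"
```

Under that simple model, four parallel writers raise the ceiling from one request per second to four. Batching more rows into each request is the other lever, since the limit applies to requests rather than rows.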

Processing Engine/Plug-In System

Another key component of the architecture is the built-in Processing Engine, which brings a new level of extensibility and control to both InfluxDB 3 Core and Enterprise. Check out this blog that shows you how to leverage the Processing Engine. For users familiar with Kapacitor in 1.x or Flux tasks in 2.x, the Processing Engine is a more powerful and fully integrated engine for acting on data as it arrives, on demand, or on a schedule, with multiple trigger types. The Processing Engine lets you transform and normalize data, combine multiple sources, trigger alerts and notifications, downsample, replicate data, and more, all through plugins. Imagine running a Python script on incoming data without the need for a separate server. Now, picture doing it with no network transfers in or out of the database. That’s the Processing Engine, and we think you are going to love it!
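To give a feel for what a plugin that acts on data as it arrives might look like, here is a hedged sketch. The entry-point name `process_writes`, its arguments, and the shape of `table_batches` (a list of dictionaries with `table_name` and `rows`) reflect my understanding of the plugin documentation and should be treated as assumptions; check the current docs before relying on them. `FakeLocal` is a made-up stand-in so the example runs outside the database.

```python
def process_writes(influxdb3_local, table_batches, args=None):
    """Log a message for each incoming row whose 'usage' exceeds a threshold."""
    # Assumption: trigger arguments arrive as a dict of strings.
    threshold = float((args or {}).get("threshold", 90))
    for batch in table_batches:
        for row in batch["rows"]:
            usage = row.get("usage")
            if usage is not None and usage > threshold:
                influxdb3_local.info(
                    f"{batch['table_name']}: usage {usage} above {threshold}"
                )


class FakeLocal:
    """Hypothetical stand-in for the object the engine passes to a plugin."""

    def __init__(self):
        self.messages = []

    def info(self, message):
        self.messages.append(message)


# Local dry run with two rows; only the first exceeds the threshold.
local = FakeLocal()
process_writes(
    local,
    [{"table_name": "cpu", "rows": [{"usage": 95.0}, {"usage": 12.5}]}],
    {"threshold": "90"},
)
print(local.messages)  # prints ['cpu: usage 95.0 above 90.0']
```

Keeping the logic in a plain function that takes its collaborators as arguments makes it easy to exercise with a stub like this before wiring it to a trigger.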

Enterprise Architecture Enhancements

InfluxDB 3 Core and InfluxDB 3 Enterprise can be installed on edge devices as small as a Raspberry Pi, making them ideal for embedded systems and sensor-based workloads. At the same time, they scale vertically to handle very large workloads. But what if you need to scale horizontally or ensure high availability without relying on a single node? That is where InfluxDB 3 Enterprise can take you even further.

InfluxDB 3 Enterprise includes all of Core’s features while offering a seamless upgrade for commercial deployments. Unlike Core, it is optimized for long-range historical queries through its use of a compaction process that organizes and catalogs Parquet file data.

High Availability

InfluxDB 3 Enterprise is built for stability across diverse workloads, from running on bare metal or VMs to Kubernetes. It delivers high availability, read replicas, and a dedicated compactor, making it ideal for enterprise-scale applications that require reliability and scalability.

Enterprise achieves high availability through its diskless architecture, using the shared object store as the common persistence layer for every node:

  • Nodes can act as readers and/or writers, allowing concurrent writes to object storage for horizontal ingestion.
  • Nodes write to dedicated areas, while downstream readers can pick up writes from multiple nodes and compacted Parquet files across all nodes.

If a node fails, another node can continue reading from the failed node’s storage and compacted Parquet files, ensuring uninterrupted access to data and maintaining system reliability.
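The failover behavior described above can be modeled in a few lines. This is a toy illustration, not the real storage layout: `ObjectStore` and `Node` are hypothetical classes, an in-memory dictionary stands in for the object store, and the per-node prefix naming is my own.

```python
from collections import defaultdict


class ObjectStore:
    """In-memory stand-in for a shared object store (hypothetical)."""

    def __init__(self):
        self._objects = defaultdict(list)   # prefix -> rows persisted under it

    def put(self, prefix: str, row: dict) -> None:
        self._objects[prefix].append(row)

    def scan(self) -> list:
        """Return rows from every writer's area, sorted by timestamp."""
        rows = [row for area in self._objects.values() for row in area]
        return sorted(rows, key=lambda r: r["time"])


class Node:
    """A node that can both write to and read from the shared store."""

    def __init__(self, name: str, store: ObjectStore):
        self.name, self.store, self.alive = name, store, True

    def write(self, row: dict) -> None:
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.store.put(f"{self.name}/", row)   # each node writes to its own area

    def read(self) -> list:
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        return self.store.scan()               # readers see every node's area


store = ObjectStore()
node_a, node_b = Node("node-a", store), Node("node-b", store)
node_a.write({"time": 1, "usage": 10})
node_b.write({"time": 2, "usage": 20})

node_a.alive = False                           # simulate a node failure
surviving_view = node_b.read()
print(surviving_view)
# prints [{'time': 1, 'usage': 10}, {'time': 2, 'usage': 20}]
```

Because the data lives in the shared store rather than on the failed node, the surviving node still returns the rows that `node-a` persisted before it went down.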

Read Replicas

In the InfluxDB 3 Enterprise architecture, read replication enables virtually unlimited read replicas, capped only by object storage request capacity.

  • A few nodes handle writes, while unlimited read replicas scale query performance.
  • Read replicas function as Processing Engines, providing flexibility in cluster setups.
  • Replicas pull data from all designated write nodes, compacting Parquet files for efficient queries.

When deploying InfluxDB 3 Enterprise, choosing the right cluster setup depends on workload size, availability requirements, and performance goals. Below are three recommended configurations.

Simplest High Availability: Two-Node Setup

The simplest setup is a two-node approach, where both nodes act as readers and writers, and one also runs the compactor. This design ensures fault tolerance: if one node goes down, the system keeps running without immediate downtime. Since the compactor operates on a read/write node, this setup is perfect for smaller workloads that prioritize cost efficiency while maintaining resilience.

Both nodes leverage the same object storage, allowing long-term storage and providing a cost-efficient, high-availability solution for production environments.

Three-Node Setup: Dedicated Compactor for Scalability

A three-node setup is the recommended approach for optimized performance and scalability. Two nodes handle reads and writes, while a third node is dedicated to compaction, preventing resource competition and ensuring efficient data processing.

This configuration allows the compactor to scale independently, enabling resource allocation based on real-time workload demands. With two read-write nodes operating at full capacity, queries and writes remain high-performing, while compaction runs in isolation without impacting other nodes.

This setup is ideal for workloads requiring consistent performance across reads and writes, ensuring reliability regardless of which node persists a request. It also facilitates vertical scaling of compaction resources to meet evolving data needs.

Five-Node+ Cluster: High-Throughput Scaling

A five-node+ InfluxDB 3 Enterprise cluster extends the three-node setup by adding dedicated write nodes, which increases ingest capacity and ensures efficient data persistence to object storage. Downstream, a dedicated compactor optimizes storage by compacting files while read nodes handle queries.

Read nodes can function as query servers or processing engines, enabling flexible, isolated query and processing workloads. This separation allows for arbitrary query and processing setups without interference.

With multiple write nodes for high-ingest throughput, multiple read nodes for scalable query performance, and a dedicated compactor for isolated processing, this setup is ideal for large-scale workloads that require horizontal scaling, query isolation, and high-performance processing.
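For a side-by-side view, the three configurations can be written down as plain data. This is a descriptive sketch only: `NodeSpec`, the role names, and the node names are invented for illustration and are not InfluxDB configuration syntax, and the two-writer, two-reader split in the five-node example is an assumption, since the text does not fix the exact counts.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    name: str
    roles: frozenset   # any of {"read", "write", "compact"}


def node(name: str, *roles: str) -> NodeSpec:
    return NodeSpec(name, frozenset(roles))


TWO_NODE = [
    node("node-1", "read", "write", "compact"),   # also runs the compactor
    node("node-2", "read", "write"),
]
THREE_NODE = [
    node("node-1", "read", "write"),
    node("node-2", "read", "write"),
    node("compactor", "compact"),                 # dedicated compactor
]
FIVE_NODE = [                                     # assumed 2 writers + 2 readers
    node("writer-1", "write"),
    node("writer-2", "write"),
    node("compactor", "compact"),
    node("reader-1", "read"),
    node("reader-2", "read"),
]


def role_counts(cluster: list) -> dict:
    """Count how many nodes carry each role."""
    counts = Counter(role for spec in cluster for role in spec.roles)
    return {role: counts[role] for role in ("read", "write", "compact")}


def has_dedicated_compactor(cluster: list) -> bool:
    """True if some node does compaction and nothing else."""
    return any(spec.roles == {"compact"} for spec in cluster)


for label, cluster in [("two", TWO_NODE), ("three", THREE_NODE), ("five", FIVE_NODE)]:
    print(label, role_counts(cluster), has_dedicated_compactor(cluster))
```

All three layouts end up with two readers, two writers, and one compactor in this sketch. What changes is how the roles are isolated: the two-node setup shares everything, the three-node setup isolates compaction, and the five-node setup also separates ingest from query.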

Share Your Feedback

We hope you enjoyed learning a little bit more about some of the highlights of the Core and Enterprise architectures. Let us know what you think! Check out our Getting Started Guide for Core and Enterprise, and share your feedback with our development team on Discord in the #influxdb3_core channel, on Slack in the #influxdb3_core channel, or on our Community Forums.