System Tables Part 2: How We Made It Faster
By
Chunchun Ye /
Developer
Oct 31, 2024
Navigate to:
In the first post, we introduced system tables and how to use them to inspect your cluster. In this follow up, we’ll explain some techniques to improve the speed of system table queries.
7. The problem
Before August 2024, querying system tables in Cloud Dedicated, particularly system.tables
, system.partitions
, and system.partitions
, often took a long time to run, even with filters applied. In some cases, queries timed out without returning any result.
8. Why it was slow
To understand the causes of slow queries, let’s first look at how system table data is generated.
Below is a simplified overview of the InfluxDB 3.0 architecture (referred to from the blog InfluxDB 3.0: System Architecture). The central component, Catalog1, stores metadata about databases, tables, columns, and file details like file size, location, created time, etc. On the right side, we have Querier, where the system tables reside and queries are executed.
Figure 1: InfluxDB 3.0 Architecture
The metadata in the Catalog is organized following the simplified data model shown below (Figure 2) : Figure 2: Simplified Catalog Metadata Data Model
As the InfluxDB Catalog serves as the central cluster coordinator, it provides a restricted interface that does not permit analytic-style queries, for performance and stability reasons. Because of this, the Querier must make multiple calls to the Catalog to gather the information required to fill the system tables. Figure 3 illustrates what a typical data flow looks like when querying a system table like system.partitions
.
system.partitions
table has a column that shows the total size of a partition in megabytes. To compute this, the Querier first gets all the tables in the database, then retrieves all partitions for each table, and looks up every Parquet file associated with each partition. After that, it adds up the file sizes and converts the total to megabytes. This process involves several requests between the Querier and the Catalog, as it requires multiple data retrievals, as shown in Figure 3.
Figure 3: Data Flow Between Querier and Catalog
8.1 The Performance Issue
Previously, when querying system tables like system.tables
, system.partitions
, or system.compactor
, the Catalog would scan all the metadata and send it to the Querier in gRPC format. The Querier would then convert the responses to Arrow record batches, and run the query using DataFusion, which would apply the filters to discard irrelevant data.
This meant that queries like these:
SELECT * FROM system.partitions;
SELECT * FROM system.partitions WHERE table_name = 'foo';
Both did the same amount of work in the Catalog and Querier, even though the second query has a filter. The system scanned everything, applying the filter at the Querier level, leading to unnecessary overhead and slow performance.
9. The solution
9.1 Predicate Pushdown
A predicate is a condition used to filter data in a query, such as table_name = ‘foo’
or age > 20
. We implemented predicate pushdown, a common database optimization technique, moving (or “pushing down”) certain filters (predicates) as close as possible to the data source (i.e., within the Catalog), as shown in Figure 4. This change reduced the amount of data fetched and transmitted to the Querier, which in turn reduced the workload on Querier.
Figure 4: Predicate Pushdown
Before we implemented predicate pushdown, a query like SELECT * FROM system.partitions WHERE table_name = 'foo'
fetched and formatted partition information for all tables in the Catalog, and the query engine promptly threw away everything except for foo
.
After predicate pushdown, the querier avoids fetching and formatting partition information that it determines will be filtered out during query execution.
9.2 Multiple Predicates Pushdown
In addition to supporting single predicate pushdown, we extended support to handle multiple filters. The simple examples in Section 9.1 are likely obvious, but in a real system, users can provide arbitrary predicates connected by AND
, OR
, IN
, etc., and it is non-trivial to determine which predicates to push down. For example, consider a query like:
SELECT *
FROM system.partitions
WHERE (table_name = 'foo' OR table_name = 'bar')
AND (partition_key = '2024-10|device-101' OR partition_key = '2024-09|device-101')
We used DataFusion’s LiteralGuarantee::analyze to parse and simplify query predicates before pushing them down to the Catalog. This method has saved us a lot of engineering time and effort. A big shout-out to the DataFusion community for making this so simple to use!
The current implementation supports multiple predicates pushdown for filters such as table_name
combined with partition_key
or table_name
combined with partition_id
, further reducing the amount of data processed by the Querier.
9.3 Concurrent Data Fetching
Previously, the Querier made sequential API calls to the Catalog, where each request had to wait for the previous one to complete before proceeding. This added significant latency, especially when querying large datasets.
We improved this by enabling concurrent API requests, allowing the Querier to make multiple requests simultaeneously This greatly reduced the time needed to gather all necessary metadata.
10. Performance improvements
Here is how much faster querying system tables become when using the filter WHERE table_name
:
- system.tables: 17% faster
- system.partitions: 65% faster
- system.compactor: 60% faster
These improvements are based on a database with 100 tables, over 200 partitions, and more than 3,000 parquet files. If your database has more tables, partitions, or Parquet files, you’ll see even more significant performance gains with filtered queries.
Additionally, the average query latency with multiple filters is around 20 ms for queries like:
WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...
11. Wrapping up
In this post, we explained how we improved the performance of system table queries by implementing predicate pushdown, optimizing the use of multiple filters, and enabling concurrent data fetching. These changes dramatically reduced query times, especially for databases with large amounts of metadata.
With these optimizations, you can retrieve relevant system data more efficiently and eliminate long debugging waits.
References:
- Note that in our actual deployments, there are several layers of caching that are not reflected in Figure 1.