System Tables Part 1: Introduction and Best Practices
By
Chunchun Ye /
Developer, Getting Started
Oct 29, 2024
Navigate to:
As an InfluxDB Cloud Dedicated or Clustered user, you may want to inspect your cluster to gain a better understanding of the size of your databases, tables, partitions, and compaction status. InfluxDB stores this essential metadata in system tables (described in Section 1), which help inform decisions about cluster performance and maintenance.
1. What are system tables?
System tables are “virtual” tables that present metadata for a specific database and provide insights into database storage. Each system table is scoped to a particular database and is read-only, meaning it cannot be modified.
System tables are hidden by default, as high-frequency access to these tables can interfere with the ongoing operations of the database. Thus, querying system tables requires a special debug header with the request. Once the debug header is added (described in Section 2), you can query system tables using SQL, similar to any other table in InfluxDB.
Here are the system tables that InfluxDB provides:
+---------------+--------------------+-------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------------+-------------+------------+
| ... | ... | ... | ... |
| public | system | compactor | BASE TABLE |
| public | system | partitions | BASE TABLE |
| public | system | queries | BASE TABLE |
| public | system | tables | BASE TABLE |
| ... | ... | ... | ... |
+---------------+--------------------+-------------+------------+
In this blog, we will focus on three tables:
system.tables
system.partitions
system.compactor
Table | Description | Schema |
system.tables |
Contains information about tables, such as table name and partition template in the specific database. | Link |
system.partitions |
Contains information about partitions, partition sizes, file count, etc. | Link |
system.compactor |
Contains detailed information about compacted partitions at different compaction levels. | Link |
Warning: System tables are not part of InfluxDB’s stable API. They are subject to change, and compatibility is not guaranteed.
Warning: Querying system tables may impact write and query performance. Use them only for debugging purposes and use filters to optimize queries and minimize their impact on your cluster.
2. Accessing system tables
To access system tables, you must provide a debug header with the request. The specific commands to add this header vary depending on the client you are using.
influxctl CLI
For influxctl
, set the --enable-system-tables
header:
influxctl query \
--enable-system-tables \
--database DATABASE_NAME \
--token DATABASE_TOKEN \
"SQL_QUERY"
Arrow Flight SQL or other client libraries
For Arrow Flight SQL or other client libraries, such as Go and Python, set the iox-debug
header to true
.
3. Querying system tables: Examples
1. View the partition template of a specific table
SELECT * FROM system.tables WHERE table_name = 'TABLE_NAME'
Example Result:
+-----------------+--------------------------------------------------------+
| table_name | partition_template |
+-----------------+--------------------------------------------------------+
| your_table_name | {"parts":[{"timeFormat":"%Y-%m"},{"tagValue":"col1"}]} |
+-----------------+--------------------------------------------------------+
If a table doesn’t include a partition template in the output of this command, the table uses the default (1 day) partition strategy and doesn’t partition by tags.
2. View the number of partitions and total size per table
SELECT
table_name,
COUNT(*) AS partition_count,
SUM(total_size_mb) AS total_size_mb
FROM system.partitions
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name
Example Result:
+------------+-----------------+---------------+
| table_name | partition_count | total_size_mb |
+------------+-----------------+---------------+
| foo | 1 | 2 |
| bar | 4 | 5 |
| baz | 10 | 23 |
+------------+-----------------+---------------+
3. View the size for different levels of compacted files*
SELECT
table_name,
SUM(total_l0_files) AS l0_files,
SUM(total_l1_files) AS l1_files,
SUM(total_l2_files) AS l2_files,
SUM(total_l0_bytes) AS l0_bytes,
SUM(total_l1_bytes) AS l1_bytes,
SUM(total_l2_bytes) AS l2_bytes
FROM system.compactor
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name
*Compacted files are compressed Parquet files processed by the Compactor to optimize storage. These files have different compaction levels: L0, L1, and L2. L0, or “Level 0”, represents newly ingested, uncompacted small files, while L2, or “Level 2”, represents compacted, non-overlapping files.
Example Result:
+------------+----------+----------+----------+----------+----------+----------+
| table_name | l0_files | l1_files | l2_files | l0_bytes | l1_bytes | l2_bytes |
+------------+----------+----------+----------+----------+----------+----------+
| foo | 0 | 1 | 0 | 0 | 20659 | 0 |
| bar | 0 | 1 | 0 | 0 | 7215 | 0 |
| baz | 0 | 1 | 0 | 0 | 10784 | 0 |
+------------+----------+----------+----------+----------+----------+----------+
4. Optimize queries to reduce cluster impact
Querying system tables can degrade the performance of other common queries, especially if you are trying to view every detail in clusters with hundreds of tables, hundreds of thousands of partitions, and millions of Parquet files.
To reduce the performance impact, we suggest selecting information for a specific table or a particular partition by adding filters as follows:
WHERE table_name = '...'
WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...
WHERE partition_id = ...
See documents on how to obtain partition_key and partition_id.
5. Use the most efficient filters
Among the above filters, the following filters are specially optimized and significantly reduce query latency to about 20 ms, even on our largest clusters:
WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...
6. Bringing it home
In this first post, we introduced system tables, explained how to access them, and discussed how to optimize your queries with filters. In the next post, we will explain how we improved the performance of system tables.
References: