System Tables Part 1: Introduction and Best Practices

Navigate to:

As an InfluxDB Cloud Dedicated or Clustered user, you may want to inspect your cluster to gain a better understanding of the size of your databases, tables, partitions, and compaction status. InfluxDB stores this essential metadata in system tables (described in Section 1), which help inform decisions about cluster performance and maintenance.

1. What are system tables?

System tables are “virtual” tables that present metadata for a specific database and provide insights into database storage. Each system table is scoped to a particular database and is read-only, meaning it cannot be modified.

System tables are hidden by default, as high-frequency access to these tables can interfere with the ongoing operations of the database. Thus, querying system tables requires a special debug header with the request. Once the debug header is added (described in Section 2), you can query system tables using SQL, similar to any other table in InfluxDB.

Here are the system tables that InfluxDB provides:

+---------------+--------------------+-------------+------------+
| table_catalog | table_schema       | table_name  | table_type |
+---------------+--------------------+-------------+------------+
| ...           | ...                | ...         | ...        |
| public        | system             | compactor   | BASE TABLE |
| public        | system             | partitions  | BASE TABLE |
| public        | system             | queries     | BASE TABLE |
| public        | system             | tables      | BASE TABLE |
| ...           | ...                | ...         | ...        |
+---------------+--------------------+-------------+------------+

In this blog, we will focus on three tables:

  • system.tables
  • system.partitions
  • system.compactor
Table Description Schema
system.tables Contains information about tables, such as table name and partition template in the specific database. Link
system.partitions Contains information about partitions, partition sizes, file count, etc. Link
system.compactor Contains detailed information about compacted partitions at different compaction levels. Link

Warning: System tables are not part of InfluxDB’s stable API. They are subject to change, and compatibility is not guaranteed.

Warning: Querying system tables may impact write and query performance. Use them only for debugging purposes and use filters to optimize queries and minimize their impact on your cluster.

2. Accessing system tables

To access system tables, you must provide a debug header with the request. The specific commands to add this header vary depending on the client you are using.

influxctl CLI

For influxctl, set the --enable-system-tables header:

influxctl query \
  --enable-system-tables \
  --database DATABASE_NAME \
  --token DATABASE_TOKEN \
  "SQL_QUERY"

Arrow Flight SQL or other client libraries

For Arrow Flight SQL or other client libraries, such as Go and Python, set the iox-debug header to true.

3. Querying system tables: Examples

1. View the partition template of a specific table

SELECT  *  FROM  system.tables  WHERE  table_name  =  'TABLE_NAME'

Example Result:

+-----------------+--------------------------------------------------------+
| table_name      | partition_template                                     |
+-----------------+--------------------------------------------------------+
| your_table_name | {"parts":[{"timeFormat":"%Y-%m"},{"tagValue":"col1"}]} |
+-----------------+--------------------------------------------------------+

If a table doesn’t include a partition template in the output of this command, the table uses the default (1 day) partition strategy and doesn’t partition by tags.

2. View the number of partitions and total size per table

SELECT
  table_name,
  COUNT(*) AS partition_count,
  SUM(total_size_mb) AS total_size_mb
FROM system.partitions
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name

Example Result:

+------------+-----------------+---------------+
| table_name | partition_count | total_size_mb |
+------------+-----------------+---------------+
| foo        | 1               | 2             |
| bar        | 4               | 5             |
| baz        | 10              | 23            |
+------------+-----------------+---------------+

3. View the size for different levels of compacted files*

SELECT
  table_name,
  SUM(total_l0_files) AS l0_files,
  SUM(total_l1_files) AS l1_files,
  SUM(total_l2_files) AS l2_files,
  SUM(total_l0_bytes) AS l0_bytes,
  SUM(total_l1_bytes) AS l1_bytes,
  SUM(total_l2_bytes) AS l2_bytes
FROM system.compactor
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name

*Compacted files are compressed Parquet files processed by the Compactor to optimize storage. These files have different compaction levels: L0, L1, and L2. L0, or “Level 0”, represents newly ingested, uncompacted small files, while L2, or “Level 2”, represents compacted, non-overlapping files.

Example Result:

+------------+----------+----------+----------+----------+----------+----------+
| table_name | l0_files | l1_files | l2_files | l0_bytes | l1_bytes | l2_bytes |
+------------+----------+----------+----------+----------+----------+----------+
| foo        | 0        | 1        | 0        | 0        | 20659    | 0        |
| bar        | 0        | 1        | 0        | 0        | 7215     | 0        |
| baz        | 0        | 1        | 0        | 0        | 10784    | 0        |
+------------+----------+----------+----------+----------+----------+----------+

4. Optimize queries to reduce cluster impact

Querying system tables can degrade the performance of other common queries, especially if you are trying to view every detail in clusters with hundreds of tables, hundreds of thousands of partitions, and millions of Parquet files.

To reduce the performance impact, we suggest selecting information for a specific table or a particular partition by adding filters as follows:

WHERE table_name = '...'
WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...
WHERE partition_id = ...

See documents on how to obtain partition_key and partition_id.

5. Use the most efficient filters

Among the above filters, the following filters are specially optimized and significantly reduce query latency to about 20 ms, even on our largest clusters:

WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...

6. Bringing it home

In this first post, we introduced system tables, explained how to access them, and discussed how to optimize your queries with filters. In the next post, we will explain how we improved the performance of system tables.

References: