TL;DR InfluxDB Tech Tips – How to Interpret an Annotated CSV
By
Anais Dotis-Georgiou /
Use Cases, Developer, Product
Jul 15, 2020
Navigate to:
In this post we share how to interpret an Annotated CSV, the Flux query result format for InfluxDB.
Q: What is an Annotated CSV?
A: An Annotated CSV is the output format of a Flux query with InfluxDB 2.0. For example, imagine that we’re writing data about the number of “calico” and “tabby” cats, both “young” and “adult” cats, in two shelters, “A” and “B”. The data layout looks like this:
Bucket | "cats-and-dogs" |
Measurement | "cats" |
Tag Keys |
Tag Values |
"shelter" | "A", "B" |
"type" | "calico" , "tabby" |
Field Keys |
"young" , "adult" |
If we query our data with:
from(bucket: "cats-and-dogs")
|> range(start: 2020-05-15T00:00:00Z, stop: 2020-05-16T00:00:00Z)
|> filter(fn: (r) => r["_measurement"] == "cats")
|> filter(fn: (r) => r["_field"] == "adult")
|> filter(fn: (r) => r["shelter"] == "A")
|> filter(fn: (r) => r["type"] == "calico")
|> limit(n:2)
This is what our Annotated CSV result looks like:
#group,false,false,true,true,false,false,true,true,true,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string,string,string,string
#default,_result,,,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement,shelter,type
,,0,2020-05-15T00:00:00Z,2020-05-16T00:00:00Z,2020-05-15T18:50:33.262484Z,8,adult,cats,A,calico
,,0,2020-05-15T00:00:00Z,2020-05-16T00:00:00Z,2020-05-15T18:51:48.852085Z,7,adult,cats,A,calico
Understanding the Annotations
Q: What are those “hashtag headers” of an Annotated CSV?
A: First, what might look like CSV headers are called Annotations. The three annotations, in order, are:
#group
: A boolean that indicates the column is part of the group key. A group key is a list of columns for which every row in the table has the same value. Let's look at the difference between thetrue
andfalse
columns, or the columns that are and aren't part of the group key, respectively .true
columns: In our example query above, we've filtered by a single field type,adult
, a single"shelter"
tag, "A", and a single"type"
tag,"calico"
. These values are constant across rows, so those columns are set totrue
. Also, filtering for a single value across tags and fields means that all of our tables will belong to the same table. Therefore thetable
column is alsotrue
for this query. The_start
and_stop
columns, defined by our range, are constants across the rows so these values are alsotrue
.false
columns: The_time
and_value
columns have different values values across rows which is why they receive afalse
for the value of the#group
Annotation.
#datatype
: Describes the type of data or which line protocol element the column represents.#default
: The value to use for rows with an empty value. So for example, if we had assigned our query to the variableourQuery
this annotation would look like:#default,ourQuery,,,,,,,,,
Because the #group
Annotation is especially tricky, let’s examine it further with another example. Imagine that we filter for two "shelters"
. We execute the following query:
from(bucket: "cats-and-dogs")
|> range(start: 2020-05-15T00:00:00Z, stop: 2020-05-16T00:00:00Z)
|> filter(fn: (r) => r["_measurement"] == "cats")
|> filter(fn: (r) => r["_field"] == "adult")
|> filter(fn: (r) => r["shelter"] == "A" or r["shelter"] == "B")
|> filter(fn: (r) => r["type"] == "calico")
|> limit(n:2)
Our Annotated CSV looks like:
#group,false,false,true,true,false,false,true,true,false,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string,string,string,string
#default,_result,,,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement,shelter,type
,,0,2020-05-15T00:00:00Z,2020-05-16T00:00:00Z,2020-05-15T18:50:33.262484Z,9,adult,cats,B,calico
,,0,2020-05-15T00:00:00Z,2020-05-16T00:00:00Z,2020-05-15T18:51:48.852085Z,7,adult,cats,B,calico
,,1,2020-05-15T00:00:00Z,2020-05-16T00:00:00Z,2020-05-15T18:50:33.262484Z,8,adult,cats,A,calico
,,1,2020-05-15T00:00:00Z,2020-05-16T00:00:00Z,2020-05-15T18:51:48.852085Z,7,adult,cats,A,calico
Now, the #group
Annotation for the shelter
column is false
. We would now have two group keys as well. This means that our data has two tables, so likewise the #group
Annotation for the “table” column is false
.
Q: What can I do with Annotated CSV’s?
A: There are several things you can do:
- You can export data as a CSV with the InfluxDB 2.0 UI.
- You can create temporary data with the from.csv() function or you can write that data to InfluxDB if you follow from.csv() with to().
Q: I’m new to InfluxDB and Annotated CSV is confusing, how can I work with regular CSV and InfluxDB 2.0?
A: There are several ways to write regular CSV to InfluxDB. Some ways include:
Tip: I encourage you to try different queries and view the Annotated CSV results in the InfluxDB 2.0 UI in the Raw Data viewer, as the Annotated CSV is divided into columns which makes it easier to read.