TL;DR InfluxDB Tech Tips - InfluxDB Kapacitor

By Regan Kuchan / Developer
May 26, 2016

Navigate to:

In this post we recap the week’s most interesting InfluxDB Kapacitor and TICK-stack related issues, workarounds, how-tos and Q&A from GitHub, IRC and the InfluxDB Google Group that you might have missed.

Preserving tags with INTO queries

Q: I’m using an INTO query to move data to a new measurement. InfluxDB says it wrote the correct number of points, but when I query the data in the new measurement, I’m missing a point.

In my data, color is a tag and name is a field.

> SELECT * FROM "french_bulldogs"
name: french_bulldogs
---------------------
time                  color  name
2016-05-25T00:05:00Z  peach  nugget
2016-05-25T00:05:00Z  grey   rumple
2016-05-25T00:10:00Z  black  prince

> SELECT * INTO "all_dogs" FROM "french_bulldogs"
name: result
------------
time                  written
1970-01-01T00:00:00Z  3

> SELECT * FROM "all_dogs"
name: all_dogs
--------------
time                  color  name
2016-05-25T00:05:00Z  grey   rumple
2016-05-25T00:10:00Z  black  prince

What happened to nugget?

A: If you use SELECT * with INTO, InfluxDB converts any tags in the current measurement to fields in the new measurement. This can cause InfluxDB to overwrite points that were previously differentiated by a tag.

In your case, InfluxDB converts color from a tag to a field when it writes data to all_dogs. Because nugget and rumple have the same timestamp and are no longer differentiated by a tag, InfluxDB overwrites the nugget point with the rumple point.

Use GROUP BY * to preserve tags as tags with INTO queries:

> SELECT "name" INTO "all_dogs" FROM "french_bulldogs" GROUP BY *
name: result
------------
time                  written
1970-01-01T00:00:00Z  3

> SELECT * FROM "all_dogs"
name: all_dogs
--------------
time                  color  name
2016-05-25T00:05:00Z  peach  nugget
2016-05-25T00:05:00Z  grey   rumple
2016-05-25T00:10:00Z  black  prince

To stream or not to stream with Kapacitor

Q: I’m using Kapacitor to join stream data with shifted stream data and I write the joined data back to InfluxDB. Will this work if the shift covers a long time range (for example, 24 hours)? Can Kapacitor handle that amount of data in RAM?

var current_data = stream
   | from()
     .measurement('sparkle_trees')

var previous_data = stream
   | from()
     .measurement('sparkle_trees')
   | shift(24h)

current_data
   |join(previous_data)
    .tolerance(10s)
    .as('current_data','previous_data')
   |influxDBOut()
    .database('enchanted_forest')
    .retentionPolicy('default')
    .measurement('joined_data')

A: Your TICKscript will work but, in practice, it’s not a good idea. First, every time you restart Kapacitor you will have to wait 24 hours before it does any joining. Second, you will have to buffer 24 hours worth of data in RAM; this is impractical unless you have a very small ingestion rate.

We recommend using a batch task instead of a stream task. With a batch task, Kapacitor can simply query old data and start immediately processing the join when Kapacitor first starts up. In addition, Kapacitor only needs to fit data for a single batch in RAM.

For example:

var current_data = batch
   | query('SELECT * FROM "enchanted_forest"."default"."sparkle_trees"')
     .period(5m)
     .every(5m)

var previous_data = batch
   | query('SELECT * FROM "enchanted_forest"."default"."sparkle_trees"')
     .period(5m)
     .every(5m)
     // Select data -24h from the current time
     .offset(24h)
   // Shift data to match current time
   | shift(24h)

current_data
   |join(previous_data)
    .as('current_data','previous_data')
   |influxDBOut()
    .database('enchanted_forest')
    .retentionPolicy('default')
    .measurement('joined_data')

When to double quote identifiers in queries

Q: I’ve successfully written data to InfluxDB, but I get an unexpected error when I query my data:

> INSERT monsters exists=true
> SELECT exists FROM monsters
ERR: error parsing query: found EXISTS, expected identifier, string, number, bool at line 1, char 8

A: The field key exists is an InfluxQL keyword. Identifiers that are also InfluxQL keywords must be double-quoted in queries:

> SELECT "exists" FROM monsters
name: monsters
--------------
time				             exists
2016-05-25T00:36:01.714188435Z   true
2016-05-25T00:36:19.30227374Z	 true

See our docs for a complete list of InfluxQL keywords.

For more InfluxDB tips, check out our Frequently Encountered Issues page and feel free to post your questions in the InfluxDB users group.

What's next?

Download and get started with InfluxDB v0.13
Schedule a FREE 20 minute consultation with a Solutions Architect to review your InfluxDB project
Attend one of our FREE virtual training seminars.
Looking for InfluxDB Clustering on your infrastructure? Contact Sales to get a demo of InfluxDB Enterprise plus pricing information.

Navigate to:

Try InfluxDB Cloud

Stop flying blind

TL;DR InfluxDB Tech Tips - InfluxDB Kapacitor

By Regan Kuchan / Developer
May 26, 2016

Navigate to:

Preserving tags with INTO queries

To stream or not to stream with Kapacitor

When to double quote identifiers in queries

What's next?

Ready to get started?

InfluxDB 3 Core & Enterprise GA: The Next Generation Time Series Platform for Developers is Here

Data Lakes and Warehouses

InfluxDB for Industrial IoT:
A Live Demonstration

Time Series Databases Explained

Network Monitoring

Time Series Data Analysis: Definitions and Best Techniques in 2025

Product & Solutions

Developers

Company

Navigate to:

Try InfluxDB Cloud

Stop flying blind

Get Updates

TL;DR InfluxDB Tech Tips - InfluxDB Kapacitor

By Regan Kuchan / Developer May 26, 2016

Navigate to:

Preserving tags with INTO queries

To stream or not to stream with Kapacitor

When to double quote identifiers in queries

What's next?

Ready to get started?

InfluxDB 3 Core & Enterprise GA: The Next Generation Time Series Platform for Developers is Here

Data Lakes and Warehouses

InfluxDB for Industrial IoT: A Live Demonstration

Time Series Databases Explained

Network Monitoring

Time Series Data Analysis: Definitions and Best Techniques in 2025

Product & Solutions

Developers

Company

Sign up for the InfluxData newsletter

Follow Us

By Regan Kuchan / Developer
May 26, 2016

InfluxDB for Industrial IoT:
A Live Demonstration