Announcing Flux (formerly IFQL) - A New Query Language and Engine for InfluxDB
By Paul Dix / Product, Developer
Nov 14, 2017
Today we’re releasing an early alpha of IFQL, or Influx Query Language. [IFQL was later renamed Flux and is referred to as such below.] It’s an all-new query language and execution engine that works out of the box with InfluxDB 1.4. It’s implemented as a standalone binary that runs alongside InfluxDB. This represents the biggest advance in the InfluxDB API since 0.9 was released over two years ago. However, unlike that release, this one is compatible with the latest InfluxDB.
As a language, Flux looks very similar to JavaScript. It is a series of functions chained off each other like you’d see in jQuery or D3 code. Here’s an example query:
select(db:"foo")
  .where(exp:{"_measurement"=="cpu" AND "_field"=="usage_system" AND "service"=="app-server"})
  .range(start:-12h)
  .window(every:10m)
  .max()
That query is equivalent to this InfluxQL query:
SELECT max(usage_system) FROM "foo".."cpu" WHERE "service" = 'app-server' AND time > now() - 12h GROUP BY time(10m), *
Our goal with Flux was to create a language that works more naturally with time series data. We felt that the functional paradigm matches the kinds of things you’d want to do more closely than SQL does. We took inspiration from Pandas in Python and the Tidyverse projects in R. Data frames get passed from function to function, with each function performing some operation on them. Of course the underlying implementation details differ, but conceptually users can think of it like this.
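To make the mental model concrete, here is a minimal Python sketch of a frame being passed from one chained function to the next. It is not Flux and not how the engine is implemented: the `Frame` class, the sample rows, and the use of plain seconds for timestamps are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (seconds since start, host tag, value).
ROWS = [
    (0, "a", 1.0),
    (300, "a", 4.0),
    (600, "a", 2.0),
    (900, "a", 3.0),
    (0, "b", 7.0),
    (700, "b", 5.0),
]


class Frame:
    """Toy stand-in for a data frame; every method returns a new Frame."""

    def __init__(self, rows, every=None):
        self.rows = list(rows)
        self.every = every  # window size in seconds, or None if unwindowed

    def where(self, predicate):
        # Keep only the rows for which the predicate is true.
        return Frame((r for r in self.rows if predicate(r)), self.every)

    def range(self, start, stop):
        # Keep rows whose timestamp falls in [start, stop).
        return Frame((r for r in self.rows if start <= r[0] < stop), self.every)

    def window(self, every):
        # Record the window size; the next aggregate applies per window.
        return Frame(self.rows, every)

    def max(self):
        # Group by (host, window start) and keep the largest value per group.
        buckets = defaultdict(list)
        for ts, host, value in self.rows:
            start = ts - ts % self.every if self.every else 0
            buckets[(host, start)].append(value)
        return Frame(
            (start, host, max(vals))
            for (host, start), vals in sorted(buckets.items())
        )


result = (
    Frame(ROWS)
    .where(lambda r: r[1] == "a")
    .range(start=0, stop=1200)
    .window(every=600)
    .max()
)
print(result.rows)
```

Each call returns a new frame, so the steps can be reordered or extended without touching the others, which is the property the chained style is meant to provide.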
The functions have named parameters. This makes queries less brittle and the code easier to read. With positional parameters you end up having to look up function definitions to find out what each argument is for. Also, with named parameters, new parameters can be added to functions in the future without breaking existing client code.
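The compatibility argument can be illustrated with a small Python sketch that uses keyword-only arguments. The function `window_starts` and its `offset` parameter are hypothetical and exist only for this example; they are not part of Flux.

```python
def window_starts(timestamps, *, every, offset=0):
    """Return the start of the window that each timestamp falls into.

    `offset` stands in for a parameter added in a later release. Because
    callers pass arguments by name and `offset` has a default, calls
    written before it existed keep working unchanged.
    """
    return [ts - (ts - offset) % every for ts in timestamps]


# A call written against the original signature.
old_call = window_starts([0, 300, 650], every=600)

# A newer call that opts in to the added parameter.
new_call = window_starts([0, 300, 650], every=600, offset=60)

print(old_call, new_call)
```

With positional parameters, inserting a new argument anywhere but the end would silently change the meaning of existing calls.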
The where function has a special expression language for specifying predicates. This was one place where we didn’t go with a pure functional approach. Instead, we opted for this predicate language that looks like the where clause in SQL. For client library authors or builders, we’ll be adding functionality to specify this predicate as a JSON object. It’s just easier to type and read with this special language.
The goal was not to create the most terse language. Certainly, some queries will be more verbose than their InfluxQL or PromQL equivalents. Our goal was to design for readability, flexibility, and extensibility. New functions can be introduced without changing the semantics of the language itself.
Flux decouples the query engine from the storage tier. Our goal was to be able to spin up new query processors on the fly that could pull data from any number of InfluxDB nodes. This also opens up future work for workload isolation and data science tasks that can be run outside the database.
There are queries that you can do in Flux today that you can’t do in InfluxQL. For example, HAVING queries:
select(db:"foo")
  .where(exp:{"_measurement"=="foo"})
  .range(start:-12h)
  .window(every:10m)
  .sum()
  .where(exp:{$ > 100}) // this is the having
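For readers unfamiliar with HAVING, the idea is to filter on the aggregated value rather than on the raw points. A minimal Python sketch of that two-step shape follows; the sample points and the helper names `window_sum` and `having` are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical points: (seconds since start, value).
POINTS = [(0, 40), (200, 70), (650, 30), (900, 20), (1300, 150)]


def window_sum(points, every):
    """Sum values per fixed-size window, keyed by window start."""
    totals = defaultdict(int)
    for ts, value in points:
        totals[ts - ts % every] += value
    return dict(sorted(totals.items()))


def having(aggregates, predicate):
    """Filter on the aggregated values, like the second where() in the query above."""
    return {start: total for start, total in aggregates.items() if predicate(total)}


sums = window_sum(POINTS, every=600)
over_100 = having(sums, lambda total: total > 100)
print(sums, over_100)
```

Because the filter is just another function in the chain, it can be applied after any aggregate instead of requiring a dedicated clause.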
Or math across measurements:
// math across measurements, or dbs
var a = select(db:"foo").where(exp:{"_measurement"=="foo"}).last()
var b = select(db:"foo").where(exp:{"_measurement"=="bar"}).last()
b.join(on:["host"], exp: {$ + a})
In Flux, everything is represented as tag key/value pairs. So measurement and field names can be accessed through the special _measurement and _field keys. This kind of structure is also what Prometheus looks like under the hood.
That example also shows that Flux has variables that can be assigned to. This is similar to functionality available in Kapacitor.
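The cross-measurement example can be sketched in Python as follows. This is an illustration under stated assumptions, not the engine's behavior: rows are plain dicts in which the measurement is just another key, the `time` and `value` keys are hypothetical names, and the join is assumed to be an inner join on `host`.

```python
# Hypothetical rows; the measurement name is stored like any other tag.
ROWS = [
    {"_measurement": "foo", "host": "h1", "time": 1, "value": 10},
    {"_measurement": "foo", "host": "h1", "time": 2, "value": 12},
    {"_measurement": "foo", "host": "h2", "time": 2, "value": 5},
    {"_measurement": "bar", "host": "h1", "time": 2, "value": 3},
    {"_measurement": "bar", "host": "h3", "time": 2, "value": 9},
]


def last_by_host(rows, measurement):
    """Return the most recent value per host for one measurement."""
    latest = {}
    for row in sorted(rows, key=lambda r: r["time"]):
        if row["_measurement"] == measurement:
            latest[row["host"]] = row["value"]
    return latest


a = last_by_host(ROWS, "foo")
b = last_by_host(ROWS, "bar")

# Join on host and add the two values; hosts missing from either side are
# dropped (the inner-join behavior is an assumption of this sketch).
joined = {host: b[host] + a[host] for host in b if host in a}
print(joined)
```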
Our goal with Flux is that it will be the one language that can be used across the platform, regardless of whether you’re doing interactive querying against the database or running processing or monitoring and alerting tasks in Kapacitor.
Another design goal we had with Flux was to decouple the query language from the execution engine. When a Flux query gets parsed, it gets structured into a DAG (directed acyclic graph) that can be represented as JSON. We’ll be building parsers for InfluxQL, TICKscript and (likely) PromQL to run on top of the Flux engine. That means the new engine will work for SQL-style queries or the new Flux language, which is where the bulk of new features and functionality will land when Flux is production-ready.
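As a rough illustration of what a query structured as a DAG and serialized to JSON might look like, here is a Python sketch. The node and edge field names (`id`, `op`, `params`, `parent`, `child`) are hypothetical and are not the engine's actual representation; the point is only that any front-end parser could emit the same structure.

```python
import json


def build_dag(steps):
    """Turn an ordered list of (operation, params) into nodes and edges.

    A simple chained query yields a linear graph; a join would add a node
    with two parents. All field names here are hypothetical.
    """
    nodes = [
        {"id": f"{name}{i}", "op": name, "params": params}
        for i, (name, params) in enumerate(steps)
    ]
    edges = [
        {"parent": nodes[i]["id"], "child": nodes[i + 1]["id"]}
        for i in range(len(nodes) - 1)
    ]
    return {"nodes": nodes, "edges": edges}


steps = [
    ("select", {"db": "foo"}),
    ("range", {"start": "-12h"}),
    ("window", {"every": "10m"}),
    ("max", {}),
]
dag = build_dag(steps)
encoded = json.dumps(dag, indent=2)
print(encoded)
```

Since the engine would only see this intermediate form, a parser for another language needs to produce the same graph and nothing more.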
We’re open sourcing the project today and putting out initial builds of a Docker image and deb and rpm packages. The code is up on GitHub in the Flux repo. It’s licensed under the AGPL v3 license or under a commercial license through us. We’ll be treating this in very much the same way that MongoDB treats their AGPL license. We want to enable new projects and startups to use the software for free, but also give ourselves the ability to monetize the managed hosting of it. Basically, if you’re not hosting it as a service for your users then it’s free to use, but if you plan on hosting it as a service like a hosting provider, then you’d need to have a commercial license unless you want to open source the code of your hosting platform. We went this way because of what we see happening in the open source infrastructure space. So it’s a dual licensing model: AGPL for free open source users and a commercial license for those that require it.
Over the coming months we’ll be adding a Flux query builder to Chronograf. We’ll be continuously adding features to Flux and releasing on a regular basis.
We’d like to get feedback from the community and iterate on the API. Our goal is to lock this down early next year so we can move to real production-ready releases. For more details on the language, you can read through an early writeup on the Flux spec. This release is by no means complete and we’re adding to it daily. We’re tracking what additional query features need to be wired up here.
In the meantime, head over to the Flux repo for instructions on how to get it set up.