A Long Time Ago, on a Server Far, Far Away…
By Jason Myers / Use Cases, Developer
Oct 09, 2023
This article was originally published on The New Stack and is reposted here with permission.
Here is a brief case study that explores the logistics and motivations that would lead a successful company to spend time and resources completely rewriting the core of their flagship product in Rust.
Calling a programming language Rust almost seems like a misnomer. Rust is the brittle byproduct of corrosion — not something that would typically inspire confidence. But fortunately, software developers have very different concerns from metallurgists. In the digital realm, Rust is a game changer.
InfluxData makes InfluxDB, the leading time series database in the world. When it comes to time series data use cases, the 1.x and 2.x iterations of InfluxDB are great for metrics. They can handle analytics use cases to a certain extent, but there was always a danger that high-cardinality data, meaning data with a very large number of unique series, would degrade database performance.
The vision for InfluxDB is not to simply master metrics, but to provide solutions for all time series use cases. To achieve this, developers needed to solve the cardinality problem. Doing so would throw open the floodgates for time series data and InfluxDB.
As developers sought solutions for the cardinality problem, it became clear that they needed to rewrite significant portions of InfluxDB’s core. They needed to build a columnar database and, as a company with roots in, and a commitment to, open source, they turned to Apache Arrow for the columnar framework of the new database version. Versions 1.x and 2.x were written in Go, but for this new stack InfluxData founder and CTO Paul Dix saw an opportunity to try something different. Enter Rust.
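To make the idea of a columnar layout concrete, here is a minimal sketch in plain Rust. It does not use Apache Arrow or any InfluxDB code; the `Row` and `Columns` types and their field names are hypothetical and exist only for illustration. The point is that an aggregate over one field only has to scan that field's contiguous vector.

```rust
/// Row-oriented layout: each record keeps all of its fields together.
/// (Hypothetical type, for illustration only.)
struct Row {
    timestamp: i64,
    host: String,
    cpu_usage: f64,
}

/// Column-oriented layout: each field lives in its own contiguous vector.
/// (Hypothetical type, not the Apache Arrow API.)
#[derive(Default)]
struct Columns {
    timestamps: Vec<i64>,
    hosts: Vec<String>,
    cpu_usage: Vec<f64>,
}

impl Columns {
    /// Converts row-oriented records into the column-oriented layout.
    fn from_rows(rows: Vec<Row>) -> Self {
        let mut cols = Columns::default();
        for row in rows {
            cols.timestamps.push(row.timestamp);
            cols.hosts.push(row.host);
            cols.cpu_usage.push(row.cpu_usage);
        }
        cols
    }

    /// Averages a single column without touching timestamps or hosts.
    /// Returns `None` when there is no data.
    fn mean_cpu_usage(&self) -> Option<f64> {
        if self.cpu_usage.is_empty() {
            return None;
        }
        let total: f64 = self.cpu_usage.iter().sum();
        Some(total / self.cpu_usage.len() as f64)
    }
}
```

For example, building `Columns` from two rows with `cpu_usage` values of 0.25 and 0.75 and calling `mean_cpu_usage()` returns `Some(0.5)`. Real columnar formats add typed buffers, null handling and compression on top of this basic idea.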
Why Rust?
Rust has many attractive features for developers. The real-time nature of time series data brings significant performance demands, and Rust has the performance characteristics to meet them. For example, Rust promotes what its community calls fearless concurrency: the compiler enforces rules about how data is shared between threads, which helps developers mitigate or eliminate subtle bugs such as data races. Another benefit of this approach is that it makes applications easier to refactor without introducing new bugs. The borrow checker is another critical aspect of Rust. It enforces the language’s ownership and borrowing rules at compile time, which lets Rust manage memory safely, and the compiler also requires every variable to be initialized before it is used. This prevents mistakes such as unintentionally using a value after it has been moved or freed.
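As a small illustration of these guarantees, the sketch below sums a slice on several threads using scoped threads from the standard library. The function name `parallel_sum` and its parameters are assumptions made for this example, not code from InfluxDB. Each worker only borrows its own chunk, and the compiler checks that no borrow outlives the data.

```rust
use std::thread;

/// Sums `values` by splitting them into roughly `parts` chunks and
/// adding up each chunk on its own thread. (Hypothetical helper.)
fn parallel_sum(values: &[u64], parts: usize) -> u64 {
    let parts = parts.max(1);
    // Round up so every element lands in a chunk; never use a chunk size of 0.
    let chunk_len = ((values.len() + parts - 1) / parts).max(1);

    // `thread::scope` guarantees that all spawned threads finish before it
    // returns, so the workers may safely borrow `values`.
    thread::scope(|scope| {
        let handles: Vec<_> = values
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

Calling `parallel_sum(&[1, 2, 3, 4, 5], 2)` returns `15`. If a worker tried to mutate the shared slice, or to keep a reference to it after the scope ended, the program would be rejected at compile time rather than fail at runtime.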
Additional perks of Rust include the ability of its libraries to export a foreign function interface (FFI) compatible with many different programming languages. This extensibility and interoperability make Rust a potential value-add to a wide range of applications. Rust uses the Cargo package manager and the crates.io package registry, which give developers build tooling and dependency management out of the box. In Rust, errors are first-class citizens, and developers don’t have to deal with a garbage collector.
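To show what treating errors as first-class citizens can look like, here is a minimal sketch using the standard `Result` type and the `?` operator. The `name=value` input format, the `ReadingError` type and the `parse_reading` function are hypothetical and chosen only for this example.

```rust
use std::fmt;
use std::num::ParseFloatError;

/// Everything that can go wrong while parsing a reading.
/// (Hypothetical error type, for illustration only.)
#[derive(Debug, PartialEq)]
enum ReadingError {
    MissingSeparator,
    EmptyName,
    BadValue(ParseFloatError),
}

impl fmt::Display for ReadingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReadingError::MissingSeparator => write!(f, "expected `name=value`"),
            ReadingError::EmptyName => write!(f, "reading name is empty"),
            ReadingError::BadValue(err) => write!(f, "invalid value: {err}"),
        }
    }
}

impl std::error::Error for ReadingError {}

// Lets `?` convert a float-parsing failure into our own error type.
impl From<ParseFloatError> for ReadingError {
    fn from(err: ParseFloatError) -> Self {
        ReadingError::BadValue(err)
    }
}

/// Parses text such as `cpu=0.5` into a name and a value.
/// The error is part of the return type, so callers cannot ignore it silently.
fn parse_reading(input: &str) -> Result<(String, f64), ReadingError> {
    let (name, raw_value) = input
        .split_once('=')
        .ok_or(ReadingError::MissingSeparator)?;
    let name = name.trim();
    if name.is_empty() {
        return Err(ReadingError::EmptyName);
    }
    let value: f64 = raw_value.trim().parse()?;
    Ok((name.to_string(), value))
}
```

Here `parse_reading("cpu=0.5")` returns `Ok(("cpu".to_string(), 0.5))`, while `parse_reading("cpu")` returns `Err(ReadingError::MissingSeparator)`. Because failures are ordinary values, the caller decides whether to match on them, propagate them with `?`, or log them.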
Rust also gives developers more control over runtimes than many other languages. Its async/await support does not come with a built-in runtime. Instead, developers choose an async runtime, which is a library optimized to execute async functions in a specific environment. In JavaScript on Node.js, by contrast, the built-in event loop decides when asynchronous functions run, and developers can’t swap it out. In Rust, developers have granular control over how, and in what order, asynchronous functions execute by selecting and configuring the runtime.
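The sketch below illustrates that separation with a deliberately tiny, single-threaded executor built only from the standard library. It is a toy written for this example, not the runtime InfluxDB uses, and production code would normally rely on an established runtime crate. The names `run_in_order`, `YieldNow` and `interleaving_demo` are hypothetical. Because the executor is ordinary Rust code, its scheduling policy (here, simple first-in, first-out polling) is entirely under the developer's control.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing: this toy executor re-polls pending tasks
/// in queue order instead of waiting for wake-up notifications.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

type Task = Pin<Box<dyn Future<Output = ()>>>;

/// A future that returns `Pending` once, handing control back to the executor.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            Poll::Pending
        }
    }
}

/// Polls tasks in first-in, first-out order until all of them complete.
/// The scheduling policy is this loop; changing it changes execution order.
fn run_in_order(tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut queue: VecDeque<Task> = tasks.into();

    while let Some(mut task) = queue.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task); // not finished yet: go to the back of the line
        }
    }
}

/// Runs two tasks that each yield once and records the order of their steps.
fn interleaving_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));

    let make_task = |name: &'static str| -> Task {
        let log = Rc::clone(&log);
        Box::pin(async move {
            log.borrow_mut().push(format!("{name}: step 1"));
            YieldNow { yielded: false }.await;
            log.borrow_mut().push(format!("{name}: step 2"));
        })
    };

    run_in_order(vec![make_task("a"), make_task("b")]);

    let steps = log.borrow().clone();
    steps
}
```

With the FIFO policy above, `interleaving_demo()` records `a: step 1`, `b: step 1`, `a: step 2`, `b: step 2`. Replacing `push_back` with `push_front` would instead run each task to completion before starting the next, which is the kind of decision a runtime author or user gets to make.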
This just scratches the surface of the advantages of Rust. However, memory management and runtime control are two contributing factors that led to InfluxData’s decision to build its new database engine in Rust.
Rust challenges
While Rust presents a lot of advantages, it has its share of challenges as well. The most significant is its steep learning curve. It is a uniquely designed programming language, complete with its own design patterns. In some cases, these unique qualities are driven by the very capabilities that make Rust appealing, like the borrow checker. Developers whose background is limited to dynamically typed languages, such as Python, Ruby or JavaScript, tend to have a harder time learning Rust than developers with a background in statically typed languages, like C++ or Swift.
Another sticking point that developers must adapt to is Rust’s lengthy compile time. This puts pressure on developers to write code that optimizes compile time. But the uphill climb might just be worth it because Dix believes that developers and companies will write more and more high-performance server software in Rust moving forward.
Supporting a shift to Rust
Hiring seasoned Rust developers may not always be an option as demand for them continues to increase. So, it’s important for individuals and companies alike to tap into available resources that will help mitigate the language’s steep learning curve. Rust is an open source language with a growing community supporting it, so leaning on the community is a great starting point for motivated developers.
Rust results
InfluxData set out to expand the analytical capability of its leading time series database and used Rust to accomplish that task. InfluxDB 3.0 is the result of years of research and development. It takes several key database concepts and applies them to the time series use case. Columnar databases aren’t new. Neither is the idea of separating storage and compute. But combining these concepts for the time series use case results in a database that can drive both monitoring and real-time analytics projects at scale.
InfluxDB 3.0 can handle data with unlimited cardinality, can scale compute and storage separately and supports native SQL queries. Its performance gains compared to previous versions of InfluxDB OSS are seismic. With a “hot” storage tier for leading-edge data, users can perform real-time analytics. And the combination of a columnar database and the use of Apache Parquet as its persistence format ushers in drastic data compression gains. Using low-cost cloud object storage for “cold” data can save users up to 90% on storage costs, all while enabling them to keep more high-granularity data for longer periods.
Rust was a key difference maker in the creation of InfluxDB 3.0. While the decision to rewrite the database core was a major one, the end results speak for themselves. Thanks to Rust, InfluxDB is poised to remain atop the time series category for the foreseeable future.
Try InfluxDB for yourself and see what a difference Rust makes.