Data Historians vs. Time Series Databases
By Jason Myers / Developer / Mar 13, 2024
It’s easy to pitch technology buying decisions as black or white, where one camp is the promised land and the other is a dystopian wasteland where companies and profits go to die. But that doesn’t match reality.
Instead, organizations need to balance technical trade-offs with their needs. So, while it’s easy to stand atop the “rip and replace” mountain and shout the virtues of your new technology, that’s not something that most organizations are willing to do.
In the industrial and manufacturing space, data historians were a key element of Industry 3.0, where computers took center stage. The fact remains that progress from Industry 3.0 to Industry 4.0 is incremental. Some companies may move from Industry 3.0 to 3.5 to 3.7 before landing in Industry 4.0. As technology evolves and organizations embrace Industry 4.0, how must they adapt? These incremental changes look different for every organization.
Before making any grand decisions, it’s important to understand the players in the game. For the purposes of this article, we’re talking about InfluxDB, a purpose-built time series database, and legacy data historians.
Feature | InfluxDB (TSDB) | Data Historian |
Domain-specific | No | Yes |
Open/Closed system | Open source | Closed (Proprietary) |
Deployment environment | Cloud, Edge, On-Prem | On-Prem |
Interoperability | Extensive (Open source, APIs, cloud-native) | Limited |
Build/Buy | Build | Buy |
Scalability | High | Limited |
OT integration | Supports common protocols, customizable | Tight |
End-to-end solution | Not out of the box, but you can build one | Yes |
Growth potential | Unlimited | Limited by vendor resources and goals |
Data Historian: Pros
Data historians aren’t all bad. A technology doesn’t become commonplace in a sector, the way data historians have in industry, unless it works well.
- Domain-specific: Vendors build data historians for industry and industrial applications. These systems focus on the unique features and needs of industrial environments and provide tools that work with PLCs, SCADA systems, individual machines, and more.
- OT integrations: Data historians tightly integrate with operations technology (OT) control systems and standards.
- End-to-end: Data historians have features for pretty much any requirement industrial operators have. Newer data historians even offer rich UIs to visualize data. Data historians are more of an “all-inclusive” option, providing a wide range of features and capabilities in a single solution. They may not be turnkey, but they’re much closer to that than DIY.
Data Historian: Cons
While those pros all sound pretty good (that’s why they’re positives), we have to remember the context of this consideration is the move to Industry 4.0. Data historians are great self-contained solutions, but what happens when you want to do more with your data?
- Legacy tech: Closed software systems create “walled gardens” that limit organizations’ ability to adapt, innovate, and grow. Data historians are a tool designed for one job. But what if multiple people need to use the same data for different purposes? This leads to the next point.
- Vendor lock-in: Closed, proprietary software makes it difficult, if not impossible, to integrate with tools beyond what the vendor is willing to support, including modern data ecosystems. As other systems advance, new protocols and standards emerge, and other vendors prioritize interoperability, a closed system limits what you can do with your data.
- Data silos: These systems typically run on-premises, so when your data historian can’t connect to other systems, the data in the historian becomes siloed. Some organizations may even have multiple data historians running at a single site. The closed nature of these systems prevents users from collating data and drawing insights across systems, another factor that limits your ability to generate value from your data. If other systems can’t access your data, then they can’t benefit from it either.
- Cost: Because they’re such niche systems, data historians tend to be expensive. Custom changes, if available, are expensive and time-consuming because vendors are committed to their proprietary standards and have limited development resources.
InfluxDB: Pros
- Open technology: InfluxDB is built on open technologies, allowing for a lot of flexibility in application development. Leveraging modern technologies and open standards gives users access to best-in-class services and tools. These capabilities enable teams to adapt, grow, develop, and iterate applications faster. Access to a larger ecosystem, APIs, connectors, widely adopted industrial protocols, and third-party tooling lets developers choose the tools they prefer and integrate them with InfluxDB. Accessibility and interoperability are critical components of Industry 4.0, and that’s where a TSDB like InfluxDB shines.
- Query languages: InfluxDB supports SQL and the SQL-like InfluxQL query languages. SQL is basically the lingua franca of the digital age, so many users can get started quickly. Having multiple ways to query data provides another degree of flexibility to users. InfluxDB also offers client libraries in multiple languages to make writing data easier.
- Scalability: There are a couple of aspects of scalability worth mentioning. First, there are the database resources and infrastructure. As a cloud-native database, InfluxDB is available as a fully managed service. Residing in a major cloud environment enables InfluxDB to scale up and down with users’ needs. Another aspect is its ability to scale to meet growing data ingest needs. InfluxDB is designed to handle large time series datasets without impacting performance.
- Lower storage costs: Storage costs are especially pertinent to industrial organizations that want to take advantage of predictive analytics and other advanced analytics and artificial intelligence tools. These organizations need highly granular, historic data to feed and continually train AI and machine learning algorithms. Storing that data can be expensive, which is why InfluxDB separates compute and storage and utilizes multiple storage tiers. Cold storage, for infrequently accessed data, lives on low-cost object storage and can reduce storage costs by 90% or more.
- Multiple deployment options: InfluxDB is cloud-native, but it is also available for on-premises deployment for those users who want or need to control their infrastructure. Single-node instances for edge deployment also allow organizations to bring data collection and processing closer to data sources, which can then replicate that data back to a centralized instance, if desired.
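To make the "writing data" part a little more concrete, here is a minimal sketch of how a single sensor reading could be turned into InfluxDB line protocol, the text format InfluxDB accepts for writes. The function names, the measurement name, and the tag and field names are all made up for illustration, and the escaping covers only the common cases. In practice, a client library or Telegraf would normally handle this step for you.

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape every character from `chars` found in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value) -> str:
    """Render a field value using line protocol type conventions."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Build one line: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys keep the output stable
        head += f",{_escape(key, ',= ')}={_escape(tags[key], ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(val)}" for key, val in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


# Hypothetical reading from a press on a plant floor.
line = to_line_protocol(
    "machine_temp",
    {"site": "plant_a", "line": "press 3"},
    {"celsius": 71.5, "running": True},
    1710288000000000000,
)
print(line)
```

Tags such as `site` and `line` are what later let you filter and group readings by location or machine when you query the data.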
InfluxDB: Cons
No solution does everything for everyone all the time, and databases are no exception. That’s precisely why specialty databases exist.
- Not domain-specific: InfluxDB isn’t built specifically for industrial or manufacturing applications. It doesn’t come with the features that data historians have baked in and ready to go from the outset. Adding those features takes additional time and effort.
- Build vs buy: When you opt for a time series database like InfluxDB, you know going in that you’ll need to build some stuff. This requires domain knowledge (or at least the time/willingness to learn) and developer resources.
- Stack needed: Related to the build/buy idea, because InfluxDB isn’t domain-specific, building an end-to-end solution requires using a wider ecosystem to get comparable features to a data historian. Some organizations don’t have the resources, or the desire, to learn or manage an entire ecosystem.
Deployment: crawl, walk, run options
To get back to our initial idea, when choosing between data historians and time series databases, or deciding how to combine them, you should consider your organization’s needs and what solutions best fit them.
The following examples are just that: examples. Every organization will have different needs and require different trade-offs, but hopefully these examples will provide a jumping-off point for thinking about the relationship between your data historian, your needs, and where a time series database fits into the picture.
Crawl
Let’s say your data historian works fine for your organization, but you are curious about digital transformation. No doubt you’ve put years and tons of money into your OT stack, so you’re not going to go around pulling plugs for an experiment.
One approach you might take is to use Telegraf, InfluxData’s open source data collection agent, to test data collection. Essentially, you can configure Telegraf to collect data from the same sources your data historian collects from. You would want to start with a small data set to keep storage costs down because you’d be writing the same data to two different places.
But doing this gives you a sense of where to locate InfluxDB in your OT stack. And it yields legitimate production data to experiment with to see what kind of insights you can gain.
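Before starting a pilot like this, it can help to estimate how much duplicate data you would generate. The sketch below is simple back-of-the-envelope arithmetic: the function names are hypothetical, and `bytes_per_point` is a placeholder you would need to measure on your own system, not a figure for any particular product.

```python
SECONDS_PER_DAY = 86_400


def points_per_day(num_signals: int, sample_interval_s: float) -> int:
    """Number of data points collected per day for a set of signals."""
    if num_signals < 0 or sample_interval_s <= 0:
        raise ValueError("need num_signals >= 0 and sample_interval_s > 0")
    return int(num_signals * SECONDS_PER_DAY / sample_interval_s)


def duplicated_bytes_per_day(
    num_signals: int, sample_interval_s: float, bytes_per_point: int
) -> int:
    """Extra storage per day from writing the same points to a second system.

    bytes_per_point is an assumption supplied by the caller.
    """
    return points_per_day(num_signals, sample_interval_s) * bytes_per_point


# Hypothetical comparison: a 50-signal pilot vs. mirroring 5,000 signals,
# both sampled once per second.
pilot = points_per_day(50, 1.0)
full = points_per_day(5_000, 1.0)
print(f"pilot: {pilot:,} points/day, full mirror: {full:,} points/day")
```

The takeaway is proportional: a pilot covering 1% of your signals at the same sampling rate duplicates roughly 1% of the data, which is why starting small keeps the experiment cheap.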
Walk
Let’s take this example to the next level. Remember when we mentioned that a plant may have multiple siloed historians running at the same location? Well, replicating the above experiment for each historian (but outputting the data of each Telegraf instance to a single instance of InfluxDB) allows you to combine those data streams and break down those silos.
Using a visualization tool like Grafana allows you to create a single pane of glass to track the individual performance of each system as well as their collective performance.
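As a rough illustration of what breaking down silos means in data terms, the sketch below merges readings from two hypothetical historians into one time-ordered stream, then computes both per-source and combined averages. The `Reading` type, the source names, and the values are invented for the example. In a real deployment, the merge happens when both Telegraf instances write to the same InfluxDB instance, and a query would do the aggregation.

```python
from collections import defaultdict
from heapq import merge
from statistics import mean
from typing import NamedTuple


class Reading(NamedTuple):
    timestamp: int  # seconds since the epoch
    source: str     # which historian the reading came from
    value: float


def combine(*streams):
    """Merge streams that are each sorted by time into one sorted list."""
    return list(merge(*streams, key=lambda r: r.timestamp))


def summarize(readings):
    """Return (mean value per source, mean value across all sources)."""
    if not readings:
        raise ValueError("no readings to summarize")
    by_source = defaultdict(list)
    for reading in readings:
        by_source[reading.source].append(reading.value)
    per_source = {src: mean(vals) for src, vals in by_source.items()}
    overall = mean(r.value for r in readings)
    return per_source, overall


# Two siloed historians at the same site (made-up data).
historian_a = [Reading(0, "historian_a", 10.0), Reading(20, "historian_a", 14.0)]
historian_b = [Reading(10, "historian_b", 30.0), Reading(30, "historian_b", 34.0)]

combined = combine(historian_a, historian_b)
per_source, overall = summarize(combined)
print(per_source, overall)
```

Keeping the `source` label on every reading is the design choice that matters here: it lets you view each system individually and all of them together from the same data set.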
Run
Once you get a feel for how InfluxDB can function with your data historian on a smaller level, you can build out that integration. Connect more systems and tools to InfluxDB. Investigate other ecosystem tools that you can use to replace data historian features. One benefit of open technologies is that you can customize these replacements to meet your specific needs.
If you’re expanding operations and your TSDB experiments are going well, it may be time to adopt a TSDB instead of an expensive legacy data historian. But the point is that you can ease your way into open standards and a time series database. It doesn’t have to be all rip-and-replace. You want to ensure the technologies you use meet your needs, so you need to make sure that the trade-offs of moving from a data historian to a TSDB make sense.
That said, organizations that are serious about digital transformation will likely be in a position, sooner rather than later, where they need the connectivity, interoperability, and accessibility of open standards to remain competitive. Legacy technologies are legacy for a reason. But fortunately, future-proofing your systems doesn’t have to take place overnight. It’s OK to be at a 3.6 on the Industry 3.0 to 4.0 transition spectrum. Options exist. You just need to determine which trade-offs in costs, features, and capabilities are acceptable for your organization and plan accordingly.
To start experimenting with these ideas and InfluxDB, sign up for a free account today.