Scaling Data Collection: Solving Renewable Energy Challenges with InfluxDB
By
Jason Myers /
Developer
Jun 06, 2024
For data-critical and data-intense sectors, like energy and renewables, access to data can be a make-or-break situation. As the complexity of the systems underpinning energy operations increases, collecting and analyzing that data is more challenging than ever before. Therefore, understanding what data sources are necessary, where they sit in the tech stack, and how they scale across an organization is crucial for obtaining the insights energy companies need to maintain and optimize operations.
Why scalability matters
Scalability comes in two forms: horizontal and vertical. That’s not to say that companies need to choose one or the other; rather, the two needs tend to surface independently.
The very structure of modern energy grids and sources requires much greater attention than older systems. Traditional energy production occurs in a central location, making tracking easier. The inputs and outputs are consistent and require less frequent attention.
Horizontal scaling
In the renewable energy space, the number and location of sources are often widely distributed. This means that energy producers need to monitor more ‘plants’ as well as the connections between those sources and the grid.
Because many renewable energy sources are intermittent, energy production and storage from individual devices (e.g., solar panels, wind turbines, etc.) must be monitored. The sun only shines during the day, and wind turbines only turn when the wind blows! Feeding that information into machine learning models can help optimize how companies use that energy. All this requires data.
Consider, too, all the residential or individual commercial facilities that generate energy and put it back into the grid (e.g., solar panels). Energy companies need to keep precise track of that information.
Vertical scaling
Finally, we need to factor in the systems doing the actual monitoring. Companies need to ensure that their operations technology (OT) stacks remain in good working order, too. Having a plan to watch the watchers is something companies need to consider. Monitoring these stacks requires vertical scalability as they grow and become more complex.
Data challenges
As systems become more complex and distributed, data workloads for energy companies become more demanding. The challenge with time series data is that there is a lot of it. A positive aspect is that the more data you have, the deeper insights you can reveal. When we instrument everything, we have insight into how those things operate and change. The more granular we get, e.g., microsecond or nanosecond precision, the more accurate those insights become.
However, deeper insights come with trade-offs. First, you need a solution that can process and manage high-resolution data. InfluxDB 3.x is a time series database that can ingest millions of data points every second and supports nanosecond data precision. In other words, it can ingest and make data available as fast as your equipment can generate it. (This is very different from legacy data historians.)
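To make nanosecond precision concrete, here is a minimal sketch of how a single reading could be serialized into InfluxDB's line protocol with a nanosecond timestamp. The helper name `to_line_protocol` and the measurement, tag, and field names are hypothetical, chosen only for illustration; in practice a client library or Telegraf would normally handle this step.

```python
def _escape(text, chars):
    # Backslash-escape each special character that appears in text.
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # string fields are double-quoted


def to_line_protocol(measurement, tags, fields, ts_ns):
    """Build one line: measurement,tags fields timestamp (nanoseconds)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(fields[key])}"
        for key in sorted(fields)
    )
    return f"{head} {body} {ts_ns}"


# Hypothetical wind turbine reading stamped with a nanosecond epoch time.
line = to_line_protocol(
    "turbine",
    {"site": "north ridge", "id": "t-17"},
    {"power_kw": 1520.5, "rpm": 14},
    1717632000123456789,
)
print(line)
```

For live data, `time.time_ns()` from the standard library returns the current epoch time as an integer number of nanoseconds, which avoids the rounding that comes with floating-point seconds.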
Storage becomes a challenge once you have high-resolution data because the more data you have, the more it costs to keep it. This is especially true where companies want to do forecasting and predictive analytics because these processes require as much high-resolution data as possible to build and train the AI/ML models that underpin them.
As a result, companies have historically faced a trade-off. They could keep high-resolution data long-term to enable AI/ML optimization and spend more to store it all. Or they could downsample that data, keep only the aggregates, and save money on storage, but drastically limit their ability to generate insights, optimize operations, and embrace the advantages that Industry 4.0 offers.
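The cost of downsampling is easy to see in a small sketch. The example below averages one-second readings into one-minute windows; the function name `downsample_mean` and the sample values are assumptions made for illustration, not part of any InfluxDB API.

```python
from collections import defaultdict
from statistics import mean


def downsample_mean(points, window_s):
    """Average (timestamp_s, value) pairs into fixed windows of window_s seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % window_s].append(value)  # key = start of the window
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Two minutes of hypothetical battery temperature readings, one per second,
# with a single one-second spike to 110.0 at t=30.
raw = [(t, 50.0) for t in range(120)]
raw[30] = (30, 110.0)

summary = downsample_mean(raw, 60)
print(len(raw), "raw points ->", len(summary), "aggregates")
print(summary)
```

The 120 raw points shrink to two aggregates, which is where the storage savings come from. The spike, however, is diluted into a one-minute average that sits just above the baseline, so a model trained only on the aggregates would never see it.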
Scaling for time series workloads
When we combine the realities of distributed industrial operations with the realities of data generation, it quickly becomes apparent why scalability matters. End-to-end monitoring can be a challenge, but failing to monitor end to end can lead to unpredictable and costly issues down the line. Furthermore, when we look at the energy sector as a whole, both vertical and horizontal scalability emerge in different ways. Some companies may play in all areas, and others may choose to specialize. However, opportunities in this sector are often a function of the need for scalability.
Data normalization and edge data replication
On the energy production side, there is no shortage of devices and systems to monitor. This is especially true for renewable energy sources like wind and solar. Energy-generating devices often use different sensors, even within the same array, which can mean they push out data in different formats and protocols. To make sense of that data holistically, companies need to normalize it.
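As a rough illustration of what normalization involves, the sketch below maps two differently shaped readings onto one common record. Both input shapes are hypothetical (a JSON-style MQTT payload reporting watts with a millisecond timestamp, and a Modbus register holding tenths of a kilowatt), as are the function and key names.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ns(moment):
    """Convert a timezone-aware datetime to integer nanoseconds since the epoch."""
    delta = moment - EPOCH
    return (delta.days * 86400 + delta.seconds) * 10**9 + delta.microseconds * 1000


def normalize_mqtt(payload):
    # Assumed payload shape: {"dev": "pv-3", "p_w": 4200, "ts_ms": 1717632000000}
    return {
        "device": payload["dev"],
        "power_kw": payload["p_w"] / 1000.0,    # watts -> kilowatts
        "ts_ns": payload["ts_ms"] * 1_000_000,  # milliseconds -> nanoseconds
    }


def normalize_modbus(device, registers, read_at):
    # Assumed register map: register 0 holds power in tenths of a kilowatt.
    return {
        "device": device,
        "power_kw": registers[0] / 10.0,
        "ts_ns": to_epoch_ns(read_at),
    }


a = normalize_mqtt({"dev": "pv-3", "p_w": 4200, "ts_ms": 1717632000000})
b = normalize_modbus("pv-3", [42], datetime(2024, 6, 6, tzinfo=timezone.utc))
print(a == b)
```

Once every source produces the same keys, units, and timestamp resolution, readings from mixed hardware in the same array can be queried and compared side by side.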
In the diagram below, you can see Telegraf used to collect data from various protocols (e.g., Modbus, MQTT, OPC-UA, etc.) and output it in InfluxDB’s line protocol. You can accomplish this using a single-node instance of InfluxDB at the edge. Individuals working onsite can use that data at the edge to monitor local systems in real-time. These edge nodes use InfluxDB’s edge data replication (EDR) feature to create a durable queue that automatically sends data to a centralized data store. This architecture enables both data access and analysis at the edge and couples it with data resiliency at the center.
Architecture diagram showing data ingest with Telegraf, single-node InfluxDB, and EDR enabled at the edge, transmitting data to a central hub.
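The durable-queue idea behind this architecture can be sketched in a few lines. The class below is not how InfluxDB implements EDR; it is a simplified, hypothetical store-and-forward queue built on Python's standard `sqlite3` module, in which lines are removed only after the central store accepts them.

```python
import sqlite3


class DurableQueue:
    """Minimal store-and-forward queue; a sketch, not InfluxDB's implementation."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to survive process restarts.
        self.db = sqlite3.connect(path)
        with self.db:
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS queue "
                "(id INTEGER PRIMARY KEY AUTOINCREMENT, line TEXT NOT NULL)"
            )

    def enqueue(self, line):
        with self.db:  # commits on success, rolls back on error
            self.db.execute("INSERT INTO queue (line) VALUES (?)", (line,))

    def flush(self, send, batch_size=500):
        """Send queued lines in order; keep them if send raises OSError."""
        sent = 0
        while True:
            rows = self.db.execute(
                "SELECT id, line FROM queue ORDER BY id LIMIT ?", (batch_size,)
            ).fetchall()
            if not rows:
                return sent
            try:
                send([line for _, line in rows])
            except OSError:
                return sent  # hub unreachable: leave rows queued for a retry
            with self.db:
                self.db.execute("DELETE FROM queue WHERE id <= ?", (rows[-1][0],))
            sent += len(rows)

    def __len__(self):
        return self.db.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
```

Here `send` stands in for whatever call delivers a batch to the central store. Because rows are deleted only after `send` returns, a network outage leaves the data at the edge until the next flush succeeds.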
Energy distribution
There are many ways to get energy from its source to its final destination and plenty of things to monitor along the way.
In a more traditional power grid, we can see both horizontal and vertical scaling realities. In the diagram below we have two distributed power station networks that make up one vertical layer and a horizontal layer. We see the same thing with the substations fed by the power stations in the previous layer. You can use InfluxDB within each plant facility to collect, store, and analyze its data. The goal here is typically to understand faults when and where they occur to accelerate maintenance. These aren’t self-repairing facilities, so real-time insights may not be necessary. However, energy companies still want to identify root causes as quickly as possible to optimize schedules for their maintenance staff. With enough data, energy companies can leverage machine learning to predict when errors will occur and be more proactive in troubleshooting them. The name of the game is monitor, analyze, predict, and repeat.
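A very simple form of the "analyze" step is flagging readings that deviate sharply from a recent baseline. The sketch below uses a rolling mean and standard deviation; the function name, window size, threshold, and sample values are all illustrative assumptions, and production systems would typically use more robust models.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(values, window=10, threshold=3.0):
    """Return indexes of values more than threshold std devs from the rolling mean."""
    history = deque(maxlen=window)  # holds only the most recent `window` values
    flagged = []
    for index, value in enumerate(values):
        if len(history) == window:
            baseline = mean(history)
            spread = pstdev(history)
            if spread > 0 and abs(value - baseline) > threshold * spread:
                flagged.append(index)
        history.append(value)
    return flagged


# Hypothetical substation load readings with one abnormal value at index 10.
readings = [100.0, 101.0] * 5 + [140.0] + [100.0, 101.0] * 2
print(flag_anomalies(readings))
```

Note that a flagged value still enters the rolling window, so a large spike temporarily widens the baseline; excluding flagged points from the history is a common refinement.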
Virtual power plants
Storage and strategic release of energy are the key motions behind virtual power plants. The cost of a kilowatt-hour of energy varies by time of day. Periods of high energy usage can cost orders of magnitude more than periods of low usage. Companies can take advantage of this situation by storing energy produced during the low periods and releasing it during the high periods.
What does this look like in practice, and where does data fit into the story? The energy generation source in this situation is usually renewable, like wind. Companies typically store energy in battery arrays. Therefore, they need to monitor not only the wind turbines but also the batteries, assuming they control those. They need to be able to track battery performance on both ends of the process. That means understanding battery capacity as energy comes in and when it goes out. They also need to track the price of energy throughout the day so that they can maximize the value of stored energy.
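A worked example shows why both price and battery data matter. The sketch below picks the best single charge-then-discharge pair from a list of hourly prices; the prices, the 1,000 kWh capacity, and the 90% round-trip efficiency are made-up assumptions, and real dispatch decisions involve many more constraints.

```python
def best_arbitrage(prices, capacity_kwh, round_trip_eff=0.9):
    """Return (profit, charge_hour, discharge_hour) for one charge/discharge cycle.

    prices is a list of hourly prices per kWh. Charging must happen before
    discharging. Returns (0.0, None, None) if no profitable pair exists.
    """
    best = (0.0, None, None)
    cheapest = 0  # index of the lowest price seen before the current hour
    for hour in range(1, len(prices)):
        if prices[hour - 1] < prices[cheapest]:
            cheapest = hour - 1
        # Only round_trip_eff of the stored energy comes back out of the battery.
        profit = capacity_kwh * (prices[hour] * round_trip_eff - prices[cheapest])
        if profit > best[0]:
            best = (profit, cheapest, hour)
    return best


# Hypothetical hourly prices per kWh.
hourly_prices = [0.10, 0.08, 0.05, 0.12, 0.30, 0.25]
profit, charge_hour, discharge_hour = best_arbitrage(hourly_prices, capacity_kwh=1000)
print(f"charge at hour {charge_hour}, discharge at hour {discharge_hour}, profit {profit:.2f}")
```

With these numbers the battery charges at hour 2 and discharges at hour 4. The efficiency term is why battery health data matters: as round-trip efficiency degrades, the price spread needed to break even grows.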
Ju:niz Energy is an example of a company that uses batteries to store energy. Ju:niz developed intelligent, large-scale energy storage systems that collect 1.3M data points every second about battery health, climate, temperature, and other conditions. Ju:niz uses the Modbus protocol to connect to the iEMS SPS controllers on site. They collect the data from the controller with open source Telegraf, the data collection agent for InfluxDB, and write it to an open source instance of InfluxDB at the edge. Ju:niz sends data from all its local InfluxDB OSS instances to a central, AWS-hosted, Cloud Dedicated cluster using EDR. To learn more about this type of energy storage, check out the complete case study on ju:niz Energy to understand how it works and where InfluxDB sits in the system.
Ju:niz Energy architecture diagram
Companies working in this area can also control the energy that consumers with solar panels put back into the grid. In some areas, companies can buy the energy generated by individual households. Companies can offer higher rates than public entities to purchase power generated by individual households in the same way that they can maximize value by strategically releasing energy into the grid. These companies need to monitor the amount of energy coming into the system through these agreements.
Monitoring energy at scale
This post barely scratches the surface regarding data, scalability, and the energy sector. But even this brief overview demonstrates that the need to use data to monitor, analyze, and predict exists along both vertical and horizontal axes. With so many sources generating data at so many levels, the ability to collect, organize, and manage that data, and to turn it into actionable insights, becomes mission-critical. InfluxDB has the capabilities and features to ensure energy companies can see what their systems are doing, derive deep insights from data, and power advanced analytics (AI/ML) to optimize and improve systems and processes up and down the sector.
Click here to learn more about InfluxDB and how it works with energy/renewables.