Learn to Forecast Time Series Data Using ML & InfluxDB

Forecasting is all about predicting the future. In data science, it is a key skill for working with time series data, with applications such as stock price prediction, sales forecasting, and logistics planning.

In this tutorial, we’ll learn how to forecast the notorious weather pattern of London, UK, using the following free and open source technologies:

  1. InfluxDB 3 Cloud Serverless (free): We use the serverless version of InfluxDB 3 to store historical time-stamped weather data, chosen for its scalability and ease of setup.
  2. Prophet: A popular open source time series Machine Learning (ML) library by Facebook research. Prophet is particularly good at handling data with seasonality (daily, weekly, and yearly patterns) and trend changes.
  3. Open-Meteo API: A free and open source weather API that provides historical and current weather data for locations worldwide. Here, we’ll use it for London, but you can easily change the location in the program.
  4. Python project (GitHub): A Python program that brings everything together. You can run it locally on your laptop or in the cloud, with no GPU required.

Setting up the environment

Before we start, make sure you have the necessary libraries installed. You can install them using pip:

pip install influxdb3-python prophet requests python-dotenv

You must also create a .env file in your project root to store your InfluxDB credentials and other configuration variables. Replace the placeholder values below with your own.

INFLUXDB_HOST="your_influxdb_cloud_serverless_url"
INFLUXDB_TOKEN="your_token"
INFLUXDB_ORG="your_org_name"
INFLUXDB_DATABASE="weather"
LONDON_LAT="51.5074"
LONDON_LON="-0.1278"
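
As a minimal sketch of how the program might read these settings, the snippet below validates them and converts the coordinates to numbers. The names `WeatherConfig` and `load_config` are hypothetical and not taken from the project. It assumes `load_dotenv()` from python-dotenv has already been called so that the values from .env are available in `os.environ`.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Settings that have no sensible default and must be provided.
REQUIRED_KEYS = ("INFLUXDB_HOST", "INFLUXDB_TOKEN", "INFLUXDB_ORG", "INFLUXDB_DATABASE")


@dataclass(frozen=True)
class WeatherConfig:
    host: str
    token: str
    org: str
    database: str
    latitude: float
    longitude: float


def load_config(env: Mapping[str, str] = os.environ) -> WeatherConfig:
    """Build a config object from environment-style key/value pairs."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing settings: {', '.join(missing)}")
    return WeatherConfig(
        host=env["INFLUXDB_HOST"],
        token=env["INFLUXDB_TOKEN"],
        org=env["INFLUXDB_ORG"],
        database=env["INFLUXDB_DATABASE"],
        # Fall back to the London coordinates used in this tutorial.
        latitude=float(env.get("LONDON_LAT", "51.5074")),
        longitude=float(env.get("LONDON_LON", "-0.1278")),
    )
```

Failing early on a missing token or host tends to be easier to debug than a connection error later in the run.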

Step 1: Fetching and Storing Historical Weather Data

We’ll fetch the average daily temperature for London over the past six months from the Open-Meteo API using weather_client.py. The main.py script will process this data and then use influxdb_client.py to store it in InfluxDB efficiently.

# In main.py
from weather_client import fetch_weather_data
from influxdb_client import DBClient

# process_api_response is a helper assumed to be defined elsewhere in the project (not shown)
weather_data = fetch_weather_data()
df = process_api_response(weather_data)  # Parses the API response into a DataFrame
db = DBClient()
db.write_weather_data(df) # Efficiently writes data to InfluxDB
print("Weather data written to InfluxDB")
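
For readers who want to see what the fetch step could look like, here is a rough sketch, not the project's actual weather_client.py. It assumes the Open-Meteo historical archive endpoint and its `temperature_2m_mean` daily variable, takes the coordinates as explicit arguments, and uses `urllib` from the standard library instead of requests to stay dependency-free. The helper names `build_archive_url` and `parse_daily` are hypothetical.

```python
import json
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: Open-Meteo's historical weather (archive) API.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_archive_url(lat, lon, end, days=180):
    """Return the request URL for `days` of daily mean temperatures ending at `end`."""
    params = {
        "latitude": lat,
        "longitude": lon,
        "start_date": (end - timedelta(days=days)).isoformat(),
        "end_date": end.isoformat(),
        "daily": "temperature_2m_mean",
        "timezone": "GMT",
    }
    return f"{ARCHIVE_URL}?{urlencode(params)}"


def fetch_weather_data(lat, lon, end=None):
    """Download the raw JSON payload (performs a network request)."""
    end = end or date.today()
    with urlopen(build_archive_url(lat, lon, end), timeout=30) as response:
        return json.load(response)


def parse_daily(payload):
    """Flatten the 'daily' section into rows, skipping days with no value yet."""
    daily = payload["daily"]
    return [
        {"time": day, "temperature": temp}
        for day, temp in zip(daily["time"], daily["temperature_2m_mean"])
        if temp is not None
    ]
```

The rows returned by `parse_daily` can be passed to `pandas.DataFrame(rows)` if a DataFrame is needed downstream.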

The DBClient class handles writing to an InfluxDB bucket using batch mode, which groups points into fewer write requests for better performance.
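
To make the idea of batching concrete, the sketch below converts rows to InfluxDB line protocol and sends them in fixed-size chunks. It is an illustration rather than the project's DBClient: the helper names are hypothetical, the measurement name is borrowed from the query in Step 2, and `client` is assumed to be any object with a `write(record=...)` method that accepts a list of line protocol strings, as the influxdb3-python client is understood to do.

```python
from datetime import datetime, timezone


def to_line_protocol(row, measurement="weather-london"):
    """Format one row as line protocol with a nanosecond UTC timestamp."""
    ts = datetime.fromisoformat(row["time"]).replace(tzinfo=timezone.utc)
    nanoseconds = int(ts.timestamp()) * 1_000_000_000
    return f"{measurement} temperature={float(row['temperature'])} {nanoseconds}"


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_in_batches(client, rows, batch_size=500):
    """Write rows in batches and return the number of write calls made."""
    lines = [to_line_protocol(row) for row in rows]
    batches = 0
    for batch in chunked(lines, batch_size):
        client.write(record=batch)  # one request per batch instead of one per point
        batches += 1
    return batches
```

The influxdb3-python client also ships its own batching through write options, which is likely the more convenient route in a real project; check its documentation for the exact settings.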

Step 2: Reading Data from InfluxDB

We’ll use a straightforward SQL query within the DBClient class to retrieve the data from the measurement (table) for the last 6 months.

# In influxdb_client.py
    def read_data(self):
        query = """
                  SELECT time, temperature
                  FROM "weather-london"
                  WHERE time >= now() - interval '180 days'
                  ORDER BY time ASC
                """

        return self.client.query(query=query)
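
If you want to experiment with a different lookback window or measurement, the query string can be built from parameters. This is a small sketch with a hypothetical `build_query` helper, not part of the project. Because the values are interpolated into SQL, it validates them first.

```python
import re

# Allow only simple measurement names (letters, digits, underscore, hyphen).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def build_query(measurement="weather-london", days=180):
    """Return the SQL used to read the last `days` days from `measurement`."""
    if not _IDENTIFIER.fullmatch(measurement):
        raise ValueError(f"Unsafe measurement name: {measurement!r}")
    if type(days) is not int or days <= 0:
        raise ValueError("days must be a positive integer")
    return (
        f'SELECT time, temperature FROM "{measurement}" '
        f"WHERE time >= now() - interval '{days} days' "
        "ORDER BY time ASC"
    )
```

Calling `build_query()` with the defaults produces the same query as above, just on a single line.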

Step 3: Forecasting with Prophet and Visualizing

The visualization.py script will use Prophet to forecast the temperature and create a plot as a .png file using matplotlib. This code “fits” a Prophet model to the historical data and generates a forecast for the next 30 days for London weather.

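# visualization.py
from influxdb_client import DBClient
from prophet import Prophet

db = DBClient()
df = db.read_data().to_pandas()  # Get data from InfluxDB

# Prophet expects the columns to be named 'ds' (timestamp) and 'y' (value)
df = df.rename(columns={"time": "ds", "temperature": "y"})
df["ds"] = df["ds"].dt.tz_localize(None)  # Prophet needs timezone-naive timestamps (no-op if already naive)

model = Prophet(yearly_seasonality=True, daily_seasonality=True)
model.fit(df)

future = model.make_future_dataframe(periods=30*24, freq='h')  # Forecast 1 month ahead
forecast = model.predict(future)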

# ... (Plotting code using Matplotlib) ...
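
The project's plotting code is not shown here, so the following is only one possible sketch of it. The function name `plot_forecast` is hypothetical. It relies on the `ds`, `yhat`, `yhat_lower`, and `yhat_upper` columns that Prophet includes in its forecast, and it assumes temperatures are in degrees Celsius.

```python
def plot_forecast(history, forecast, path="forecast.png"):
    """Save a PNG with observed values, the forecast line, and its uncertainty band.

    `history` needs 'ds' and 'y'; `forecast` needs 'ds', 'yhat',
    'yhat_lower', and 'yhat_upper' (DataFrames or dicts of lists both work).
    """
    # Imported here so the Agg backend is selected before pyplot loads,
    # which lets the script run on machines without a display.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 5))
    ax.plot(history["ds"], history["y"], "k.", markersize=3, label="Observed")
    ax.plot(forecast["ds"], forecast["yhat"], color="tab:blue", label="Forecast")
    ax.fill_between(
        forecast["ds"],
        forecast["yhat_lower"],
        forecast["yhat_upper"],
        color="tab:blue",
        alpha=0.2,
        label="Uncertainty interval",
    )
    ax.set_xlabel("Date")
    ax.set_ylabel("Temperature (°C)")
    ax.set_title("London temperature forecast")
    ax.legend()
    fig.autofmt_xdate()
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return path
```

With the variables from the script above, this would be called as `plot_forecast(df, forecast)`. Prophet also provides `model.plot(forecast)`, which returns a Matplotlib figure and may be all you need.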

Forecasting: ML vs. older statistical techniques

Prophet is a Machine Learning model that learns trend and seasonality patterns from data. Traditional methods like ARIMA instead rely on a predefined mathematical model whose order (p, d, q) is typically chosen by the analyst.

  • ML (e.g., Prophet, NeuralProphet): Adapts to complex patterns, handles seasonality and trends automatically, and captures non-linearity.
  • Statistical Techniques (e.g., ARIMA): More interpretable, work well with less data, and are less computationally intensive.
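
To illustrate what a "predefined mathematical model" looks like, here is a sketch of AR(1), the simplest member of the ARIMA family (equivalent to ARIMA(1,0,0)), fitted with ordinary least squares. It is not full ARIMA, and the function names are made up for this example. For real work a library such as statsmodels would be the usual choice.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    previous, current = series[:-1], series[1:]
    n = len(previous)
    if n < 2:
        raise ValueError("Need at least three observations")
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    variance = sum((p - mean_prev) ** 2 for p in previous)
    if variance == 0:
        raise ValueError("Series is constant; phi is not identifiable")
    covariance = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    phi = covariance / variance
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast_ar1(series, steps):
    """Roll the fitted equation forward `steps` times from the last observation."""
    intercept, phi = fit_ar1(series)
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = intercept + phi * last
        predictions.append(last)
    return predictions
```

The whole model is two numbers, `c` and `phi`, which is why such methods are easy to interpret and cheap to fit, and also why they cannot capture seasonality without extra terms.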

Time Series LLMs

Time series-specific LLMs can forecast without being trained on your own dataset, because they rely on large models that are pre-trained on huge amounts of time series data.

  • Merits: Potential for zero-shot learning and generalization across time series.
  • Drawbacks: Still new, computationally expensive, and lacking in interpretability and long-term forecast accuracy.

Choosing the right approach

  • Prophet: Good for ease of use, automatic seasonality handling, and robustness to outliers.
  • ARIMA: Better for smaller datasets or when interpretability is crucial.
  • Time Series LLMs: Worth exploring for zero-shot forecasting, but computationally expensive.

Batch Processing vs. Real-Time Forecasting

Our example uses batch processing (forecasting on a fixed dataset), which is suitable for long-term forecasts like weather. Real-time forecasting, where predictions are refreshed as new data arrives, is needed for fast-moving data such as stock prices.

Recap

Forecasting is powerful, and tools like Prophet and InfluxDB make it more accessible. The field is evolving, with new techniques like Time Series LLMs emerging, so staying informed and experimenting is key to mastering forecasting. In future articles, we will cover more forecasting examples using Time Series LLMs, so keep an eye on our blog. If you have any questions, feel free to ask our community.