Time Series Forecasting Made Easy (Part 1): Decomposition and Baseline Models


I used to avoid time series analysis. Every time I took an online course, I’d see a module titled “Time Series Analysis” with subtopics like Fourier transforms, autocorrelation functions and other intimidating terms. I don’t know why, but I always found a reason to avoid it.

But here’s what I’ve learned: any complex topic becomes manageable when we start from the basics and focus on understanding the intuition behind it. That’s exactly what this blog series is about: making time series feel less like a maze and more like a conversation with your data over time.

We understand complex topics much more easily when they’re explained through real-world examples. That’s exactly how I’ll approach this series.

In each post, we’ll work with a simple dataset and explore what’s needed from a time series perspective. We’ll build intuition around each concept, understand why it matters, and implement it step by step on the data.

Time series analysis is the process of understanding, modeling and forecasting data that’s observed over time. It involves identifying patterns such as trends, seasonality and noise using past observations to make informed predictions about future values.

Let’s start by considering a dataset named Daily Minimum Temperatures in Melbourne. This dataset contains daily records of the lowest temperature (in Celsius) observed in Melbourne, Australia, over a 10-year period from 1981 to 1990. Each entry consists of just two columns:

Date: The calendar day (from 1981-01-01 to 1990-12-31)
Temp: The minimum temperature recorded on that day

You’ve probably heard of models like ARIMA, SARIMA or Exponential Smoothing. But before we go there, it’s a good idea to try out some simple baseline models first, to see how well a basic approach performs on our data.

While there are many types of baseline models used in time series forecasting, here we’ll focus on the three most essential ones, which are simple, effective, and widely applicable across industries.

Naive Forecast: Assumes the next value will be the same as the last observed one.
Seasonal Naive Forecast: Assumes the value will repeat from the same point last season (e.g., last week or last month).
Moving Average: Takes the average of the last n points.
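To make the three definitions concrete, here is a minimal sketch in plain Python. The function names and the small weekly sales list are made up for illustration; they are not part of the Melbourne dataset or of any library.

```python
from statistics import mean

def naive_forecast(history):
    """Predict the next value as the last observed value."""
    return history[-1]

def seasonal_naive_forecast(history, season_length):
    """Predict the next value as the value one full season earlier."""
    return history[-season_length]

def moving_average_forecast(history, n):
    """Predict the next value as the mean of the last n observations."""
    return mean(history[-n:])

# Hypothetical data: two weeks of daily sales (7 values per week)
sales = [10, 12, 11, 13, 15, 20, 22,
         11, 13, 12, 14, 16, 21, 23]

print(naive_forecast(sales))              # 23: the last observed value
print(seasonal_naive_forecast(sales, 7))  # 11: same weekday, one week earlier
print(moving_average_forecast(sales, 3))  # 20: mean of 16, 21 and 23
```

All three produce a single next-step forecast from the history alone, which is what makes them cheap benchmarks.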

You might be wondering, why use baseline models at all? Why not just go straight to the well-known forecasting methods like ARIMA or SARIMA?

Let’s consider a shop owner who wants to forecast next month’s sales. By applying a moving average baseline model, they can estimate next month’s sales as the average of previous months. This simple approach might already deliver around 80% accuracy, which is good enough for planning and inventory decisions.

Now, if we switch to a more advanced model like ARIMA or SARIMA, we might increase accuracy to around 85%. But the key question is: is that extra 5% worth the extra time, effort and resources? In this case, the baseline model does the job.

In fact, in most everyday business scenarios, baseline models are sufficient. We typically turn to classical models like ARIMA or SARIMA in high-impact industries such as finance or energy, where even a small improvement in accuracy can have a significant financial or operational impact. Even then, a baseline model is usually applied first, not only to provide quick insights but also to act as a benchmark that more complex models must outperform.

Okay, now that we’re ready to implement some baseline models, there’s one key thing we need to understand first:
Every time series is made up of three main components: trend, seasonality and residuals.

Time series decomposition separates data into trend, seasonality and residuals (noise), helping us uncover the true patterns beneath the surface. This understanding guides the choice of forecasting models and improves accuracy. It’s also an essential first step before building both simple and advanced forecasting solutions.

Trend
This is the overall direction your data is moving in over time: going up, down or staying flat.
Example: Steady decrease in monthly cigarette sales.

Seasonality
These are the patterns that repeat at regular intervals: daily, weekly, monthly or yearly.
Example: Cool drinks sales in summer.

Residuals (Noise)
This is the random “leftover” part of the data, the unpredictable ups and downs that can’t be explained by trend or seasonality.
Example: A one-time car purchase showing up in your monthly expense pattern.

Now that we understand the key components of a time series, let’s put that into practice using a real dataset: Daily Minimum Temperatures in Melbourne, Australia.

We’ll use Python to decompose the time series into its trend, seasonality, and residual components so we can better understand its structure and choose an appropriate baseline model.

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Load the dataset
df = pd.read_csv("minimum daily temperatures data.csv")

# Convert 'Date' to datetime and set as index
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df.set_index('Date', inplace=True)

# Set a regular daily frequency and fill missing values using forward fill
df = df.asfreq('D')
df['Temp'] = df['Temp'].ffill()

# Decompose the daily series (365-day seasonality for yearly patterns)
decomposition = seasonal_decompose(df['Temp'], model='additive', period=365)

# Plot the decomposed components
decomposition.plot()
plt.suptitle('Decomposition of Daily Minimum Temperatures (Daily)', fontsize=14)
plt.tight_layout()
plt.show()

Output:

Decomposition of daily temperatures showing trend, seasonal cycles and random fluctuations.

The decomposition plot clearly shows a strong seasonal pattern that repeats every year, along with a subtle trend that shifts over time. The residual component captures the random noise that isn’t explained by trend or seasonality.

In the code earlier, you might have noticed that I used an additive model for decomposing the time series. But what exactly does that mean, and why is it the right choice for this dataset?

Let’s break it down.
In an additive model, we assume Trend, Seasonality and Residuals (Noise) combine linearly, like this:
Y = T + S + R

Where:
Y is the actual value at time t
T is the trend
S is the seasonal component
R is the residual (random noise)

This means we’re treating the observed value as the sum of the parts: each component contributes independently to the final output.

I chose the additive model because when I looked at the pattern in daily minimum temperatures, I noticed something important:

The line plot above shows the daily minimum temperatures from 1981 to 1990. We can clearly see a strong seasonal cycle that repeats every year: colder temperatures in winter, warmer in summer.

Importantly, the amplitude of these seasonal swings remains relatively consistent over time. For example, the temperature difference between summer and winter doesn’t appear to grow or shrink over time. This stability in seasonal variation is a key sign that the additive model is suitable for decomposition, because the seasonal component appears to be independent of any trend.

We use an additive model when the trend is relatively stable and doesn’t amplify or distort the seasonal pattern, and when the seasonality stays within a consistent range over time, even if there are minor fluctuations.
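One rough way to check the "consistent range" condition is to measure the swing (maximum minus minimum) within each seasonal cycle and see whether it stays about the same. The sketch below does this on a small synthetic series built as trend plus seasonality. The helper name cycle_ranges and the synthetic numbers are assumptions made for illustration, not part of the Melbourne analysis.

```python
import math

def cycle_ranges(values, period):
    """Return max - min for each complete cycle of `period` points."""
    ranges = []
    for start in range(0, len(values) - period + 1, period):
        cycle = values[start:start + period]
        ranges.append(max(cycle) - min(cycle))
    return ranges

# Synthetic additive series (illustrative only): Y = T + S, no noise
period = 12
n_cycles = 4
n_points = period * n_cycles
trend = [10 + 0.1 * t for t in range(n_points)]
seasonal = [5 * math.sin(2 * math.pi * t / period) for t in range(n_points)]
y = [t_val + s_val for t_val, s_val in zip(trend, seasonal)]

ranges = cycle_ranges(y, period)
print([round(r, 2) for r in ranges])  # one swing per cycle
```

On this additive series the swing is the same in every cycle even though the level rises. On real, noisy data the values will only be roughly equal, and a swing that keeps growing with the level would hint at a seasonal effect that scales with the trend instead.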

Now that we understand how the additive model works, let’s explore the multiplicative model, which is often used when the seasonal effect scales with the trend. This will also help us understand the additive model more clearly.

Consider a household’s electricity consumption. Suppose the household uses 20% more electricity in summer compared to winter. That means the seasonal effect isn’t a fixed amount; it’s a percentage of their baseline usage.

Let’s see how this looks with real numbers:

In 2021, the household used 300 kWh in winter and 360 kWh in summer (20% more than winter).

In 2022, their winter consumption increased to 330 kWh, and summer usage rose to 396 kWh (still 20% more than winter).

Across the two years, the seasonal difference grows with the trend, from +60 kWh in 2021 to +66 kWh in 2022, even though the percentage increase stays the same. This is exactly the kind of behavior that a multiplicative model captures well.

In mathematical terms:
Y = T × S × R
Where:
Y: Observed value
T: Trend component
S: Seasonal component
R: Residual (noise)
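The household figures above can be checked with a few lines of Python. This is only a worked version of the arithmetic in the example, using the same kWh values; the variable names are arbitrary.

```python
# Figures taken from the household example above (kWh)
usage = {
    2021: {"winter": 300, "summer": 360},
    2022: {"winter": 330, "summer": 396},
}

differences = {}
ratios = {}
for year, kwh in usage.items():
    differences[year] = kwh["summer"] - kwh["winter"]  # additive view: absolute gap
    ratios[year] = kwh["summer"] / kwh["winter"]       # multiplicative view: relative gap
    print(f"{year}: difference = {differences[year]} kWh, ratio = {ratios[year]:.2f}")

# 2021: difference = 60 kWh, ratio = 1.20
# 2022: difference = 66 kWh, ratio = 1.20
```

The absolute gap changes from year to year while the ratio stays at 1.20, which is why a seasonal factor (S as a multiplier) describes this household better than a fixed seasonal offset.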

By looking at the decomposition plot, we can figure out whether an additive or multiplicative model fits our data better.

There are also other powerful decomposition tools available, which I’ll be covering in one of my upcoming blog posts.

Now that we have a clear understanding of additive and multiplicative models, let’s shift our focus to applying a baseline model that fits this dataset.

Based on the decomposition plot, we can see a strong seasonal pattern in the data, which suggests that a Seasonal Naive model might be a good fit for this time series.

This model assumes that the value at a given time will be the same as it was in the same period of the previous season, making it a simple yet effective choice when seasonality is dominant and consistent. For example, if temperatures typically follow the same yearly cycle, then the forecast for July 1st, 1990, would simply be the temperature recorded on July 1st, 1989.

Code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the dataset
df = pd.read_csv("minimum daily temperatures data.csv")

# Convert 'Date' column to datetime and set as index
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df.set_index('Date', inplace=True)

# Ensure regular daily frequency and fill missing values
df = df.asfreq('D')
df['Temp'] = df['Temp'].ffill()

# Step 1: Create the Seasonal Naive Forecast
seasonal_period = 365  # Assuming yearly seasonality for daily data
# Create the Seasonal Naive forecast by shifting the temperature values by 365 days
df['Seasonal_Naive'] = df['Temp'].shift(seasonal_period)

# Step 2: Plot the actual vs forecasted values
# Plot the last 2 years (730 days) of data to compare
plt.figure(figsize=(12, 5))
plt.plot(df['Temp'][-730:], label='Actual')
plt.plot(df['Seasonal_Naive'][-730:], label='Seasonal Naive Forecast', linestyle='--')
plt.title('Seasonal Naive Forecast vs Actual Temperatures')
plt.xlabel('Date')
plt.ylabel('Temperature (°C)')
plt.legend()
plt.tight_layout()
plt.show()

# Step 3: Evaluate using MAPE (Mean Absolute Percentage Error)
# Use the last 365 days for testing
test = df[['Temp', 'Seasonal_Naive']].iloc[-365:].copy()
test.dropna(inplace=True)

# MAPE Calculation
mape = np.mean(np.abs((test['Temp'] - test['Seasonal_Naive']) / test['Temp'])) * 100
print(f"MAPE (Seasonal Naive Forecast): {mape:.2f}%")

Output:

Seasonal Naive Forecast vs. Actual Temperatures (1989–1990)

To keep the visualization clear and focused, we’ve plotted the last two years of the dataset (1989–1990) instead of all 10 years.

This plot compares the actual daily minimum temperatures in Melbourne with the values predicted by the Seasonal Naive model, which simply assumes that each day’s temperature will be the same as it was on the same day one year ago.

As seen in the plot, the Seasonal Naive forecast captures the broad shape of the seasonal cycles quite well: it mirrors the rise and fall of temperatures throughout the year. However, it doesn’t capture day-to-day variations, nor does it respond to slight shifts in seasonal timing. This is expected, because the model is designed to repeat the previous year’s pattern exactly, without adjusting for trend or noise.

To evaluate how well this model performs, we calculate the Mean Absolute Percentage Error (MAPE) over the final 365 days of the dataset (i.e., 1990). We only use this period because the Seasonal Naive forecast needs a full year of historical data before it can begin making predictions.

Mean Absolute Percentage Error (MAPE) is a commonly used metric to evaluate the accuracy of forecasting models. It measures the average absolute difference between the actual and predicted values, expressed as a percentage of the actual values.
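Written out, MAPE = (100 / n) × Σ |(actual − predicted) / actual|. The sketch below implements that formula in plain Python on a handful of made-up temperatures that are not taken from the Melbourne dataset. Skipping days whose actual value is zero is an assumption of this sketch, since the percentage error is undefined there.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Pairs whose actual value is zero are skipped (an assumption of this
    sketch), because the percentage error is undefined for them.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not errors:
        raise ValueError("MAPE is undefined: no non-zero actual values")
    return 100 * sum(errors) / len(errors)

# Made-up temperatures for illustration (not from the Melbourne dataset)
actual = [10.0, 12.0, 8.0, 20.0]
predicted = [9.0, 15.0, 8.0, 18.0]

# Per-day errors are 10%, 25%, 0% and 10%, so the mean is 11.25%
print(f"MAPE: {mape(actual, predicted):.2f}%")
```

Because each error is divided by the actual value, the same absolute miss counts for more on a cold day than on a warm one, which is worth keeping in mind when reading MAPE for temperature data.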

In time series forecasting, we typically evaluate model performance on the most recent or target time period, not on the middle years. This reflects how forecasts are used in the real world: we build models on historical data to predict what’s coming next.

That’s why we calculate MAPE only on the final 365 days of the dataset: this simulates forecasting an unseen future period and gives us a realistic measure of how well the model would perform in practice.

The Seasonal Naive model produces a MAPE of 28.23%, which gives us a baseline level of forecasting error. Any model we build next, whether customized or more advanced, should aim to outperform this benchmark.

A MAPE of 28.23% means that, on average, the model’s predictions were 28.23% off from the actual daily temperature values over the last year.

In other words, if the true temperature on a given day was 10°C, the Seasonal Naive forecast might have been around 7.2°C or 12.8°C, reflecting a 28% deviation.

I’ll dive deeper into evaluation metrics in a future post.

In this post, we laid the foundation for time series forecasting by understanding how real-world data can be broken down into trend, seasonality, and residuals through decomposition. We explored the difference between additive and multiplicative models, implemented the Seasonal Naive baseline forecast and evaluated its performance using MAPE.

While the Seasonal Naive model is simple and intuitive, it comes with limitations, especially for this dataset. It assumes that the temperature on any given day is identical to the same day last year. But as the plot and MAPE of 28.23% showed, this assumption doesn’t hold perfectly. The data displays slight shifts in seasonal patterns and long-term variations that the model fails to capture.

In the next part of this series, we’ll go further. We’ll explore how to customize a baseline model, compare it to the Seasonal Naive approach and evaluate which one performs better using error metrics like MAPE, MAE and RMSE.

We’ll also begin building the foundation needed to understand more advanced models like ARIMA, along with key concepts such as: