Time Series Forecasting Made Easy (Part 4.1): Understanding Stationarity in a Time Series

So far, we have discussed different decomposition methods and baseline models. Now, we move on to time series forecasting models like ARIMA, SARIMA, etc.

But these forecasting models require the data to be stationary. So first, we will discuss what stationarity in a time series actually is, why it is required, and how it is achieved.

Perhaps most of you have already read a lot about stationarity in a time series through blogs, books, etc., as there are many resources available to learn about it.

At first, I planned to explain what stationarity in a time series is when discussing forecasting models like ARIMA.

But when I first learnt about this topic, my understanding didn’t go much beyond constant mean or variance, strong or weak stationarity, and tests to check for stationarity.

Something always felt missing; I was unable to understand a few things about stationarity.

So, I decided to write a separate article on this topic to explain what I learnt in response to my own questions and doubts about stationarity.

I have tried to write about stationarity in time series in a more intuitive way, and I hope you will get a fresh perspective on this topic beyond the methods and statistical tests.

We call a time series stationary when it has a constant mean, a constant variance and a constant autocovariance (or constant autocorrelation) structure.

Let’s discuss each property.

What do we mean by Constant Mean?

For example, consider a time series of sales data for 5 years. If we calculate the average sales of each year, the values should be roughly the same. If the averages differ significantly from year to year, then there is no constant mean and the time series is not stationary.

The next property of a stationary time series is Constant Variance.

If the spread of the data is the same throughout the series, then it is said to have Constant Variance.

In other words, if the time series goes up and down by similar amounts throughout the series, then it is said to have Constant Variance.

But if the ups and downs start small and then become larger later, then there is no constant variance.
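
The yearly check described above can be sketched in a few lines of pandas. This is only an illustration on made-up data: the two series (`steady` and `growing`), their parameters and the helper `yearly_summary` are assumptions for this example, not the retail sales data used later in this post.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
months = pd.date_range("2015-01-01", periods=60, freq="MS")  # 5 years, monthly
t = np.arange(60)

# Hypothetical sales that hover around a fixed level with a steady spread.
steady = pd.Series(100 + rng.normal(0, 5, size=60), index=months)

# Hypothetical sales with an upward trend and swings that grow over time.
growing = pd.Series(
    100 + 2.0 * t + rng.normal(0, 1, size=60) * (1 + 0.2 * t),
    index=months,
)

def yearly_summary(series):
    """Return the mean and variance of the series within each calendar year."""
    return series.groupby(series.index.year).agg(["mean", "var"])

steady_summary = yearly_summary(steady)
growing_summary = yearly_summary(growing)

print(steady_summary.round(1))
print(growing_summary.round(1))
```

For the steady series the yearly means should stay close to each other, while for the growing series the yearly mean climbs from one year to the next, which is the pattern that rules out a constant mean.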

The third property of a stationary time series is Constant Autocovariance (or Autocorrelation).

If the relationship between values depends only on the gap between them, regardless of when they occur, then there is Constant Autocovariance.

For example, suppose you have written a blog and tracked its views for 50 days, and each day’s views are closely related to the previous day’s views (day 6 views are similar to day 5, and day 37 views are similar to day 36, because they are one day apart).

If this relationship stays the same throughout the entire series, then the autocovariance is constant.

In a stationary time series the autocorrelation usually decreases as the lag (or distance) increases because only nearby values are strongly related.

If autocorrelation remains high at larger lags, it may indicate the presence of trend or seasonality, suggesting non-stationarity.

When a time series has all three of these properties, we call it a stationary time series. More precisely, this is called second-order stationarity or weak stationarity.

There are mainly two types of stationarity:
1) Strong Stationarity
2) Weak Stationarity

Strong Stationarity means the entire distribution of the time series stays the same whenever we observe it: not just the mean and variance but also the skewness, kurtosis and overall shape of the distribution.

In the real world this is rare for a time series, so the classical forecasting models assume weak stationarity, a more realistic and practical condition.
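
The two notions can also be written compactly. The notation below is an assumption added for illustration (a series y_t, lag k, time shift h); it does not come from a specific reference used in this post.

```latex
\begin{aligned}
\text{Weak: }   & \mathbb{E}[y_t] = \mu, \quad
                  \operatorname{Var}(y_t) = \sigma^2, \quad
                  \operatorname{Cov}(y_t, y_{t+k}) = \gamma_k
                  \quad \text{for all } t, \\
                & \rho_k = \frac{\gamma_k}{\gamma_0}
                  \quad \text{(autocorrelation at lag } k\text{)}, \\
\text{Strong: } & (y_{t_1}, \dots, y_{t_n}) \overset{d}{=}
                  (y_{t_1+h}, \dots, y_{t_n+h})
                  \quad \text{for all } n,\ t_1, \dots, t_n,\ h.
\end{aligned}
```

In the weak form nothing on the right-hand side depends on t: the mean and variance are fixed, and the covariance depends only on the lag k. The strong form asks for the whole joint distribution to be unchanged by a shift in time.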

Identifying Stationarity in a Time Series

There are different methods for identifying stationarity in a time series.

To understand these methods, let’s consider the retail sales dataset which we used earlier in this series for STL Decomposition.

First is Visual Inspection.

Let’s plot this series.

From the above plot, we can observe both trend and seasonality in the time series, which indicates that the mean is not constant. Therefore, we can conclude that this series is non-stationary.

Another method to test stationarity is to divide the time series into two halves and calculate the mean and variance of each half.

If the values are roughly the same in both halves, then the series is likely stationary.

For this time series,

The mean is significantly higher, and the variance is also much larger, in the first half. Since the mean and variance are not constant, this confirms that this time series is non-stationary.
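
A minimal sketch of this split-half check, assuming the data sits in a pandas Series. The helper name `split_half_stats` and the synthetic series (a downward trend with yearly seasonality and noise) are assumptions for illustration; they stand in for the retail sales data rather than reproduce it.

```python
import numpy as np
import pandas as pd

def split_half_stats(series):
    """Compare the mean and variance of the first and second half of a series."""
    half = len(series) // 2
    first, second = series.iloc[:half], series.iloc[half:]
    return pd.DataFrame(
        {
            "mean": [first.mean(), second.mean()],
            "variance": [first.var(), second.var()],
        },
        index=["first half", "second half"],
    )

# Synthetic stand-in: downward trend + yearly seasonality + noise.
rng = np.random.default_rng(1)
t = np.arange(120)
values = 200 - 0.5 * t + 20 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 3, size=120)
sales = pd.Series(values, index=pd.date_range("2010-01-01", periods=120, freq="MS"))

halves = split_half_stats(sales)
print(halves.round(1))
```

This is a quick sanity check rather than a formal test: it can miss changes that happen inside a half, so it works best alongside the other methods in this section.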

We can also identify stationarity in a time series by using the Autocorrelation Function (ACF) plot.

ACF plot for this time series

In the above plot, we can observe that each observation in this time series is correlated with its previous values at different lags.

As discussed earlier, autocorrelation gradually decays to zero in a stationary time series.

But that is not the case here: the autocorrelation is high at several lags (i.e., the observations are highly correlated even when they are farther apart). This suggests the presence of trend and seasonality, which confirms that the series is non-stationary.

We also have statistical tests to identify stationarity in a time series.

One is the Augmented Dickey-Fuller (ADF) test, and the other is the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.

Let’s see what we get when we apply these tests to the time series.

Both tests confirm that the time series is non-stationary.

These are the methods we use to identify stationarity in a time series.

Transforming a Non-Stationary Time Series into a Stationary Time Series

We have a technique called ‘Differencing’ to transform a non-stationary series into a stationary one.

In this method, we subtract the previous value from each value. This way we see how much the values change from one time step to the next.

Let’s consider a sample from the retail sales dataset and then proceed with differencing.

Now we perform differencing; this is called first-order differencing.

This is how differencing is applied across the whole time series to see how the values change over time.
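
In pandas, first-order differencing is a single call to `diff()`. The four sales values below are hypothetical and only there to show the arithmetic.

```python
import pandas as pd

# Hypothetical monthly sales values, used only to show the arithmetic.
sales = pd.Series(
    [100, 104, 103, 110],
    index=pd.date_range("1992-01-01", periods=4, freq="MS"),
)

# First-order differencing: each value minus the previous value.
# The first entry has no previous value, so it becomes NaN.
first_diff = sales.diff()
print(first_diff)
```

The differenced values are the month-to-month changes: 104 - 100, 103 - 104 and 110 - 103.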

Before first-order differencing,

After first-order differencing

Before applying first-order differencing, we can observe a rising trend in the original time series along with occasional spikes at regular intervals, indicating seasonality.

After differencing, the series fluctuates around zero, which means the trend has been removed.

However, since the seasonal spikes are still present, the next step is to apply seasonal differencing.

In seasonal differencing, we subtract from each value the value of the same season in the previous cycle.

In this time series we have yearly seasonality (12 months), which means:

For January 1993, we calculate Jan 1993 – Jan 1992.

This way we apply seasonal differencing to the whole series.

After seasonal differencing on the first-order differenced series, we get

We can observe that the seasonal spikes are gone. Also, for the year 1992 we get null values because there are no previous values to subtract.
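
Here is a small sketch of both steps together, using `diff()` followed by `diff(12)`. The series is synthetic (an exact linear trend plus a pattern that repeats every 12 months, with no noise), chosen so that the effect of each step is easy to verify; real data will not cancel this cleanly.

```python
import numpy as np
import pandas as pd

months = pd.date_range("1992-01-01", periods=48, freq="MS")
t = np.arange(48)

# Synthetic series: linear trend plus a pattern that repeats every 12 months.
season = np.array([0, -5, -3, 2, 4, 1, 0, 3, -2, 5, 15, 40])
sales = pd.Series(2 * t + season[t % 12], index=months, dtype="float64")

first_diff = sales.diff()            # value minus previous month: removes the trend
seasonal_diff = first_diff.diff(12)  # value minus same month last year: removes seasonality

print("missing values:", seasonal_diff.isna().sum())
print("largest remaining value:", seasonal_diff.dropna().abs().max())
```

In this sketch the first 13 values are missing: one from the first difference and twelve more from the seasonal difference. Everything after that is exactly zero, because the synthetic series contains nothing except trend and seasonality.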

After first-order differencing and seasonal differencing, the trend and seasonality in the time series are removed.

Now we will again test for stationarity using the ADF and KPSS tests.

We can see that the time series is stationary.

Note: In the final seasonally differenced series, we still observe some spikes around 2020-2022 because of the pandemic (one-time events).

These are called Interventions. They may not violate stationarity, but they can affect model accuracy. Techniques like intervention analysis can be used here.

We will discuss this when we explore ARIMA modeling.

We removed the trend and seasonality in the time series to make it stationary using differencing.

Now, instead of differencing, we can also use STL Decomposition.

Earlier in this series, we discussed that when trend and seasonal patterns in a time series get messy, we use STL to extract these patterns.

So, we can apply STL decomposition to a time series and extract the residual component, which is what remains after removing trend and seasonality.

We will also discuss ‘STL + ARIMA’ when we explore the ARIMA forecasting model.

So far, we have discussed methods for identifying stationarity and for transforming a non-stationary time series into a stationary one.

But why do time series forecasting models assume stationarity?

We use time series forecasting models to predict the future based on past values.

These models require a stationary time series because in a stationary series the patterns remain consistent over time, so what the model learns from the past still holds in the future.

In a non-stationary time series, the mean and variance keep changing, making the patterns unstable and the predictions unreliable.

Aren’t trend and seasonality also patterns in a time series?

Trend and seasonality are also patterns in a time series, but they violate the assumptions of models like ARIMA, which require stationary input.

Trend and seasonality are handled separately before modelling, and we will discuss this in upcoming blogs.

These time series forecasting models are designed to capture short-term dependencies after removing global patterns.

What exactly are these short-term dependencies?

When we have a time series, we try to decompose it using decomposition methods to understand the trend, seasonality and residuals in it.

We already know that the trend gives us the overall direction of the data over time (up or down) and seasonality shows the patterns that repeat at regular intervals.

We also get the residual, which is what remains after we remove trend and seasonality from the time series. This residual is unexplained by trend and seasonality.

Trend gives the overall direction, and seasonality shows patterns that repeat throughout the series.

But there may be some patterns in the residual that are temporary, like a sudden spike in sales due to a promotional event or a sudden drop in sales due to a strike or weather conditions.

What can models like ARIMA do with this data?

Do models predict future promotional events or strikes based on this data? No.

Most time series forecasting models are used in live (real-time) production systems across many industries.

In real-time forecasting systems, as new data comes in, the forecasts are continuously updated to reflect the latest trends and patterns.

Let’s take a simple example of cool drinks inventory management.

The store owner knows that cool drink sales are high in summer and low in winter. But that doesn’t help him with daily inventory planning. Here the short-term dependencies are critical.

For example,

There may be a spike in sales during festivals and the wedding season for a while.
A sudden temperature spike (heat wave) may push sales up.
A weekend 1+1 offer may increase sales.
Weekend sales may be high compared to weekdays.
When the store has been out of stock for 2-3 days, there may be a sudden burst of sales the moment stock is back.

These patterns don’t repeat consistently like seasonality, and they aren’t part of a long-term trend. But they occur often enough that forecasting models can learn from them.

Time series forecasting models don’t predict these future events, but they learn the patterns or behavior of the data when such a spike appears.

The model then predicts accordingly. For example, after a spike in sales from a promotional event, sales may gradually return to normal rather than drop suddenly. The models capture these patterns and provide reliable results.

After prediction, the trend and seasonality components are added back to obtain the final forecast.
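
To make this concrete, here is a toy sketch of the idea. It assumes the residual follows a simple AR(1) pattern with coefficient `phi`, which is an assumption chosen for illustration and not a model fitted in this post. All the numbers (the spike size, the trend and seasonal forecasts) are hypothetical.

```python
import numpy as np

phi = 0.5             # assumed AR(1) coefficient: how much of a shock carries over
last_residual = 40.0  # hypothetical spike in the latest residual (e.g. a promotion)
horizon = 4

# Under AR(1), the h-step-ahead residual forecast is phi**h times the last residual,
# so the spike fades gradually instead of vanishing in one step.
steps = np.arange(1, horizon + 1)
residual_forecast = last_residual * phi ** steps

# Hypothetical trend and seasonal forecasts for the same four periods.
trend_forecast = np.array([500.0, 502.0, 504.0, 506.0])
seasonal_forecast = np.array([10.0, -5.0, 0.0, 15.0])

# Add the components back to get the final forecast.
final_forecast = trend_forecast + seasonal_forecast + residual_forecast

print(residual_forecast)
print(final_forecast)
```

The residual forecast halves at every step, which is the gradual return to normal described above, and the final forecast is simply the sum of the three components.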

This is why the short-term dependencies are critical in time series forecasting.

Dataset: This blog uses publicly available data from FRED (Federal Reserve Economic Data). The series Advance Retail Sales: Department Stores (RSDSELD) is published by the U.S. Census Bureau and can be used for analysis and publication with appropriate citation.

Official citation:
U.S. Census Bureau, Advance Retail Sales: Department Stores [RSDSELD], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/RSDSELD, July 7, 2025.

Note:
All the visualizations and test results shown in this blog were generated using Python code.
You can explore the complete code here: GitHub.

In this blog, I used Python to perform statistical tests and, based on the results, determined whether the time series is stationary or non-stationary.

Next up in this series is a detailed discussion of the statistical tests (ADF and KPSS tests) used to identify stationarity.

I hope you found this blog intuitive and helpful.

I’d love to hear your thoughts or answer any questions.

Thanks for reading!