wonder why we’d want to model a temperature time series as an Ornstein-Uhlenbeck process, and whether this has any practical implications. Well, it does. While the O-U process is used in physics to model the velocity of particles under friction, it is also used in finance to model interest rates, volatility, and spread mean reversion. This temperature application comes from an area where I have personal experience: my thesis, “Neural Network Approaches to Pricing Weather Derivatives,” focuses on developing new methods for pricing these derivatives.
The O-U process is commonly used in pricing weather derivatives: financial derivatives designed to deliver a payout based on an underlying climate index (temperature, rainfall, wind speed, snowfall). These derivatives let anyone exposed to climate risk manage that exposure, whether farmers concerned about drought, energy companies worried about rising heating costs, or seasonal retailers bracing for a poor season. You can read much more about this in Weather Derivatives by Antonis K. Alexandridis (with whom I am currently working) and Achilleas D. Zapranis.
SDEs show up wherever uncertainty lies. The Black-Scholes model, the Schrödinger equation, and Geometric Brownian Motion all rest on equations describing how systems evolve when a random component is involved. SDEs are particularly useful for forecasting under uncertainty, whether that is population growth, epidemic spread, or stock prices. Today we will look at how to model temperature using the Ornstein–Uhlenbeck process, which behaves quite a bit like the heat equation, the PDE describing how temperature flows.

In my past two stories, I covered how this climate data was obtained. The data is also available on my GitHub, so we can skip that step here.
Today we’ll:
- Explain how the O-U process can model a temperature time series.
- Talk about mean-reverting processes, stochastic differential equations, and how the O-U process relates to the heat equation.
- Use Python to fit an Ornstein-Uhlenbeck (O-U) process, a stochastic differential equation (SDE), to NASA climate data.

In the figure above, we can see how volatility changes with the seasons. This highlights an important reason the O-U process is well suited to modelling this time series: it does not assume constant volatility.
The O-U process, mean reversion, and the heat equation
We’ll use the O-U process to model how temperature changes over time at a fixed location. In our equation, T(t) represents the temperature as a function of time t. As you intuitively understand, the Earth’s temperatures cycle around a 365-day mean, so seasonal changes are modelled by s(t). Think of these means as the blue and orange lines in the temperature graphs for Barrow, Alaska, and Niamey, Niger.
\[ dT(t) = ds(t) - \kappa \left( T(t) - s(t) \right) dt + \sigma(t)\, dB(t) \]


Kappa (κ) is the speed of mean reversion. Much as its name implies, this constant controls how fast our temperature reverts back to its seasonal mean. Mean reversion in this context means that if today is an icy day, colder than average, then we would expect tomorrow’s or next week’s temperature to be warmer as it fluctuates around that seasonal mean.
\[ dT(t) = ds(t) - {\color{red}{\kappa}} \big( T(t) - s(t) \big)\, dt + \sigma(t)\, dB(t) \]
This constant plays a similar role in the heat equation, where it describes how fast a rod warms as it is heated. In the heat equation, κ is the thermal diffusivity, which describes how fast heat flows through a material.
\[ \frac{\partial u(\mathbf{x},t)}{\partial t} = {\color{red}{\kappa}} \, \nabla^2 u(\mathbf{x},t) + q(\mathbf{x},t) \]
- κ = 0: No mean reversion. The process reduces to Brownian motion with drift.
- 0 < κ < 1: Weak mean reversion. Shocks persist, with a slow decay of autocorrelation.
- κ > 1: Strong mean reversion. Deviations are corrected quickly, and the process stays tightly clustered around the seasonal mean (illustrated in the simulation below).
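To build intuition for these regimes, here is a small, self-contained simulation (separate from the fitting code later in this article) that draws one O-U path per value of κ with an Euler–Maruyama step. The parameter values are purely illustrative.

import numpy as np
import matplotlib.pyplot as plt

# Minimal illustrative sketch: simulate dX = -kappa (X - mu) dt + sigma dB
# for a few values of kappa, sharing the same Brownian increments so the
# effect of the mean-reversion speed is easy to see.
rng = np.random.default_rng(0)
n_steps, dt, mu, sigma = 365, 1.0, 0.0, 1.0
dB = rng.normal(scale=np.sqrt(dt), size=n_steps)

for kappa in (0.0, 0.1, 0.9):
    x = np.zeros(n_steps)
    for i in range(1, n_steps):
        # Euler–Maruyama step: the drift pulls x back towards mu at rate kappa
        x[i] = x[i-1] - kappa * (x[i-1] - mu) * dt + sigma * dB[i]
    plt.plot(x, label=f"kappa = {kappa}")

plt.axhline(mu, color="k", lw=1)
plt.title("O-U paths: larger kappa keeps the path closer to the mean")
plt.legend(); plt.show()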
Fourier Series
Estimating Mean and Volatility
\[ s(t) = a + b t + \sum_{k=1}^{K_1} \left( \alpha_k \sin\!\left( \frac{2\pi k t}{365} \right) + \beta_k \cos\!\left( \frac{2\pi k t}{365} \right) \right) \]
\[ \sigma^2(t) = c + \sum_{k=1}^{K_2} \left( \gamma_k \sin\!\left( \frac{2\pi k t}{365} \right) + \delta_k \cos\!\left( \frac{2\pi k t}{365} \right) \right) \]
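In code, both s(t) and log σ²(t) come down to regressing on the same kind of design matrix: a trend term plus sin/cos pairs at multiples of the annual frequency. The helper below is a hypothetical sketch of how such columns could be built; the `add_harmonics` name and the `t` day-index column are my own, but the `sin1`, `cos1`, … column names mirror the ones the fitting cells select later.

import numpy as np
import pandas as pd

def add_harmonics(df: pd.DataFrame, n_harm: int, period: float = 365.0) -> pd.DataFrame:
    """Add trend + sin/cos harmonic columns (sin1, cos1, ..., sinK, cosK)
    used as regressors for s(t) and log sigma^2(t). Assumes a day-index
    column 't' counting days from the start of the sample."""
    out = df.copy()
    out["trend"] = out["t"]
    for k in range(1, n_harm + 1):
        out[f"sin{k}"] = np.sin(2 * np.pi * k * out["t"] / period)
        out[f"cos{k}"] = np.cos(2 * np.pi * k * out["t"] / period)
    return out

# Example: add_harmonics(df, n_harm=2) adds trend, sin1, cos1, sin2, cos2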
Fitting the Ornstein–Uhlenbeck process to Mumbai data
We fit our temperature process in three steps:
- Fit the mean using a Fourier series.
- Model short-term dependence and estimate the mean-reversion parameter κ using an AR(1) process.
- Fit the volatility using a Fourier series.
Fitting the mean
We fit the model to 80% of the data, so that we can later use our SDE to forecast temperatures on the remainder. The mean is an OLS regression fitting s(t) to the data; a sketch of that fit follows the configuration cell below.

from pathlib import Path

# --------------------
# Config (parameters and paths for the analysis)
# --------------------
CITY = "Mumbai, India" # Name of the city being analyzed (used in labels/plots)
SEED = 42              # Random seed for reproducibility of results
SPLIT_FRAC = 0.80      # Fraction of data to use for training (rest for testing)
MEAN_HARM = 2          # Number of harmonic terms for modeling the mean (seasonality)
VOL_HARM = 3           # Number of harmonic terms for modeling volatility (seasonality)
LJUNG_LAGS = 10        # Number of lags for the Ljung-Box test (checks autocorrelation in residuals)
EPS = 1e-12            # Small value to avoid division by zero or log(0) issues
MIN_TEST_N = 8         # Minimum number of test points required to keep a valid test set
# --------------------
# Paths (where input/output files are stored)
# --------------------
# Base directory in Google Drive where climate data and results are stored
BASE_DIR = Path("/content/drive/MyDrive/TDS/Climate")
# Subdirectory specifically for outputs of the Mumbai analysis
OUT_BASE = BASE_DIR / "Benth_Mumbai"
# Subfolder for saving plots generated during the analysis
PLOTS_DIR = OUT_BASE / "plots"
# Ensure the output directories exist (create them if they don't)
OUT_BASE.mkdir(parents=True, exist_ok=True)
PLOTS_DIR.mkdir(parents=True, exist_ok=True)
# Path to the climate data CSV file (input dataset)
CSV_PATH = BASE_DIR / "climate_data.csv"
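The diagnostics cells below also rely on a few objects created in earlier notebook cells: the `train`/`test` split, the fitted seasonal-mean model `mean_fit`, an output folder `DIAG_DIR`, a filename slug `m_slug`, and small helpers such as `save_model_summary_txt` and `coef_ci_frame`. As a rough sketch of how the first few of those might be produced (the exact pipeline is assumed, not shown in the article):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Assumed: the CSV has "Date" and "DAT" (daily average temperature) columns,
# and add_harmonics() is the hypothetical helper sketched earlier.
df = pd.read_csv(CSV_PATH, parse_dates=["Date"]).sort_values("Date").reset_index(drop=True)
df["t"] = np.arange(len(df))
df = add_harmonics(df, n_harm=max(MEAN_HARM, VOL_HARM))

# Chronological train/test split
split = int(len(df) * SPLIT_FRAC)
train, test = df.iloc[:split].copy(), df.iloc[split:].copy()

# OLS fit of the seasonal mean s(t): intercept + trend + the first MEAN_HARM sin/cos pairs
harm_cols = [c for c in train.columns if c.startswith(("cos", "sin"))][:2 * MEAN_HARM]
mean_fit = smf.ols("DAT ~ trend + " + " + ".join(harm_cols), data=train).fit()

# Assumed output folder and slug used in the diagnostic file names below
DIAG_DIR = OUT_BASE / "diagnostics"
DIAG_DIR.mkdir(parents=True, exist_ok=True)
m_slug = "mumbai"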
It is much harder to see the signal in the noise when looking at the residuals of our regression. We might be tempted to assume this is white noise, but we would be mistaken: there is still serial dependence in this data. If we want to model volatility later using a Fourier series, we must account for that dependence.
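A quick way to check that claim, before modelling the dependence, is an autocorrelation plot and a Ljung-Box test on the mean residuals. A short sketch (x_train here is the same quantity the next cell computes):

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.stats.diagnostic import acorr_ljungbox

x_train = train["DAT"] - mean_fit.predict(train)   # residuals after the seasonal mean fit

plot_acf(x_train, lags=30)     # strong lag-1 autocorrelation => not white noise
plt.show()

# A small p-value rejects the "no autocorrelation" null hypothesis
print(acorr_ljungbox(x_train, lags=[LJUNG_LAGS], return_df=True))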

# =================================================================
# 1) MEAN MODEL: residuals after fitting the seasonal mean + diagnostics
# =================================================================
# Residuals after subtracting the fitted seasonal mean (before the AR step)
mu_train = mean_fit.predict(train)
x_train = train["DAT"] - mu_train
# Save the mean-model regression summary to a text file
save_model_summary_txt(mean_fit, DIAG_DIR / f"{m_slug}_mean_OLS_summary.txt")
# --- Plot: observed DAT vs fitted seasonal mean (train + test) ---
fig = plt.figure(figsize=(12,5))
plt.plot(train["Date"], train["DAT"], lw=1, alpha=0.8, label="DAT (train)")
plt.plot(train["Date"], mu_train, lw=2, label="μ̂(t) (train)")
# Predict and plot the fitted mean for the test set
mu_test = mean_fit.predict(test[["trend"] + [c for c in train.columns if c.startswith(("cos","sin"))][:2*MEAN_HARM]])
plt.plot(test["Date"], test["DAT"], lw=1, alpha=0.8, label="DAT (test)")
plt.plot(test["Date"], mu_test, lw=2, label="μ̂(t) (test)")
plt.title("Mumbai — DAT and seasonal mean fit")
plt.xlabel("Date"); plt.ylabel("Temperature (DAT)")
plt.legend()
fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_mean_fit_timeseries.png", dpi=160); plt.close(fig)
# --- Plot: mean residuals (time series) ---
fig = plt.figure(figsize=(12,4))
plt.plot(train["Date"], x_train, lw=1)
plt.axhline(0, color="k", lw=1)
plt.title("Mumbai — Residuals after mean fit (x_t = DAT - μ̂)")
plt.xlabel("Date"); plt.ylabel("x_t")
fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_mean_residuals_timeseries.png", dpi=160); plt.close(fig)
κ, ϕ, and the autoregressive process
We fit an AR(1) process, which gives us the speed of mean reversion. An AR(1) process captures the dependence of our residual at time step t on its value at t-1. The exact value of κ depends on location. For Mumbai, κ = 0.861, which means temperatures return to the mean fairly quickly.
\[ X_t = c + \phi X_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0,\sigma^2) \]
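The cell below assumes an already-fitted `ar_fit`. A minimal sketch of how it might be obtained with statsmodels, assuming an AR(1) without an intercept (since the seasonal mean has already been removed):

from statsmodels.tsa.ar_model import AutoReg

# x_train: deviations from the fitted seasonal mean, as computed above
ar_fit = AutoReg(x_train.reset_index(drop=True), lags=1, trend="n").fit()   # trend="n": no intercept
phi = float(ar_fit.params.iloc[0])
print(f"phi = {phi:.3f}")   # the article reads the speed of mean reversion off this coefficient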
# =======================================================
# 2) AR(1) MODEL: fit and residual diagnostics
# =======================================================
# Extract the AR(1) parameter φ and save the summary
phi = float(ar_fit.params.iloc[0]) if len(ar_fit.params) else np.nan
save_model_summary_txt(ar_fit, DIAG_DIR / f"{m_slug}_ar1_summary.txt")
# --- Scatterplot: x_t vs x_{t-1} with the fitted line φ x_{t-1} ---
x_t = x_train.iloc[1:].values
x_tm1 = x_train.iloc[:-1].values
x_pred = phi * x_tm1
fig = plt.figure(figsize=(5.8,5.2))
plt.scatter(x_tm1, x_t, s=10, alpha=0.6, label="Observed")
xline = np.linspace(np.min(x_tm1), np.max(x_tm1), 2)
plt.plot(xline, phi*xline, lw=2, label=f"Fitted: x_t = {phi:.3f} x_(t-1)")
plt.title("Mumbai — AR(1) scatter: x_t vs x_{t-1}")
plt.xlabel("x_{t-1}"); plt.ylabel("x_t"); plt.legend()
fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_ar1_scatter.png", dpi=160); plt.close(fig)
# --- Time series: observed vs fitted AR(1) values ---
fig = plt.figure(figsize=(12,4))
plt.plot(train["Date"].iloc[1:], x_t, lw=1, label="x_t")
plt.plot(train["Date"].iloc[1:], x_pred, lw=2, label=r"$\hat{x}_t = \phi x_{t-1}$")
plt.axhline(0, color="k", lw=1)
plt.title("Mumbai — AR(1) fitted vs observed deviations")
plt.xlabel("Date"); plt.ylabel("x_t")
plt.legend()
fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_ar1_timeseries_fit.png", dpi=160); plt.close(fig)
# --- Save AR diagnostics (φ + Ljung-Box test p-value) ---
from statsmodels.stats.diagnostic import acorr_ljungbox
e_train = ar_fit.resid.astype(float).values   # AR(1) residuals (also used in the next cell)
try:
    lb = acorr_ljungbox(e_train, lags=[10], return_df=True)
    lb_p = float(lb["lb_pvalue"].iloc[0])     # test for remaining autocorrelation
    lb_stat = float(lb["lb_stat"].iloc[0])
except Exception:
    lb_p = np.nan; lb_stat = np.nan
pd.DataFrame([{"phi": phi, "ljungbox_stat_lag10": lb_stat, "ljungbox_p_lag10": lb_p}]) \
    .to_csv(DIAG_DIR / f"{m_slug}_ar1_diagnostics.csv", index=False)

In the graph above, we can see how our AR(1) process, in orange, estimates the next day’s temperature from 1981 to 2016. We can further justify the use of the AR(1) process, and gain more intuition for it, by plotting Xₜ against Xₜ₋₁. Here we can see that these values are clearly correlated. Framing it this way also shows that the AR(1) process is just linear regression. We don’t even have to add a drift/intercept term, since we have already removed the mean; hence our intercept is always zero.
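To see that this really is a no-intercept linear regression, we can re-estimate the coefficient with ordinary least squares on the lagged pairs and compare it to the AR(1) fit; a quick sanity check:

import numpy as np

x_t = x_train.iloc[1:].values      # today's deviation
x_tm1 = x_train.iloc[:-1].values   # yesterday's deviation
phi_ols, *_ = np.linalg.lstsq(x_tm1.reshape(-1, 1), x_t, rcond=None)
print(f"phi from AR(1) fit: {phi:.4f} | phi from no-intercept OLS: {phi_ols[0]:.4f}")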

Now, looking at our AR(1) residuals, we can begin to think about how to model the volatility of our temperature across time. From the graph below, we can see how our residuals pulse through time in a seemingly periodic fashion. We can postulate that the points in the year with larger residuals correspond to periods of higher volatility.
# --- Residuals from the AR(1) model ---
e_train = ar_fit.resid.astype(float).values
fig = plt.figure(figsize=(12,4))
plt.plot(train["Date"].iloc[1:], e_train, lw=1)
plt.axhline(0, color="k", lw=1)
plt.title("Mumbai — AR(1) residuals ε_t")
plt.xlabel("Date"); plt.ylabel("ε_t")
fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_ar1_residuals_timeseries.png", dpi=160); plt.close(fig)

Volatility Fourier Series
Our solution is to model the volatility using a Fourier series, much like we did for the mean. To do this, we must transform our residual εₜ. We work with the squared residual εₜ², because volatility cannot be negative by definition.
\[ \varepsilon_t = \sigma_t \eta_t \qquad \varepsilon_t^{2} = \sigma_t^{2}\eta_t^{2} \]
We can think of these residuals as the product of standardized shocks ηₜ, with mean 0 and variance 1 (representing the randomness), and the volatility σₜ. We want to isolate σₜ. This is easier after taking the log; more importantly, doing so makes our error terms additive.
\[ y_t := \log(\varepsilon_t^2) = \underbrace{\log\sigma_t^2}_{\text{deterministic, seasonal}} + \underbrace{\log\eta_t^2}_{\text{random noise}} \]
In summary, modeling log(εₜ²) helps to:
- keep σ̂ₜ² > 0
- reduce the effect that outliers have on the fit.
Once we have fitted log(εₜ²), we can recover the variance σ̂ₜ² by exponentiating.
\[ \widehat{\sigma}_t^{\,2} = \exp\!\big(\widehat{y}_t\big) \]
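The next cell assumes that `vol_fit`, `sigma_tr`, and `sigma_te` already exist. A minimal sketch of how they might be produced under the approach just described: regress log(εₜ²) on the seasonal harmonics, then recover σ̂(t) by exponentiating half the fitted log-variance.

import numpy as np
import statsmodels.formula.api as smf

e_train = ar_fit.resid.astype(float)     # AR(1) residuals epsilon_t
vol_df = train.iloc[1:].copy()           # residuals start at the second observation
vol_df["log_eps2"] = np.log(e_train.values**2 + EPS)

# Regress log(eps_t^2) on the first VOL_HARM cos/sin pairs
vol_terms = [f"{b}{k}" for k in range(1, VOL_HARM + 1) for b in ("cos", "sin")]
vol_fit = smf.ols("log_eps2 ~ " + " + ".join(vol_terms), data=vol_df).fit()

# sigma_hat(t) = exp(0.5 * fitted log-variance), on the train and test sets
sigma_tr = np.exp(0.5 * vol_fit.predict(vol_df))
sigma_te = np.exp(0.5 * vol_fit.predict(test))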
# =======================================================
# 3) VOLATILITY REGRESSION: fit log(ε_t^2) on seasonal harmonics
# =======================================================
if vol_fit is not None:
    # Compute the log-squared residuals (proxy for the variance)
    log_eps2 = np.log(e_train**2 + 1e-12)
    # Use cosine/sine harmonics as regressors for volatility
    feats = train.iloc[1:][[c for c in train.columns if c.startswith(("cos","sin"))][:2*VOL_HARM]]
    vol_terms = [f"{b}{k}" for k in range(1, VOL_HARM + 1) for b in ("cos","sin")]
    Xbeta = np.asarray(vol_fit.predict(train.iloc[1:][vol_terms]), dtype=float)
    # --- Time plot: observed vs fitted log-variance ---
    fig = plt.figure(figsize=(12,4))
    plt.plot(train["Date"].iloc[1:], log_eps2, lw=1, label="log(ε_t^2)")
    plt.plot(train["Date"].iloc[1:], Xbeta, lw=2, label="Fitted log-variance")
    plt.title("Mumbai — Volatility regression (log ε_t^2)")
    plt.xlabel("Date"); plt.ylabel("log variance")
    plt.legend()
    fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_volatility_logvar_timeseries.png", dpi=160); plt.close(fig)
    # --- Plot: estimated volatility σ̂(t) over time ---
    fig = plt.figure(figsize=(12,4))
    plt.plot(train["Date"].iloc[1:], sigma_tr, lw=1.5, label="σ̂ (train)")
    if 'sigma_te' in globals():
        plt.plot(test["Date"], sigma_te, lw=1.5, label="σ̂ (test)")
    plt.title("Mumbai — Conditional volatility σ̂(t)")
    plt.xlabel("Date"); plt.ylabel("σ̂")
    plt.legend()
    fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_volatility_sigma_timeseries.png", dpi=160); plt.close(fig)
    # --- Coefficients of the volatility regression (with CI) ---
    vol_coef = coef_ci_frame(vol_fit).sort_values("estimate")
    fig = plt.figure(figsize=(8, max(4, 0.4*len(vol_coef))))
    y = np.arange(len(vol_coef))
    plt.errorbar(vol_coef["estimate"], y, xerr=1.96*vol_coef["stderr"], fmt="o", capsize=3)
    plt.yticks(y, vol_coef["term"])
    plt.axvline(0, color="k", lw=1)
    plt.title("Mumbai — Volatility model coefficients (95% CI)")
    plt.xlabel("Estimate")
    fig.tight_layout(); fig.savefig(DIAG_DIR / f"{m_slug}_volatility_coefficients.png", dpi=160); plt.close(fig)
    # Save the volatility regression summary
    save_model_summary_txt(vol_fit, DIAG_DIR / f"{m_slug}_volatility_summary.txt")
else:
    print("Volatility model not available (too few points or regression failed). Skipping vol plots.")
print("Diagnostics saved to:", DIAG_DIR)

\[ dT(t) = ds(t) - \kappa \big( T(t) - s(t) \big)\,dt + \exp\!\left(\tfrac{1}{2}\big(\alpha + \beta^{T} z(t)\big)\right) dB(t) \]
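With s(t), κ, and σ(t) in hand, a discretized (Euler–Maruyama, daily steps) version of this SDE can be simulated over the test period. The sketch below is an illustration under the assumptions made so far (including the article's identification of κ with the AR(1) ϕ), not the article's own forecasting code:

import numpy as np

rng = np.random.default_rng(SEED)
s_test = mean_fit.predict(test).values   # seasonal mean s(t) over the test period
sig_test = np.asarray(sigma_te)          # fitted sigma(t) over the test period
kappa = phi                              # the article identifies kappa with the AR(1) phi

n_paths, n_days = 200, len(test)
paths = np.empty((n_paths, n_days))
paths[:, 0] = s_test[0]                  # start each path at the seasonal mean
for i in range(1, n_days):
    # Euler-Maruyama step with dt = 1 day: ds(t) - kappa (T - s) dt + sigma dB
    drift = (s_test[i] - s_test[i-1]) - kappa * (paths[:, i-1] - s_test[i-1])
    paths[:, i] = paths[:, i-1] + drift + sig_test[i-1] * rng.normal(size=n_paths)

forecast_mean = paths.mean(axis=0)       # point forecast: average across simulated paths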
Demo of log stabilization
The graphs below illustrate how the logarithm helps when fitting volatility. The residual εₜ² contains significant outliers, partly due to the squaring operation performed to ensure positivity. Without the log stabilization, the large shocks dominate the fit, and the final result is biased.




# Fix the seasonal plot from the previous cell (indexing bug) and
# add a *second scenario* with rare large outliers to illustrate instability
# when regressing on skewed eps^2 directly.
# ------------------------------
# 1) Seasonal view (fixed indexing)
# ------------------------------
# Recreate the arrays from the prior simulation cell by re-executing the same seed & setup
np.random.seed(7)          # deterministic reproducibility
n_days = 730               # two synthetic years (365 * 2)
t = np.arange(n_days)      # day index 0..729
omega = 2 * np.pi / 365.0  # seasonal frequency (one-year period)
# True (data-generating) log-variance is seasonal via sin/cos
a_true, b_true, c_true = 0.2, 0.9, -0.35
log_var_true = a_true + b_true * np.sin(omega * t) + c_true * np.cos(omega * t)
# Convert log-variance to standard deviation: sigma = exp(0.5 * log var)
sigma_true = np.exp(0.5 * log_var_true)
# White-noise innovations; variance scaled by sigma_true
z = np.random.normal(size=n_days)
eps = sigma_true * z       # heteroskedastic residuals
eps2 = eps**2              # "variance-like" target if regressing raw eps^2
# Design matrix with intercept + annual sin/cos harmonics
X = np.column_stack([np.ones(n_days), np.sin(omega * t), np.cos(omega * t)])
# --- Fit A: OLS on raw eps^2 (sensitive to skew/outliers) ---
beta_A, *_ = np.linalg.lstsq(X, eps2, rcond=None)
var_hat_A = X @ beta_A                       # fitted variance (can be negative from OLS)
var_hat_A = np.clip(var_hat_A, 1e-8, None)   # clip to avoid negatives
sigma_hat_A = np.sqrt(var_hat_A)             # convert to sigma
# --- Fit B: OLS on log(eps^2) (stabilizes scale & reduces skew) ---
eps_safe = 1e-12                             # small epsilon to avoid log(0)
y_log = np.log(eps2 + eps_safe)              # stabilized target
beta_B, *_ = np.linalg.lstsq(X, y_log, rcond=None)
log_var_hat_B = X @ beta_B
sigma_hat_B = np.exp(0.5 * log_var_hat_B)
# Day-of-year index for seasonal averaging across years
doy = t % 365
# Correct ordering for the first year's DOY values
order365 = np.argsort(doy[:365])
def seasonal_mean(x):
    """
    Average the two years day-by-day to get a single seasonal curve.
    Assumes x has length 730 (two years); returns a length-365 array.
    """
    return 0.5 * (x[:365] + x[365:730])
# Plot one "synthetic year" view of the seasonal sigma pattern
plt.figure(figsize=(12, 5))
plt.plot(np.sort(doy[:365]), seasonal_mean(sigma_true)[order365], label="True sigma seasonality")
plt.plot(np.sort(doy[:365]), seasonal_mean(sigma_hat_A)[order365], label="Fitted from eps^2 regression", alpha=0.9)
plt.plot(np.sort(doy[:365]), seasonal_mean(sigma_hat_B)[order365], label="Fitted from log(eps^2) regression", alpha=0.9)
plt.title("Seasonal Volatility Pattern: True vs Fitted (one-year view) – Fixed")
plt.xlabel("Day of year")
plt.ylabel("Sigma")
plt.legend()
plt.tight_layout()
plt.show()
# ------------------------------
# 2) OUTLIER SCENARIO to illustrate instability
# ------------------------------
np.random.seed(21)  # separate seed for the outlier experiment
def run_scenario(n_days=730, outlier_rate=0.05, outlier_scale=8.0):
    """
    Generate two years of heteroskedastic residuals with occasional huge shocks
    to mimic heavy tails. Compare:
      - Fit A: regress raw eps^2 on sin/cos (can be unstable, negative fits)
      - Fit B: regress log(eps^2) on sin/cos (more stable under heavy tails)
    Return fitted sigmas and error metrics (MAE/MAPE), plus diagnostics.
    """
    t = np.arange(n_days)
    omega = 2 * np.pi / 365.0
    # Same true seasonal log-variance as above
    a_true, b_true, c_true = 0.2, 0.9, -0.35
    log_var_true = a_true + b_true * np.sin(omega * t) + c_true * np.cos(omega * t)
    sigma_true = np.exp(0.5 * log_var_true)
    # Base normal innovations
    z = np.random.normal(size=n_days)
    # Inject rare, huge shocks to create heavy tails in eps^2
    mask = np.random.rand(n_days) < outlier_rate
    z[mask] *= outlier_scale
    # Heteroskedastic residuals and their squares
    eps = sigma_true * z
    eps2 = eps**2
    # Same sin/cos design
    X = np.column_stack([np.ones(n_days), np.sin(omega * t), np.cos(omega * t)])
    # --- Fit A: raw eps^2 on X (OLS) ---
    beta_A, *_ = np.linalg.lstsq(X, eps2, rcond=None)
    var_hat_A_raw = X @ beta_A
    neg_frac = np.mean(var_hat_A_raw < 0.0)          # fraction of negative variance predictions
    var_hat_A = np.clip(var_hat_A_raw, 1e-8, None)   # clip to ensure non-negative variance
    sigma_hat_A = np.sqrt(var_hat_A)
    # --- Fit B: log(eps^2) on X (OLS on the log scale) ---
    y_log = np.log(eps2 + 1e-12)
    beta_B, *_ = np.linalg.lstsq(X, y_log, rcond=None)
    log_var_hat_B = X @ beta_B
    sigma_hat_B = np.exp(0.5 * log_var_hat_B)
    # Error metrics comparing fitted sigmas to the true sigma path
    mae = lambda a, b: np.mean(np.abs(a - b))
    mape = lambda a, b: np.mean(np.abs((a - b) / (a + 1e-12))) * 100
    mae_A = mae(sigma_true, sigma_hat_A)
    mae_B = mae(sigma_true, sigma_hat_B)
    mape_A = mape(sigma_true, sigma_hat_A)
    mape_B = mape(sigma_true, sigma_hat_B)
    return {
        "t": t,
        "sigma_true": sigma_true,
        "sigma_hat_A": sigma_hat_A,
        "sigma_hat_B": sigma_hat_B,
        "eps2": eps2,
        "y_log": y_log,
        "neg_frac": neg_frac,
        "mae_A": mae_A, "mae_B": mae_B,
        "mape_A": mape_A, "mape_B": mape_B
    }
# Run with 5% outliers scaled 10x to make the point obvious
res = run_scenario(outlier_rate=0.05, outlier_scale=10.0)
print("\nOUTLIER SCENARIO (5% of days have 10x shocks) — illustrating instability when using eps^2 directly")
print(f"  MAE  (sigma): raw eps^2 regression = {res['mae_A']:.4f} | log(eps^2) regression = {res['mae_B']:.4f}")
print(f"  MAPE (sigma): raw eps^2 regression = {res['mape_A']:.2f}% | log(eps^2) regression = {res['mape_B']:.2f}%")
print(f"  Negative variance predictions before clipping (raw fit): {res['neg_frac']:.2%}")
# Visual comparison: true sigma vs the two fitted approaches under outliers
plt.figure(figsize=(12, 5))
plt.plot(res["t"], res["sigma_true"], label="True sigma(t)")
plt.plot(res["t"], res["sigma_hat_A"], label="Fitted sigma from eps^2 regression", alpha=0.9)
plt.plot(res["t"], res["sigma_hat_B"], label="Fitted sigma from log(eps^2) regression", alpha=0.9)
plt.title("True vs Fitted Volatility with Rare Large Shocks")
plt.xlabel("Day")
plt.ylabel("Sigma")
plt.legend()
plt.tight_layout()
plt.show()
# Show how the targets behave when outliers are present
plt.figure(figsize=(12, 5))
plt.plot(res["t"], res["eps2"], label="eps^2 (now extremely heavy-tailed due to outliers)")
plt.title("eps^2 under outliers: unstable target for regression")
plt.xlabel("Day")
plt.ylabel("eps^2")
plt.legend()
plt.tight_layout()
plt.show()
plt.figure(figsize=(12, 5))
plt.plot(res["t"], res["y_log"], label="log(eps^2): compressed & stabilized")
plt.title("log(eps^2) under outliers: stabilized scale")
plt.xlabel("Day")
plt.ylabel("log(eps^2)")
plt.legend()
plt.tight_layout()
plt.show()
Conclusion
Now that we have fitted all of these parameters of the SDE, we can consider using it to forecast. But getting here hasn't been easy. Our solution relied heavily on two key ideas.
- Functions can be approximated with a Fourier series.
- The mean-reversion parameter κ can be obtained from ϕ in our AR(1) process.
Whenever I fit a stochastic differential equation, or do anything stochastic, I find it funny to consider how the word derives from stokhos, meaning aim. Even funnier, the related verb stokhazesthai means to aim at, or to guess. That process must have involved some error and some terrible aim.
References
Alexandridis, A. K., & Zapranis, A. D. (2013). Weather Derivatives: Modeling and Pricing Weather-Related Risk. Springer. https://doi.org/10.1007/978-1-4614-6071-8
Benth, F. E., & Šaltytė Benth, J. (2007). The volatility of temperature and pricing of weather derivatives. Quantitative Finance, 7(5), 553–561. https://doi.org/10.1080/14697680601155334
