- AR (AutoRegressive): The model regresses the current observation on a number of its own lagged values.
- I (Integrated): It involves differencing the raw observations to make the time series stationary, a key step for the AR and MA components to work effectively.
- MA (Moving Average): The model expresses the current observation in terms of the residual errors from a number of previous time steps.
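The three components can be seen in isolation with a small simulation. The following is a minimal sketch using only base R; the seed, the series length, and the coefficients 0.6 and 0.4 are arbitrary values chosen for illustration, and the names `y`, `dy`, and `fit_sketch` are hypothetical.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Simulate a non-stationary series from an ARIMA(1,1,1) process
# (illustrative coefficients: AR = 0.6, MA = 0.4)
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.6, ma = 0.4), n = 200)

# I: first-order differencing subtracts each observation from the next one
dy <- diff(y)

# AR and MA: arima() differences internally when d = 1, then estimates
# one autoregressive and one moving average coefficient
fit_sketch <- arima(y, order = c(1, 1, 1))
print(coef(fit_sketch))

# The differenced series should vary far less than the integrated one
print(c(var_original = var(y), var_differenced = var(dy)))
```

Plotting `y` and `dy` side by side should show the wandering level of the original series disappear after differencing.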
if (!require("forecast", quietly = TRUE)) {
  install.packages("forecast")
}
# Load the 'forecast' library after ensuring it's installed
library(forecast)
# Assume we have a time series 'electricity_demand' in kWh
# This is our mock time series data
set.seed(123) # For reproducibility
electricity_demand <- ts(rnorm(60, mean = 1500, sd = 300), frequency = 12, start = c(2019, 1))
# First, we plot the data to check for any obvious trends or seasonality
plot(electricity_demand, main = "Monthly Electricity Demand", xlab = "Time", ylab = "Demand (kWh)")
# Next, we use the 'auto.arima' function to automatically select an ARIMA model (by default it minimizes AICc)
fit <- auto.arima(electricity_demand)
# Summarize the fit to understand the chosen model
summary(fit)
# Forecast the next 12 months (1 year) of electricity demand
future_demand <- forecast(fit, h = 12)
# Plot the forecasted demand
plot(future_demand, main = "ARIMA Model Forecast", xlab = "Time", ylab = "Demand (kWh)")
# Print the forecasted values
print(future_demand$mean)
Series: The summary begins by naming the time series that the model is applied to, here electricity_demand.
ARIMA(0,0,0)(0,0,1)[12] with non-zero mean: This is the model fitted to the data. The first set of parameters, (0,0,0), is the non-seasonal part (p,d,q): no autoregressive terms (AR), no differencing (I), and no moving average terms (MA). The second set, (0,0,1)[12], is the seasonal part (P,D,Q)[m]: one seasonal moving average term with a seasonal period of 12, which corresponds to monthly data with an annual cycle. "With non-zero mean" indicates that the model includes a mean term.
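Written out as an equation, this notation corresponds to the form below. It is a sketch under R's sign convention, in which moving average terms enter with a plus sign; the symbols are notation introduced here for illustration, not part of the R output.

```latex
y_t = \mu + \varepsilon_t + \Theta_1 \, \varepsilon_{t-12},
\qquad \varepsilon_t \ \text{white noise with variance } \sigma^2
```

Here y_t is the demand in month t, \mu is the mean term, \Theta_1 is the seasonal moving average coefficient (reported by R as sma1), and \varepsilon_{t-12} is the error from the same month one year earlier.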
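- sma1: The coefficient for the seasonal moving average term is -0.3059, with a standard error of 0.1746. It relates the current observation to the residual error from 12 months earlier, that is, the same month of the previous year. The estimate is less than two standard errors from zero, which is unsurprising because the mock data are independent random draws with no real seasonality.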
- mean: The mean of the series is estimated to be 1515.4874, with a standard error of 25.7335. This value represents the average monthly electricity demand around which the seasonal fluctuations occur.
- sigma^2: This represents the variance of the residuals from the model and is estimated to be 70026. A lower value is generally better, indicating that the model's predictions are closer to the actual values.
- log likelihood: The value of -419.41 is the logarithm of the likelihood of the data under the fitted model. When comparing models fitted to the same data, higher (less negative) values indicate a better fit.
- AIC (Akaike Information Criterion): 844.82 is a measure of the relative quality of the statistical model for a given set of data. It deals with the trade-off between the goodness of fit and the simplicity of the model. Lower AIC values suggest a better model.
- AICc (Corrected Akaike Information Criterion): 845.24 is a version of AIC adjusted for small sample sizes. It is the criterion that auto.arima minimizes by default.
- BIC (Bayesian Information Criterion): 851.1 is another criterion for model selection, with lower values indicating a better model, taking into account the complexity of the model.
- ME (Mean Error): 1.310345 is the average of the residuals (actual minus fitted values). The small positive value suggests that the fitted values are, on average, slightly below the actual values.
- RMSE (Root Mean Squared Error): 260.1771 indicates the typical magnitude of the model's in-sample errors, in kWh. Like the other measures in this list, it is computed on the training data.
- MAE (Mean Absolute Error): 209.0784 is the average magnitude of the errors in the predictions, without considering their direction.
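- MPE (Mean Percentage Error): -3.025129 is the average of the errors expressed as a percentage of the actual values. The negative sign indicates that, in percentage terms, the fitted values are on average about 3% above the actual values. ME and MPE can differ in sign because MPE gives more weight to errors on small observations.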
- MAPE (Mean Absolute Percentage Error): 14.37263% is the average absolute percent error per forecast, which gives an idea of the error magnitude in percentage terms.
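- MASE (Mean Absolute Scaled Error): 0.5804085 is the mean absolute error scaled by the in-sample mean absolute error of a naïve forecast (a seasonal naïve forecast for seasonal data). Values below one, as here, indicate that the model performs better than the naïve forecast; values greater than one indicate that it performs worse.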
- ACF1 (Autocorrelation of residuals at lag 1): -0.1024486 indicates a slight negative correlation between consecutive residuals. A value this close to zero is consistent with little autocorrelation remaining at lag 1; a Ljung-Box test over several lags, for example via checkresiduals(fit), gives a more formal check.
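Most of the quantities in the list above can be recomputed by hand, which makes their definitions concrete. The sketch below assumes the same mock data and uses base R's `arima()` as a stand-in for the model chosen by `auto.arima`, so the printed values may differ slightly from the summary output; the variable names are hypothetical. MASE is omitted because it also requires the naïve-forecast scaling.

```r
# Same mock data as in the example above
set.seed(123)
electricity_demand <- ts(rnorm(60, mean = 1500, sd = 300),
                         frequency = 12, start = c(2019, 1))

# Fit ARIMA(0,0,0)(0,0,1)[12] with a mean term using stats::arima
# (base R labels the mean coefficient "intercept")
fit_base <- arima(electricity_demand, order = c(0, 0, 0),
                  seasonal = list(order = c(0, 0, 1), period = 12))

n  <- length(electricity_demand)
k  <- length(coef(fit_base)) + 1   # sma1 and mean, plus one for sigma^2
ll <- as.numeric(logLik(fit_base))

# Information criteria from the log likelihood
aic  <- -2 * ll + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)
bic  <- -2 * ll + k * log(n)

# Training-set error measures from the residuals (actual minus fitted)
res <- residuals(fit_base)
pct <- 100 * res / electricity_demand
measures <- c(ME   = mean(res),
              RMSE = sqrt(mean(res^2)),
              MAE  = mean(abs(res)),
              MPE  = mean(pct),
              MAPE = mean(abs(pct)),
              ACF1 = acf(res, plot = FALSE)$acf[2])  # element 2 is lag 1

# Ljung-Box test on the residuals; fitdf = 1 accounts for the one MA coefficient
lb <- Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 1)

print(c(AIC = aic, AICc = aicc, BIC = bic))
print(measures)
print(lb)
```

As a check on the formulas, plugging the reported log likelihood of -419.41 into them with k = 3 and n = 60 gives an AIC of 844.82 and a BIC of about 851.10, matching the summary, and an AICc of about 845.25, which agrees with the reported 845.24 up to rounding of the log likelihood.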
For graduate students looking to apply time-series forecasting in their research, ARIMA models offer a robust framework. The power of ARIMA lies in its ability to transform non-stationary historical data into a stationary series that can reveal insights and forecast future trends. Whether it’s predicting stock prices, weather patterns, or energy demand, ARIMA provides a window into the future, grounded in the rigor of statistical analysis.
BridgeText can help you with all of your statistics needs.