PINE LIBRARY
TimeSeriesBenchmarkMeasures

Library "TimeSeriesBenchmarkMeasures"
Time Series Benchmark Metrics.
Provides a comprehensive set of functions for benchmarking time series data, allowing you to evaluate the accuracy, stability, and risk characteristics of various models or strategies. The functions cover a wide range of statistical measures, including accuracy metrics (MAE, MSE, RMSE, NRMSE, MAPE, SMAPE), autocorrelation analysis (ACF, ADF), and risk measures (Theil's Inequality, Sharpness, Resolution, Coverage, and Pinball).
___
Reference:
- github.com/PYFTS/pyFTS/blob/master/pyFTS/benchmarks/Measures.py
- medium.com/analytics-vidhya/assessment-of-accuracy-metrics-for-time-series-forecasting-bc115b655705
- salesforce.com/blog/gift-eval-time-series-benchmark/
- towardsdatascience.com/an-overview-of-forecasting-performance-metrics-ef548dad0134/
mae(actual, forecasts)
In statistics, mean absolute error (MAE) is a measure of errors between paired observations expressing the same phenomenon. Examples of Y versus X include comparisons of predicted versus observed, subsequent time versus initial time, and one technique of measurement versus an alternative technique of measurement.
Parameters:
actual (array<float>): List of actual values.
forecasts (array<float>): List of forecasts values.
Returns: - Mean Absolute Error (MAE).
___
Reference:
- en.wikipedia.org/wiki/Mean_absolute_error
- The Orange Book of Machine Learning - Carl McBride Ellis
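For reference, a minimal Python sketch of the MAE formula described above (an illustrative reimplementation, not the library's Pine source):
```python
import numpy as np

def mae(actual, forecasts):
    # Mean Absolute Error: average magnitude of the paired errors |actual - forecast|.
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    return float(np.mean(np.abs(a - f)))

# Example: mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]) -> 0.5
```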
mse(actual, forecasts)
The Mean Squared Error (MSE) is a measure of the quality of an estimator. As it is derived from the square of Euclidean distance, it is always a positive value that decreases as the error approaches zero.
Parameters:
actual (array<float>): List of actual values.
forecasts (array<float>): List of forecasts values.
Returns: - Mean Squared Error (MSE).
___
Reference:
- en.wikipedia.org/wiki/Mean_squared_error
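The squared counterpart, sketched the same way (illustrative Python only):
```python
import numpy as np

def mse(actual, forecasts):
    # Mean Squared Error: average of the squared paired errors; penalises large errors more than MAE.
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    return float(np.mean((a - f) ** 2))
```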
rmse(targets, forecasts, order, offset)
Calculates the Root Mean Squared Error (RMSE) between target observations and forecasts. RMSE is a standard measure of the differences between values predicted by a model and the values actually observed.
Parameters:
targets (array<float>): List of target observations.
forecasts (array<float>): List of forecasts.
order (int): Model order parameter that determines the starting position in the targets array, `default=0`.
offset (int): Forecast offset related to target, `default=0`.
Returns: - RMSE value.
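A hedged Python sketch of RMSE with the `order`/`offset` parameters; how exactly the library pairs shifted targets and forecasts is not documented here, so the alignment below (targets sliced from `order`, forecasts from `offset`) is an assumption:
```python
import numpy as np

def rmse(targets, forecasts, order=0, offset=0):
    # Root Mean Squared Error over the overlapping, shifted portions of the two series.
    # The slicing convention is an assumption about how `order` and `offset` align the inputs.
    t = np.asarray(targets, dtype=float)[order:]
    f = np.asarray(forecasts, dtype=float)[offset:]
    n = min(len(t), len(f))
    return float(np.sqrt(np.mean((t[:n] - f[:n]) ** 2)))
```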
nmrse(targets, forecasts, order, offset)
Normalised Root Mean Squared Error.
Parameters:
targets (array<float>): List of target observations.
forecasts (array<float>): List of forecasts.
order (int): Model order parameter that determines the starting position in the targets array, `default=0`.
offset (int): Forecast offset related to target, `default=0`.
Returns: - NRMSE value.
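An illustrative NRMSE sketch; normalising the RMSE by the target range (max - min) follows the pyFTS reference implementation, but the exact normalisation used by this library is an assumption:
```python
import numpy as np

def nrmse(targets, forecasts):
    # RMSE scaled by the range of the targets, giving a unit-free error measure.
    # Dividing by (max - min) is an assumed convention; other definitions divide by the mean.
    t = np.asarray(targets, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    rmse = np.sqrt(np.mean((t - f) ** 2))
    return float(rmse / (t.max() - t.min()))
```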
rmse_interval(targets, forecasts)
Root Mean Squared Error for a set of interval windows. Computes RMSE by converting interval forecasts (with min/max bounds) into point forecasts using the mean of the interval bounds, then compares against actual target values.
Parameters:
targets (array<float>): List of target observations.
forecasts (matrix<float>): The forecasted values in matrix format with at least 2 columns (min, max).
Returns: - RMSE value for the combined interval list.
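A short sketch of the interval variant as described: each (min, max) row is collapsed to its midpoint, then ordinary RMSE is computed against the targets (illustrative Python, not the Pine source):
```python
import numpy as np

def rmse_interval(targets, forecasts):
    # `forecasts` has shape (n, 2): column 0 = lower bound, column 1 = upper bound.
    t = np.asarray(targets, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    mid = (f[:, 0] + f[:, 1]) / 2.0     # point forecast = interval midpoint
    return float(np.sqrt(np.mean((t - mid) ** 2)))
```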
mape(targets, forecasts)
Mean Absolute Percentage Error (MAPE).
Parameters:
targets (array<float>): List of target observations.
forecasts (array<float>): List of forecasts.
Returns: - MAPE value.
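An illustrative MAPE sketch; reporting the result as a percentage (the `* 100` scaling) matches the pyFTS reference but is an assumption about this library's output:
```python
import numpy as np

def mape(targets, forecasts):
    # Mean Absolute Percentage Error: average relative error, scaled to percent.
    # Undefined when any target is zero.
    t = np.asarray(targets, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    return float(np.mean(np.abs((t - f) / t)) * 100.0)
```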
smape(targets, forecasts, mode)
Symmetric Mean Absolute Percentage Error (SMAPE). Calculates the symmetric mean absolute percentage error between actual targets and forecasts. SMAPE is a common metric for evaluating forecast accuracy, expressed as a percentage; lower values indicate better forecast accuracy.
Parameters:
targets (array<float>): List of target observations.
forecasts (array<float>): List of forecasts.
mode (int): Type of method. `0` (default): `sum(abs(Fi-Ti)) / sum(Fi+Ti)`; `1`: `mean(abs(Fi-Ti) / ((Fi + Ti) / 2))`; `2`: `mean(abs(Fi-Ti) / (abs(Fi) + abs(Ti))) * 100`.
Returns: - SMAPE value.
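The three `mode` formulas above translate directly into the following Python sketch (illustrative only):
```python
import numpy as np

def smape(targets, forecasts, mode=0):
    # Symmetric MAPE variants, mirroring the documented mode formulas.
    t = np.asarray(targets, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    if mode == 0:
        return float(np.sum(np.abs(f - t)) / np.sum(f + t))           # sum(|Fi-Ti|) / sum(Fi+Ti)
    if mode == 1:
        return float(np.mean(np.abs(f - t) / ((f + t) / 2.0)))        # mean(|Fi-Ti| / ((Fi+Ti)/2))
    return float(np.mean(np.abs(f - t) / (np.abs(f) + np.abs(t))) * 100.0)  # mode 2, in percent
```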
mape_interval(targets, forecasts)
Mean Absolute Percentage Error (MAPE) for a set of interval windows.
Parameters:
targets (array<float>): List of target observations.
forecasts (matrix<float>): The forecasted values in matrix format with at least 2 columns (min, max).
Returns: - MAPE value for the combined interval list.
acf(data, k)
Autocorrelation Function (ACF) for a time series at a specified lag.
Parameters:
data (array<float>): Sample data of the observations.
k (int): The lag period for which to calculate the autocorrelation. Must be a non-negative integer.
Returns: - The autocorrelation value at the specified lag, ranging from -1 to 1.
___
The autocorrelation function measures the linear dependence between observations in a time series
at different time lags. It quantifies how well the series correlates with itself at different
time intervals, which is useful for identifying patterns, seasonality, and the appropriate
lag structure for time series models.
ACF values close to 1 indicate strong positive correlation, values close to -1 indicate
strong negative correlation, and values near 0 indicate no linear correlation.
___
Reference:
- statisticsbyjim.com/time-series/autocorrelation-partial-autocorrelation/
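A compact Python sketch of the sample autocorrelation at lag `k` (the standard estimator; assumed to match the library's definition):
```python
import numpy as np

def acf(data, k):
    # Sample autocorrelation: covariance of the demeaned series with its k-lagged copy,
    # normalised by the overall variance (the lag-0 autocovariance).
    x = np.asarray(data, dtype=float)
    xm = x - x.mean()
    denom = np.sum(xm ** 2)
    num = denom if k == 0 else np.sum(xm[:-k] * xm[k:])
    return float(num / denom)
```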
acf_multiple(data, k)
Autocorrelation function (ACF) for a time series at a set of specified lags.
Parameters:
data (array<float>): Sample data of the observations.
k (array<int>): List of lag periods for which to calculate the autocorrelation. All lags must be non-negative integers.
Returns: - List of ACF values for provided lags.
___
The autocorrelation function measures the linear dependence between observations in a time series
at different time lags. It quantifies how well the series correlates with itself at different
time intervals, which is useful for identifying patterns, seasonality, and the appropriate
lag structure for time series models.
ACF values close to 1 indicate strong positive correlation, values close to -1 indicate
strong negative correlation, and values near 0 indicate no linear correlation.
___
Reference:
- statisticsbyjim.com/time-series/autocorrelation-partial-autocorrelation/
adfuller(data, n_lag, conf)
Augmented Dickey-Fuller test for stationarity.
Parameters:
data (array<float>): Data series.
n_lag (int): Maximum lag.
conf (string): Confidence level used to select the critical value (`90%`, `95%`, `99%`).
Returns: - `adf` The test statistic.
- `crit` Critical value for the test statistic at the selected confidence level.
- `nobs` Number of observations used for the ADF regression and calculation of the critical values.
___
The Augmented Dickey-Fuller test is used to determine whether a time series is stationary
or contains a unit root (non-stationary). The null hypothesis is that the series has a unit root
(is non-stationary), while the alternative hypothesis is that the series is stationary.
A stationary time series has statistical properties that do not change over time, making it
suitable for many time series forecasting models. If the test statistic is less than the
critical value, we reject the null hypothesis and conclude the series is stationary.
___
Reference:
- jstor.org/stable/2286348
- en.wikipedia.org/wiki/Augmented_Dickey–Fuller_test
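For intuition, here is a heavily simplified Python sketch of the non-augmented Dickey-Fuller case (constant term, no lagged-difference regressors): regress Δy_t on [1, y_{t-1}] and take the t-statistic of the y_{t-1} coefficient. The full augmented test adds `n_lag` lagged differences to the regression; the critical values below are approximate large-sample values for the constant-only case and are assumptions, not the library's exact table:
```python
import numpy as np

def dickey_fuller(data, conf="95%"):
    # Simplified (non-augmented) Dickey-Fuller test.
    y = np.asarray(data, dtype=float)
    dy = np.diff(y)                                   # dependent variable: first differences
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # regressors: constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    adf = beta[1] / se                                # t-statistic of the y[t-1] coefficient
    crit = {"90%": -2.57, "95%": -2.86, "99%": -3.43}[conf]  # approximate asymptotic values
    nobs = len(dy)
    return adf, crit, nobs

# Reject the unit-root null (conclude stationarity) when adf < crit.
```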
theils_inequality(targets, forecasts)
Calculates Theil's Inequality Coefficient, a measure of forecast accuracy that quantifies the relative difference between actual and predicted values.
Parameters:
targets (array<float>): List of target observations.
forecasts (array<float>): List of forecast values.
Returns: - Theil's Inequality Coefficient value, value closer to 0 is better.
___
Theil's Inequality Coefficient is calculated as: `sqrt(Sum((y_i - f_i)^2)) / (sqrt(Sum(y_i^2)) + sqrt(Sum(f_i^2)))`
where `y_i` represents actual values and `f_i` represents forecast values.
By this definition the coefficient is bounded between 0 and 1, with 0 indicating perfect forecast accuracy.
___
Reference:
- en.wikipedia.org/wiki/Theil_index
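The coefficient above translates line-for-line into this illustrative Python sketch:
```python
import numpy as np

def theils_inequality(targets, forecasts):
    # sqrt(sum((y - f)^2)) / (sqrt(sum(y^2)) + sqrt(sum(f^2))), bounded in [0, 1].
    y = np.asarray(targets, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    return float(np.sqrt(np.sum((y - f) ** 2)) /
                 (np.sqrt(np.sum(y ** 2)) + np.sqrt(np.sum(f ** 2))))
```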
sharpness(forecasts)
The average width of the forecast intervals across all observations, representing the sharpness or precision of the predictive intervals.
Parameters:
forecasts (matrix<float>): The forecasted values in matrix format with at least 2 columns (min, max).
Returns: - The sharpness level, which is the average width of all prediction intervals across the forecast horizon.
___
Sharpness is an important metric for evaluating forecast quality. It measures how narrow or wide the
prediction intervals are. Narrower intervals (a smaller sharpness value) indicate greater precision in
the forecast intervals, while wider intervals (a larger value) indicate less precision.
The sharpness metric is calculated as the mean of the interval widths across all observations, where
each interval width is the difference between the upper and lower bounds of the prediction interval.
Note: This function assumes that the forecasts matrix has at least 2 columns, with the first column
representing the lower bounds and the second column representing the upper bounds of prediction intervals.
___
Reference:
- Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts. otexts.com/fpp2/
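Expressed as an illustrative Python sketch over an (n x 2) matrix of interval bounds:
```python
import numpy as np

def sharpness(forecasts):
    # Mean prediction-interval width: average of (upper - lower) across all rows.
    f = np.asarray(forecasts, dtype=float)
    return float(np.mean(f[:, 1] - f[:, 0]))
```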
resolution(forecasts)
Calculates the resolution of forecast intervals, measuring the average absolute difference between individual forecast interval widths and the overall sharpness measure.
Parameters:
forecasts (matrix<float>): The forecasted values in matrix format with at least 2 columns (min, max).
Returns: - The average absolute difference between individual forecast interval widths and the overall sharpness measure, representing the resolution of the forecasts.
___
Resolution is a key metric for evaluating forecast quality that measures the consistency of prediction
interval widths. It quantifies how much the individual forecast intervals vary from the average interval
width (sharpness). A low resolution value indicates that the forecast intervals are relatively consistent
across observations, while a high value indicates substantial variation in interval widths.
The resolution is calculated as the mean absolute deviation of individual interval widths from the
overall sharpness value. This provides insight into the uniformity of the forecast uncertainty
estimates across the forecast horizon.
Note: This function requires the forecasts matrix to have at least 2 columns (min, max) representing
the lower and upper bounds of prediction intervals.
___
Reference:
- [Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359-378.](sites.stat.washington.edu/raftery/Research/PDF/Gneiting2007jasa.pdf)
- [JSTOR record](jstor.org/stable/i27639812)
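As described, resolution is the mean absolute deviation of each interval's width from the sharpness (average width); an illustrative sketch:
```python
import numpy as np

def resolution(forecasts):
    # Mean absolute deviation of interval widths from their average (the sharpness).
    f = np.asarray(forecasts, dtype=float)
    widths = f[:, 1] - f[:, 0]
    return float(np.mean(np.abs(widths - widths.mean())))
```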
coverage(targets, forecasts)
Calculates the coverage probability, which is the percentage of target values that fall within the corresponding forecasted prediction intervals.
Parameters:
targets (array<float>): List of target values.
forecasts (matrix<float>): The forecasted values in matrix format with at least 2 columns (min, max).
Returns: - Percent of target values that fall within their corresponding forecast intervals, expressed as a decimal value between 0 and 1 (or 0% and 100%).
___
Coverage probability is a crucial metric for evaluating the reliability of prediction intervals.
It measures how well the forecast intervals capture the actual observed values. An ideal forecast
should have a coverage probability close to the nominal confidence level (e.g., 90%, 95%, or 99%).
For example, if a 95% prediction interval is used, we expect approximately 95% of the actual
target values to fall within those intervals. If the coverage is significantly lower than the
nominal level, the intervals may be too narrow; if it's significantly higher, the intervals may
be too wide.
Note: This function requires the targets array and forecasts matrix to have the same number of
observations, and the forecasts matrix must have at least 2 columns (min, max) representing
the lower and upper bounds of prediction intervals.
___
Reference:
- [Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of business & economic statistics, 13(3), 253-263.](jstor.org/stable/1392185)
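An illustrative sketch of coverage: the fraction of targets that land inside their [min, max] interval:
```python
import numpy as np

def coverage(targets, forecasts):
    # Share of observations where lower <= target <= upper.
    t = np.asarray(targets, dtype=float)
    f = np.asarray(forecasts, dtype=float)
    inside = (t >= f[:, 0]) & (t <= f[:, 1])
    return float(np.mean(inside))

# Example: coverage([1.0, 2.0, 5.0], [[0.5, 1.5], [1.0, 3.0], [1.0, 2.0]]) -> 2/3
```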
pinball(tau, target, forecast)
Pinball loss function, measures the asymmetric loss for quantile forecasts.
Parameters:
tau (float): The quantile level (between 0 and 1), where 0.5 represents the median.
target (float): The actual observed value to compare against.
forecast (float): The forecasted value.
Returns: - The Pinball loss value, which quantifies the distance between the forecast and target relative to the specified quantile level.
___
The Pinball loss function is specifically designed for evaluating quantile forecasts. It is
asymmetric, meaning it penalizes underestimates and overestimates differently depending on the
quantile level being evaluated.
For a given quantile τ, the loss function is defined as:
- If target >= forecast: (target - forecast) * τ
- If target < forecast: (forecast - target) * (1 - τ)
This loss function is commonly used in quantile regression and probabilistic forecasting
to evaluate how well forecasts capture specific quantiles of the target distribution.
___
Reference:
- [Forecasting: Principles and Practice (3rd Edition). Chapter 5.9](otexts.com/fpp3/distaccuracy.html)
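The two-case definition above maps directly to this short Python sketch:
```python
def pinball(tau, target, forecast):
    # Asymmetric quantile (pinball) loss for a single observation.
    if target >= forecast:
        return (target - forecast) * tau
    return (forecast - target) * (1.0 - tau)

# Example: pinball(0.9, target=10.0, forecast=8.0) -> 1.8 (under-forecasting is penalised heavily at tau=0.9)
```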
pinball_mean(tau, targets, forecasts)
Calculates the mean pinball loss for quantile regression.
Parameters:
tau (float): The quantile level (between 0 and 1), where 0.5 represents the median.
targets (array<float>): The actual observed values to compare against.
forecasts (matrix<float>): The forecasted values in matrix format with at least 2 columns (min, max).
Returns: - The mean pinball loss value across all observations.
___
The pinball_mean() function computes the average Pinball loss across multiple observations,
making it suitable for evaluating overall forecast performance in quantile regression tasks.
This function leverages the asymmetric Pinball loss function to evaluate how well forecasts
capture specific quantiles of the target distribution. The choice of which column from the
forecasts matrix to use depends on the quantile level:
- For τ ≤ 0.5: Uses the first column (min) of forecasts
- For τ > 0.5: Uses the second column (max) of forecasts
This loss function is commonly used in quantile regression and probabilistic forecasting
to evaluate how well forecasts capture specific quantiles of the target distribution.
___
Reference:
- [Forecasting: Principles and Practice (3rd Edition). Chapter 5.9](otexts.com/fpp3/distaccuracy.html)
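A sketch of the averaging step, including the documented column selection by quantile level (illustrative Python, not the Pine source):
```python
import numpy as np

def pinball_mean(tau, targets, forecasts):
    # Mean pinball loss: score the lower bound (column 0) when tau <= 0.5,
    # the upper bound (column 1) otherwise, then average over all observations.
    col = 0 if tau <= 0.5 else 1
    losses = []
    for t, row in zip(targets, forecasts):
        f = row[col]
        losses.append((t - f) * tau if t >= f else (f - t) * (1.0 - tau))
    return float(np.mean(losses))
```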
Pine library
In true TradingView spirit, the author has published this Pine code as an open-source library so that other Pine programmers from our community can reuse it. Cheers to the author! You may use this library privately or in other open-source publications, but reuse of this code in publications is governed by House Rules.
Disclaimer
The information and publications are not meant to be, and do not constitute, financial, investment, trading, or other types of advice or recommendations supplied or endorsed by TradingView. Read more in the Terms of Use.