Recursive Decomposition–Reconstruction–Ensemble Method with Complexity Traits


Please note this is a comparison between Version 2 by Dean Liu and Version 1 by Fang Wang.

Oil price forecasting has attracted considerable interest from academics and policymakers in recent years because of its widespread impact on various economic fields and markets. Thus, a novel method based on decomposition–reconstruction–ensemble for crude oil price forecasting is proposed. Based on the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) technique, a recursive CEEMDAN decomposition–reconstruction–ensemble model that considers the complexity traits of crude oil data is constructed. In this model, the steps of mode reconstruction, component prediction, and ensemble prediction are driven by complexity traits. For illustration and verification purposes, the West Texas Intermediate (WTI) and Brent crude oil spot prices are used as the sample data. The empirical results demonstrate that the proposed model has better prediction performance than the benchmark models.

oil price forecasting
complexity trait
component reconstruction
recursive CEEMDAN algorithm

1. Introduction

Crude oil, which is the world’s most important chemical raw material and strategic resource, ensures the normal operation of the national economy and people’s livelihoods, and it is a critical support for the development of the entire modern industrial society. Crude oil plays an important role in the global economy, political situation, and military strength of various countries as a basic energy source. As a result, changes in crude oil prices have sparked widespread concern worldwide. Because of the interactive impact of various factors such as the global economy, exchange rate changes, speculative behavior, and geopolitics, the oil price always exhibits non-linearity, non-stationarity, and high complexity, which poses significant challenges to crude oil price forecasting.

In the literature, various linear and nonlinear models have been used separately or in combination to make forecasts (see, e.g., Buyuksahin & Ertekin ^[1]). Linear methods assume that a given time series is regular with no sudden movements. This assumption is problematic because sudden movements with variation and extreme values are normal in many real-world time series such as financial data and renewable energy data (see, e.g., Xu et al. ^[2]). Numerous nonlinear time series prediction methods (see, e.g., Kantz & Schreiber ^[3]) have been proposed in the literature to capture these nonlinearities. Conventional linear methods can better approximate time series with no high volatility and multicollinearity. Zhang et al. ^[4] and Elman ^[5] show that nonlinear methods have advantages when modeling complex structures in time series with high accuracy. No universal model is suitable for all circumstances because each type of method outperforms the others in different domains. Capturing the general patterns in time series data using only one linear or nonlinear model appears to be difficult (see, e.g., Khashei & Bijari ^[6]). To overcome this limitation, Taskaya & Casey ^[7] proposed hybrid techniques with both linear and nonlinear models. The hybrid methodology is a synthesis of various prediction methods. It is usually a combination of traditional econometric models and AI algorithms (see, e.g., Wang et al. ^[8]) or a combination of different econometric models or AI algorithms.

In addition to the hybrid methodology, the ensemble learning algorithm is an important paradigm to overcome the limitations of single methods. Both the hybrid methodology and the ensemble method address the shortcomings of single models. With the divide-and-conquer strategy (see, e.g., Yu et al. ^[9] and Dong et al. ^[10]), the decomposition–ensemble learning methods are an important branch of ensemble learning paradigms. Because making individual predictions for all decomposed components takes a lot of time, the number of decomposed components needs to be reduced. Yu et al. ^[11] first proposed a decomposition–ensemble model with a reconstruction step that considered some data characteristics. Recently, Yu & Ma ^[12] introduced a memory-trait-driven reconstruction method into the decomposition and ensemble framework. Inspired by their work, a new model based on decomposition–ensemble learning with a reconstruction step that considers the data complexity traits is used to explore the price predictions of crude oil. In this model, all steps of mode reconstruction, component prediction, and ensemble prediction are driven by complexity traits.
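To make the idea of a complexity-driven reconstruction step concrete, the following is a minimal sketch rather than the procedure used in the paper. It assumes permutation entropy as the complexity measure and a fixed threshold of 0.5 for splitting components into a high-complexity group and a low-complexity group. Both choices, and the function names `permutation_entropy` and `group_by_complexity`, are hypothetical stand-ins; the source does not state which measure or grouping rule is used.

```python
import math
import random
from collections import Counter


def permutation_entropy(series, order=3):
    """Normalised permutation entropy in [0, 1]; higher means more complex."""
    # Count the ordinal pattern (rank order) of every window of length `order`.
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(len(series) - order + 1)
    )
    total = sum(patterns.values())
    entropy = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return entropy / math.log(math.factorial(order))


def group_by_complexity(components, threshold=0.5):
    """Add up components that fall on the same side of `threshold` (assumed rule)."""
    length = len(components[0])
    high, low = [0.0] * length, [0.0] * length
    for comp in components:
        target = high if permutation_entropy(comp) > threshold else low
        for t, value in enumerate(comp):
            target[t] += value
    return high, low


# Synthetic components (not real oil data): a smooth trend and a noisy mode.
rng = random.Random(0)
trend = [0.1 * t for t in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
high, low = group_by_complexity([trend, noise])
```

Grouping in this way reduces the number of series that must be forecast individually, which is the motivation for the reconstruction step given above.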

2. Forecasting by Statistical Models

Statistical models, which are also known as random time series models, include exponential smoothing (ES) (see, e.g., Kourentzes et al. ^[13]), the auto-regressive integrated moving average (ARIMA) model (see, e.g., Guo ^[14]), the generalized auto-regressive conditional heteroskedasticity (GARCH) model (see, e.g., Zhang et al. ^[15]), the hidden Markov model (HMM) (see, e.g., Isah & Bon ^[16]), and vector auto-regression (VAR) (see, e.g., Mirmirani & Li ^[17]). For example, Zolfaghari & Gholami ^[18] showed that ARIMA models performed well in forecasting international crude oil prices. To model the mean and variance of the log returns of crude oil prices, Zhu et al. ^[19] introduced a hidden Markov model to capture the effect of random events and subjective factors on time series fluctuations. Using a VAR model, Drachal ^[20] applied the global economic policy uncertainty index, production, volatility index, and crude oil volatility to predict crude oil prices. Despite their simplicity and ease of implementation, these statistical models cannot directly process time series with nonlinear characteristics due to their linear correlation structure. More generally, conventional statistical and econometric models are constrained by stringent theoretical assumptions, including linearity, stationarity, and dependence on specific distributional properties. As a result, these methods may encounter limitations in accurately forecasting crude oil price time series that are non-stationary, nonlinear, and characterized by complex dynamics. Meanwhile, as soft computing technology has advanced, many different intelligent algorithms have been developed and widely used in various data predictions.
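As a small illustration of the simplest model in this family, simple exponential smoothing updates a level as a weighted average of the newest observation and the previous level. The sketch below is only an illustration and not the implementation used in the cited studies; the function name, the smoothing weight `alpha`, and the price values are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing `series`."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    for value in series[1:]:
        # New level = weighted average of the new value and the old level.
        level = alpha * value + (1.0 - alpha) * level
    return level


prices = [70.0, 72.0, 71.0, 75.0]  # made-up numbers, not real WTI data
forecast = simple_exponential_smoothing(prices, alpha=0.5)
print(forecast)  # 73.0
```

Because the update is linear in past observations, a model of this kind cannot react to regime changes or nonlinear dynamics, which is the limitation discussed above.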

3. Forecasting by Artificial Intelligence and Machine Learning Methods

A crucial presumption in the application of econometric models is that the time series data under study are a linear process. However, crude oil prices do not satisfy this requirement, which can result in less accurate forecasting outcomes. In contrast, various nonlinear intelligence and machine learning methods (e.g., the support vector machine (SVM) proposed by Yu et al. ^[21] and the extreme learning machine (ELM) proposed by Wang et al. ^[22]) have emerged to satisfy the requirements, and they can be applied to time series prediction tasks. Moreover, deep learning is gaining popularity in machine learning, since conventional machine learning techniques employ shallow structures. Recently, artificial neural networks (ANNs) (see, e.g., Jammazi & Aloui ^[23]), back-propagation neural networks (BPNNs) (see, e.g., Khashei & Bijari ^[6]), long short-term memory (LSTM) networks (see, e.g., Urolagin et al. ^[24]), and convolutional neural networks (CNNs) (see, e.g., Li et al. ^[25]) have been used to model time series with nonlinear characteristics and have achieved high prediction precision. For example, Wang & Wang ^[26] created a crude oil price forecasting model that utilized a random Elman recurrent neural network, and the predictive power of the model was analyzed in comparison to other models. Yu et al. ^[27] incorporated the extended extreme learning machine (EELM) into an ensemble model formulation to forecast crude oil prices, and the findings showed that the proposed ensemble learning paradigm statistically outperformed all investigated benchmark models. However, these models have some drawbacks, including local minima, overfitting, and the need for a large sample size. While it has been demonstrated that ensemble models can outperform individual models, they are still susceptible to issues such as overfitting and being trapped in local extrema, which can limit their ability to generalize effectively.

4. Forecasting by Hybrid Models

To overcome the limitations of the aforementioned techniques, hybrid models have been proposed. It is not uncommon for researchers to employ a combination of econometric models and artificial intelligence algorithms, or even a combination of different econometric models or different artificial intelligence algorithms. For example, Cheng et al. ^[28] predicted crude oil prices in 2018 using the vector error correction and nonlinear auto-regressive neural network (VEC-NAR) model. To enhance technical indicator-based crude oil price forecasting, He et al. ^[29] implemented a hybrid forecast approach using scaled principal component analysis (s-PCA). In-sample and out-of-sample performance comparisons revealed that the s-PCA model was superior to the compared models. Wang & Fang ^[30] developed a novel combination of the FNN model and a stochastic time effective function for crude oil price forecasting, i.e., the WT-FNN model, and the findings revealed that the WT-FNN model had the best predictive performance. Zhang et al. ^[15] offered a novel hybrid technique to predict crude oil prices based on the least square support vector machine, particle swarm optimization, and the GARCH model. The experimental findings demonstrated that this approach could accurately estimate crude oil prices. To predict crude oil prices accurately, Wang et al. ^[31] employed a Markov switching mechanism in the GARCH-MIDAS model to capture both short-term and long-term state transitions, and they found that short-term predictions were more accurate. Like the hybrid approach, the proposed decomposition–ensemble method also takes into account the shortcomings of single models. The biggest difference is that ensemble learning employs several identical individual methods for ensemble prediction.
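One common way to combine a linear and a nonlinear model is to let the linear model produce a first forecast and then let a nonlinear model forecast what the linear model missed. The sketch below illustrates this residual-correction scheme under stated assumptions: a least-squares AR(1) stands in for the econometric model and a k-nearest-neighbour average stands in for the AI model. Neither is taken from the cited papers, and all function names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares AR(1) with intercept; returns (intercept, slope)."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0.0:
        slope = 0.0  # constant input: fall back to predicting the mean
    else:
        slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    return mean_y - slope * mean_x, slope


def knn_residual_forecast(residuals, k=3):
    """Nonlinear step: average the successors of the k residuals closest to the latest one."""
    last = residuals[-1]
    history = list(zip(residuals[:-1], residuals[1:]))  # (value, next value) pairs
    nearest = sorted(history, key=lambda pair: abs(pair[0] - last))[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)


def hybrid_forecast(series, k=3):
    """One-step forecast = linear forecast + nonlinear forecast of the residual."""
    intercept, slope = fit_ar1(series)
    # In-sample one-step-ahead errors of the linear model.
    residuals = [
        series[t + 1] - (intercept + slope * series[t])
        for t in range(len(series) - 1)
    ]
    linear_part = intercept + slope * series[-1]
    return linear_part + knn_residual_forecast(residuals, k)


# Synthetic series that follows x[t+1] = 0.5 * x[t] + 1 exactly.
series = [0.0]
for _ in range(19):
    series.append(0.5 * series[-1] + 1.0)
print(hybrid_forecast(series))
```

On a series that the linear model already explains, the residuals are close to zero and the correction term vanishes; the nonlinear step only matters when structure remains in the residuals.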

5. Forecasting by the Decomposition–Ensemble Learning Method

Recent studies have established a novel ensemble forecasting approach, called the decomposition ensemble, to manage the challenge of forecasting nonlinear time series data. Similar to the hybrid method, this approach considers the limitations of single models. Ensemble learning employs multiple identical single techniques for ensemble prediction, whereas the hybrid model employs multiple distinct single models for combination prediction. Several notable studies have applied this approach to oil price prediction. For example, Li et al. ^[25] and Li et al. ^[32] decomposed the monthly crude oil futures price data into multiple modes using variational mode decomposition (VMD). They then forecast each mode using an SVM and a BPNN, each optimized by a genetic algorithm. Using the Akaike information criterion (AIC) to determine a reasonable lag, Ding ^[33] proposed a decomposition ensemble model using ensemble empirical mode decomposition (EEMD) for crude oil forecasting. Yu et al. ^[9] used empirical mode decomposition (EMD) to decompose crude oil prices and the feedforward neural network (FNN) to forecast the components. Zheng et al. ^[34] recently proposed a method combining an empirical mode decomposition algorithm, quadratic surface support vector regression, and the autoregressive integrated moving average method for stock index and futures price forecasting. The study obtained better forecasting results than the direct forecasting model. However, the existing literature on constructing the decomposition–ensemble framework has some limitations. It primarily focuses on selecting decomposition–reconstruction–prediction–ensemble methods based on the characteristics of the model, rather than taking into account the characteristics of the data themselves. Therefore, the method proposed in this paper can select appropriate decomposition methods, reconstruction methods, prediction methods, and ensemble methods based on the specific traits of the data.
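The decompose, predict, and ensemble steps described in this section can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the CEEMDAN-based model of the paper: a trailing moving average splits the series into a smooth part and a detail part (a stand-in for EMD, VMD, or CEEMDAN), a least-squares AR(1) forecasts each part (a stand-in for SVM, BPNN, or similar predictors), and the ensemble is a plain sum. All function names and the `window` parameter are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average; early points use the samples available so far."""
    out = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1): t + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def decompose(series, window=5):
    """Split a series into a smooth and a detail component that add back up to it."""
    smooth = moving_average(series, window)
    detail = [x - s for x, s in zip(series, smooth)]
    return [smooth, detail]


def ar1_forecast(series):
    """One-step forecast from an AR(1) with intercept fitted by least squares."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0.0:  # constant component: persist the last value
        return series[-1]
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    intercept = mean_y - slope * mean_x
    return intercept + slope * series[-1]


def decomposition_ensemble_forecast(series, window=5):
    """Forecast each component separately, then add the forecasts (additive ensemble)."""
    components = decompose(series, window)
    return sum(ar1_forecast(c) for c in components)


# Made-up price path, not real Brent or WTI data.
prices = [60.0, 61.5, 61.0, 62.4, 63.1, 62.8, 64.0, 64.6, 64.2, 65.3]
print(decomposition_ensemble_forecast(prices))
```

In a data-driven variant of this pipeline, the choice of decomposition, predictor, and ensemble rule for each component would depend on measured traits of that component rather than being fixed in advance as it is here.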

References

Buyuksahin, U.C.; Ertekin, S. Improving forecasting accuracy of time series data using a new ARIMA-ANN hybrid method and empirical mode decomposition. Neurocomputing 2019, 361, 151–163.
Xu, W.; Peng, H.; Zeng, X.; Zhou, F.; Tian, X.; Peng, X. Deep belief network-based AR model for nonlinear time series forecasting. Appl. Soft Comput. 2019, 77, 605–621.
Kantz, H.; Schreiber, T. Nonlinear Time Series Analysis; Cambridge University Press: Cambridge, UK, 1997.
Zhang, G.; Patuwo, B.E.; Hu, M.Y. Forecasting with artificial neural networks: The state of the art. Int. J. Forecast. 1998, 14, 35–62.
Elman, J.L. Finding structure in time. Cogn. Sci. 1990, 14, 179–211.
Khashei, M.; Bijari, M. A novel hybridization of artificial neural networks and ARIMA models for time series forecasting. Appl. Soft Comput. 2011, 11, 2664–2675.
Taskaya, T.; Casey, M.C. A comparative study of autoregressive neural network hybrids. Neural Netw. 2005, 18, 781–789.
Wang, J.J.; Wang, J.Z.; Zhang, Z.G.; Guo, S.P. Stock index forecasting based on a hybrid model. Omega 2012, 40, 758–766.
Yu, L.A.; Wang, S.Y.; Lai, K.K. Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm. Energy Econ. 2008, 30, 2623–2635.
Dong, J.; Dai, W.; Tang, L.; Yu, L. Why do EMD-based methods improve prediction? A multiscale complexity perspective. J. Forecast. 2019, 38, 714–731.
Yu, L.; Wang, Z.; Tang, L. A decomposition–ensemble model with data-characteristic-driven reconstruction for crude oil price forecasting. Appl. Energy 2015, 156, 251–267.
Yu, L.; Ma, M. A memory-trait-driven decomposition–reconstruction–ensemble learning paradigm for oil price forecasting. Appl. Soft Comput. 2021, 111, 107699.
Kourentzes, N.; Barrow, D.; Petropoulos, F. Another look at forecast selection and combination: Evidence from forecast pooling. Int. J. Prod. Econ. 2019, 209, 226–235.
Guo, J. Oil price forecast using deep learning and ARIMA. In Proceedings of the 2019 International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 8–10 November 2019; pp. 241–247.
Zhang, J.L.; Zhang, Y.J.; Zhang, L. A novel hybrid method for crude oil price forecasting. Energy Econ. 2015, 49, 649–659.
Isah, N.; Bon, A.T. Application of Markov model in crude oil price forecasting. Traektoria Nauk. 2017, 3, 1007–1012.
Mirmirani, S.; Li, H. A Comparison of VAR and Neural Networks with Genetic Algorithm in Forecasting Price of Oil. In Applications of Artificial Intelligence in Finance and Economics; Emerald Group Publishing Limited: Bingley, UK, 2004; pp. 203–223.
Zolfaghari, M.; Gholami, S. A hybrid approach of adaptive wavelet transform, long short-term memory and ARIMA-GARCH family models for the stock index prediction. Expert Syst. Appl. 2021, 182, 115149.
Zhu, D.M.; Ching, W.K.; Elliott, R.J.; Siu, T.K.; Zhang, L.M. Hidden Markov models with threshold effects and their applications to oil price forecasting. J. Ind. Manag. Optim. 2017, 13, 757–773.
Drachal, K. Forecasting crude oil real prices with averaging time-varying VAR models. Resour. Policy 2021, 74, 102244.
Yu, L.; Zhang, X.; Wang, S. Assessing potentiality of support vector machine method in crude oil price forecasting. EURASIA J. Math. Sci. Technol. Educ. 2017, 13, 7893–7904.
Wang, J.; George, A.; Hyndman, R.J.; Wang, S. Crude oil price forecasting based on internet concern using an extreme learning machine. Int. J. Forecast. 2018, 34, 665–677.
Jammazi, R.; Aloui, C. Crude oil price forecasting: Experimental evidence from wavelet decomposition and neural network modeling. Energy Econ. 2012, 34, 828–841.
Urolagin, S.; Sharma, N.; Datta, T.K. A combined architecture of multivariate LSTM with Mahalanobis and Z-Score transformations for oil price forecasting. Energy 2021, 231, 120963–120975.
Li, J.; Zhu, S.; Wu, Q. Monthly crude oil spot price forecasting using variational mode decomposition. Energy Econ. 2019, 83, 240–253.
Wang, J.; Wang, J. Forecasting energy market indices with recurrent neural networks: Case study of crude oil price fluctuations. Energy 2016, 102, 365–374.
Yu, L.A.; Dai, W.; Tang, L. A novel decomposition ensemble model with extended extreme learning machine for crude oil price forecasting. Eng. Appl. Artif. Intell. 2016, 47, 110–121.
Cheng, F.Z.; Li, T.; Wei, Y.M.; Fan, T.J. The VEC-NAR model for short-term forecasting of oil prices. Energy Econ. 2019, 78, 656–667.
He, M.X.; Zhang, Y.J.; Wen, Y.D. Forecasting crude oil prices: A scaled PCA approach. Energy Econ. 2021, 97, 656–667.
Wang, D.H.; Fang, T.H. Forecasting crude oil prices with a WT-FNN model. Energies 2022, 15, 1955.
Wang, L.; Wu, J.B.; Cao, Y.; Hong, Y.R. Forecasting renewable energy stock volatility using short and long-term Markov switching GARCH-MIDAS models: Either, neither or both? Energy Econ. 2022, 111, 106056.
Li, X.; Shang, W.; Wang, S. Text-based crude oil price forecasting: A deep learning approach. Int. J. Forecast. 2019, 35, 1548–1560.
Ding, Y.S. A novel decompose-ensemble methodology with AIC-ANN approach for crude oil forecasting. Energy 2018, 154, 328–336.
Zheng, J.L.; Tian, Y.; Luo, J.; Hong, T. A novel hybrid method based on kernel-free support vector regression for stock indices and price forecasting. J. Oper. Res. Soc. 2022, 74, 690–702.