Time Series Prediction for Future Stock Markets

Please note this is a comparison between Version 1 by Mahboubeh Shadabfar and Version 2 by Peter Tang.

Stock prediction has attracted a great deal of attention from investors, leading researchers to propose a variety of models. Time-series-based linear models, also called statistical models, include the auto-regressive integrated moving average (ARIMA), the exponential smoothing model (ESM), and generalized auto-regressive conditional heteroskedasticity (GARCH). Stock returns can be difficult to predict because the data are nonstationary or nonlinear in nature, and linear models have trouble capturing such patterns.

portfolio optimization
Shanghai stock market
time series prediction
Monte Carlo sampling method
Exceedance probability
Mean semi-variance model

1. Exponential Smoothing Model (ESM)

The exponential smoothing model (ESM) is a widely used smoothing technique for time series data, in which smoothing is performed by an exponential window function [1]. De Faria et al. [2] compared the ability of an artificial neural network (ANN) and an adaptive ESM to predict the Brazilian stock market index. Their results demonstrated that ESM could predict outcomes and that both methods performed similarly, although the neural network achieved a slightly better root mean square error (RMSE) than the adaptive ESM.
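
As a rough illustration of the smoothing recursion and the RMSE measure mentioned above, the following minimal sketch implements simple (not adaptive) exponential smoothing. The function names, the price values, and the choice of `alpha = 0.5` are assumptions made for this example and do not come from the cited studies.

```python
import math


def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * x_t + (1 - alpha) * level_{t-1}
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumption: initialize the level with the first value
    smoothed = [level]
    for x in series[1:]:
        level = alpha * x + (1.0 - alpha) * level
        smoothed.append(level)
    return smoothed


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up prices, for illustration only.
prices = [10.0, 12.0, 11.0, 13.0, 12.5]
levels = simple_exponential_smoothing(prices, alpha=0.5)

# The one-step-ahead forecast for time t is the level at time t - 1.
forecasts = levels[:-1]
error = rmse(prices[1:], forecasts)
```

A smaller `alpha` weights older observations more heavily and gives a smoother series; a value close to 1 tracks the latest observation.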

Dutta et al. [3] took the novel approach of performing logistic regression with financial ratios as independent variables. After fitting the regression, they analyzed how these ratios relate to stock performance and then attempted to classify companies as excellent or poor using one-year performance data. Devi et al. [4] suggest that the stock analysis literature has failed to properly address a variety of issues, including high dimensionality and the behavior of naive investors. They trained an ARIMA model on historical data for four mid-capitalization companies in India and then used the Akaike information criterion and Bayesian information criterion (AIC/BIC) test to assess the accuracy of this model. They ultimately concluded that the low error and volatility of the Nifty index make it the best choice for naive investors.

Wang et al. [5] proposed a hybrid approach that seeks the most advantageous combination of three time-series forecasting models: the exponential smoothing model (ESM), the auto-regressive integrated moving average (ARIMA) model, and the backpropagation neural network (BPNN).

2. Autoregressive Family Models

With the well-known autoregressive moving average (ARMA) model, Box and Jenkins ^[6][7] made an essential contribution to prediction theory. Their work was greatly influenced by that of Yule and Wold ^[7][8]. ARMA models predict the movements of a time series from the lagged relationships between past and future observations ^[8][9]. According to Shumway and Stoffer ^[9][10], time series data can be affected by the past observations of the independent variables.
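
For reference, the textbook form of an ARMA(p, q) model can be written as below. The notation is the conventional one and is an assumption of this note rather than something taken from the cited works: X_t is the observed series, c a constant, phi_i the autoregressive coefficients, theta_j the moving-average coefficients, and epsilon_t a white-noise error term.

```latex
X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

An ARIMA(p, d, q) model applies the same structure to the series after it has been differenced d times.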

Ariyo et al. ^[10][11] first discussed how ARIMA models are developed and then used a variety of accuracy measures, including the standard error of the regression, adjusted R-square, and the Bayesian information criterion, to determine the best ARIMA model for predicting Nokia and Zenith Bank stock prices. Bhuriya et al. ^[11][12] used multiple types of regression models, including linear, polynomial, and radial basis function (RBF) regression models, to predict the stock price of Tata Consultancy Services from open-high-low-close and volume data, and then compared the models in terms of the confidence values of their predictions.
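
The idea of choosing among candidate models with an information criterion can be sketched as follows. This is a simplified stand-in, not the procedure of the cited studies: it fits pure AR(p) models by ordinary least squares rather than full ARIMA models, uses a simulated AR(1) series instead of real prices, and the function name and the Gaussian form of the BIC are assumptions of this example.

```python
import numpy as np


def fit_ar_least_squares(x, p, max_p):
    """Fit an AR(p) model with intercept by OLS and return (coef, bic).

    The first max_p observations are dropped for every p, so that all
    candidate orders are compared on the same sample.
    """
    n = len(x)
    y = x[max_p:]
    # Column k holds the lag-k values aligned with y.
    lags = [x[max_p - k: n - k] for k in range(1, p + 1)]
    design = np.column_stack([np.ones(n - max_p)] + lags)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    rss = float(resid @ resid)
    m = len(y)
    # Gaussian BIC up to an additive constant: m*ln(RSS/m) + k*ln(m).
    bic = m * np.log(rss / m) + (p + 1) * np.log(m)
    return coef, bic


# Simulated AR(1) series with coefficient 0.7 (illustrative data only).
rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.7 * x[t - 1] + noise[t]

MAX_P = 4
results = {p: fit_ar_least_squares(x, p, MAX_P) for p in range(1, MAX_P + 1)}
best_p = min(results, key=lambda p: results[p][1])  # lowest BIC wins
```

The penalty term `(p + 1) * ln(m)` grows with the number of parameters, so a higher order is selected only if it reduces the residual sum of squares enough to offset it.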

3. Generalized Autoregressive Conditional Heteroscedasticity (GARCH) Family Model

According to Liu and Morley, using traditional econometric models to forecast financial time series can lead to significant errors, since the assumption of homoscedasticity appears inappropriate for series characterized by sharp peaks, fat tails, and volatility clustering. Considerable research has therefore been devoted to improving the accuracy of econometric models ^[12][13].

Engle ^[13][14] introduced the auto-regressive conditional heteroscedasticity (ARCH) model to address this issue. Unlike conventional methods, the ARCH model rejects linear risk–return relationships ^[14][15]. By allowing the variance to change over time, it constructs a function of past volatility that accounts for financial data with “sharp peaks” and “fat tails” ^[15][16]. Researchers discovered, however, that ARCH needed a large order q to predict conditional heteroscedasticity effectively ^[16][17]. Accordingly, Bollerslev ^[17][18] proposed the generalized auto-regressive conditional heteroskedasticity (GARCH) model, which adds lagged conditional variances to the variance equation. The GARCH model is an efficient way to capture the volatility of economic data, and in recent years its application in the financial sector has expanded because of its performance in detecting frequent fluctuations in financial data volatility ^[18][19]. Nelson ^[19][20] developed a group of diffusion approximations on the basis of the exponential ARCH model.
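
To make the variance recursion concrete, here is a minimal sketch of the GARCH(1,1) conditional variance in its standard textbook form. The parameter values, the return series, and the choice to start the recursion at the unconditional variance are assumptions of this example; in practice the parameters would be estimated, typically by maximum likelihood.

```python
import numpy as np


def garch11_conditional_variance(returns, omega, alpha, beta):
    """Compute sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(returns))
    # Assumption: start from the unconditional variance omega / (1 - alpha - beta).
    sigma2[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2


# Made-up daily returns, for illustration only.
demo_returns = [0.0, 0.02, -0.05, 0.01]
sigma2 = garch11_conditional_variance(demo_returns, omega=1e-5, alpha=0.1, beta=0.8)
```

Setting `beta = 0` reduces the recursion to ARCH(1). The `beta` term lets a single lagged variance summarize the long history of squared returns that a pure ARCH model would need a large order q to capture.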

Yuling Wang et al. ^[20][21] conducted an empirical study of Shanghai Composite Index and Shenzhen Component Index returns using GARCH-type models. According to Li and Mak ^[21][22], residual autocorrelation can be a reliable tool for testing models with conditional heteroscedasticity in non-linear time series.

4. Other Models

In addition to the methods reviewed above, hybrid methods that combine them are also popular among researchers. Chang et al. ^[22][132] used stepwise regression analysis to identify the variables that significantly affect the trend of the stock market index. Three models were then constructed from the identified variables: a multiple regression analysis model, a backpropagation neural network, and an ARIMA model.

Khashei and Hajirahimi ^[23][133] developed a hybrid model for predicting stock prices, using ARIMA and multilayer perceptrons (MLPs) to construct series hybrid models. Pan ^[24][134] proposed a method for handling multicollinearity and nonlinearity simultaneously by combining principal component regression (PCR) with a general regression neural network (GRNN). Using ARIMA-ANN, Rathnayaka et al. ^[25][135] attempted to understand the behavior patterns of Colombo Stock Exchange (CSE) price indices and to develop a new hybrid forecasting approach.

Multivariate time series analysis also plays an important role in stock market prediction. Kling and Bessler ^[26][136] applied different multivariate methods to out-of-sample data and compared the results with those of univariate methods. Liapis et al. ^[27][137] discussed sentiment analysis methods for data extracted from social networks and their application to multivariate financial prediction architectures. Ma and Liu ^[28][23] used the principles of nonlinear dynamical theory to characterize and predict stock return series of the Shanghai stock exchange. Phase space reconstruction was used in both the multivariate and the univariate nonlinear prediction methods, and the multivariate nonlinear models outperformed the univariate ones in this application.
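
Phase space reconstruction is commonly done by time-delay embedding, which turns a scalar series into vectors of lagged values. The sketch below shows the univariate case only; the function name, the embedding dimension, and the delay are assumptions chosen for illustration and are not the settings used in the cited study.

```python
import numpy as np


def delay_embed(series, dim, tau):
    """Return the delay-embedding matrix of a scalar series.

    Row i is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    """
    series = np.asarray(series, dtype=float)
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    columns = [series[k * tau: k * tau + n_vectors] for k in range(dim)]
    return np.column_stack(columns)


# Toy series, embedded in 3 dimensions with a delay of 2 steps.
embedded = delay_embed([1, 2, 3, 4, 5, 6], dim=3, tau=2)
# embedded has rows (1, 3, 5) and (2, 4, 6)
```

A multivariate reconstruction can be built in the same way by embedding each series separately and concatenating the resulting columns.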

Some studies have used statistical network theory to analyze the stock market. In one such study, Nobi et al. ^[29][24] examined how the 2008 global financial crisis affected the threshold networks of a local Korean financial market around the time of that crisis. Zhang and Zhuang ^[30][25] constructed a variety of such networks for the Chinese stock market and then analyzed empirically the topological features and stability of these networks and how they correlate with international stock market indexes in terms of the multifractal properties of financial time series, value at risk (VaR), and price fluctuation correlation. Tabak et al. ^[31][26] examined the topological properties of Brazilian stock market networks; using the correlation matrix for stocks in different sectors, they constructed a minimum spanning tree based on ultrametricity. Khoojine and Han ^[32][33][27,28] used statistical network theory to investigate the stock market anomaly of the Shanghai, Shenzhen, and S&P 500 markets during 2015–2016. Khoojine and Han ^[34][29] also analyzed the behavior of the log-return-based network of the Chinese stock market by developing a stock price network autoregressive model (SPNAR).
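
A correlation-based minimum spanning tree of the kind mentioned above can be sketched as follows. The example assumes the commonly used distance d_ij = sqrt(2 * (1 - rho_ij)) and a plain Prim's algorithm; the correlation values and the function name are made up for illustration, and the cited studies may differ in their details.

```python
import numpy as np


def correlation_mst(corr):
    """Build a minimum spanning tree from a correlation matrix.

    Correlations are mapped to distances d = sqrt(2 * (1 - rho)), and the
    tree is grown from node 0 with Prim's algorithm. Returns the edges as
    (i, j) pairs in the order they were added.
    """
    corr = np.asarray(corr, dtype=float)
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    n = len(dist)
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        # Find the shortest edge from the current tree to an outside node.
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or dist[i, j] < best[2]):
                    best = (i, j, dist[i, j])
        edges.append((best[0], best[1]))
        in_tree.append(best[1])
    return edges


# Made-up correlations between three stocks.
corr = [[1.0, 0.9, 0.2],
        [0.9, 1.0, 0.5],
        [0.2, 0.5, 1.0]]
edges = correlation_mst(corr)
```

Highly correlated stocks are close in this metric, so the tree links each stock to its most similar neighbors using only n - 1 edges.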