Forecasting a Stock Trend Using Genetic Algorithm


Contributors: Rebecca Abraham, Mahmoud El Samad, Amer Bakhach, Hani El-Chaarani, Sam El Nemar, Dalia Jaber

The model significantly outperforms the dummy forecast, reaching an accuracy of up to 80% in some cases. The results also showed that the S&P 500 (a US stock index) is the most useful stock index for prediction across all sectors. This could be because the 15 stocks considered were selected from the NYSE and NASDAQ. In contrast, the CAC40 (the French stock index) seems to have the lowest impact on two out of the three sectors. Moreover, the past historical data of the stock itself helps significantly in predicting its trend. However, the results also show that the accuracy of the model decreases considerably when predicting large price changes.

computational or mathematical finance
stock trend prediction
random forest

1. Introduction

Stock market movements are influenced by many exogenous variables, such as political and geopolitical events, exchange rates, movements of other stock markets, the economic environment, firm policies, and the psychology of investors (Gidofalvi 2001; Nobel Prize Committee 2013; El-Chaarani 2016; El-Chaarani 2019).

According to the efficient market theory (Fama 1965; Fama 1970; Fama 1998), prices in efficient stock markets reflect all relevant information. In the weak form of market efficiency, stock prices reflect all information contained in past prices. In the semi-strong form, stock prices reflect all publicly available information.

Dynamic markets with nonlinear stock price movements, along with the multiplicity of predictors, make forecasting a stock’s trend challenging. An efficient forecasting solution in the stock market can play a crucial role in motivating people toward stock trading (Sharma et al. 2017).

Artificial Neural Networks (ANN) and the Support Vector Machine (SVM) are the most commonly used machine learning algorithms for forecasting stocks (Guresen et al. 2011; Hoseinzade 2019; Kara et al. 2011; Wang and Wang 2015). Many Artificial Neural Networks predict the next day’s closing price (Li and Liao 2017). Certain Artificial Neural Network-based models have been enriched with other algorithms to boost the accuracy of the forecast (Nair et al. 2011). Likewise, Support Vector Machine models have been developed to forecast stock trends (Reddy and Sai 2018). Some proposals combined the Support Vector Machine or Artificial Neural Networks with preprocessing techniques, followed by meta-heuristic algorithms that search for the optimal machine learning parameters, the Artificial Neural Network architecture, and the set of input features (Asadi et al. 2012; Sedighi et al. 2019; Zhang and Wu 2009). Recently, deep learning techniques such as the Convolutional Neural Network (CNN), Deep Multilayer Perceptron (MLP), Deep Belief Network (DBN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Generative Adversarial Network (GAN) have proved to be applicable to stock studies (Aloud 2020; Sang and DiPierro 2019; Selvin et al. 2017; Zhanga et al. 2019).

Feature selection is a key factor in the knowledge discovery process. Its importance derives from its role in improving the accuracy and efficiency of the prediction model by selecting the relevant variables and reducing the dimensionality of the datasets (Mao et al. 2016; Sugunnasil and Somhom 2010). Yet feature selection models suffer from the drawback of using isolated features to forecast trends. There is a gap in the literature for stock trend models that not only use feature selection but also include international stock indices to forecast trends, as such models capture the essence of worldwide market movements rather than isolated features.
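To make the idea of meta-heuristic feature selection concrete, the following is a minimal sketch of a genetic algorithm searching over 0/1 feature masks. It is not the implementation of any cited study: the function name `genetic_feature_selection`, the operators chosen (tournament selection, one-point crossover, bit-flip mutation, elitism), and all parameter values are assumptions for illustration. The `toy_fitness` function is a hypothetical stand-in for what would normally be the validation accuracy of a classifier trained on the selected features.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask).

    Assumes n_features >= 2 so that a one-point crossover cut exists.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        def tournament():
            # Pick two distinct individuals at random and keep the fitter one.
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b

        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                  # bit-flip mutation
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best

# Hypothetical fitness: pretend features 0, 3 and 5 are the informative ones
# and charge a small penalty for every selected feature.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    useful = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return useful - 0.1 * sum(mask)

mask = genetic_feature_selection(8, toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
```

In a realistic setting each feature would be, for example, a lagged return of the stock or of an international index, and the fitness evaluation would dominate the running time, so caching fitness values per mask is a common design choice.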

2. History and Development

Artificial Neural Network prediction models have been proposed in many research works, such as Chandan et al. (2016). However, the noisy behavior of stock markets poses an obstacle for Artificial Neural Networks, leading to convergence to suboptimal solutions (Hoseinzade 2019). To solve this problem, Kara et al. (2011) suggested that a Support Vector Machine preprocessing model can help eliminate irrelevant features. With respect to predicting stock trends, Reddy and Sai (2018) proposed a Support Vector Machine and Radial Basis Function approach to forecast stock prices for both large and small capitalizations. Their predictor is trained on the available historical data to predict the next day’s data. While the numerical results obtained showed high efficiency of the algorithm, its drawback is that the solution assumes four fixed features without any specific engineering or optimization. The experimentation also relies on online data without addressing its quality.

Hoseinzade (2019) suggested a model, called CNNpred, that can be applied to data collected from different sources and different markets. The approach uses feature extraction to predict the next day’s direction of movement for specific indices, including the S&P 500, NASDAQ, DJI, NYSE, and the RUSSELL 3000. Similarly, Chen (2018) used a conv1D function to process 1D data in the convolutional layer. To improve the results, the proposed model used preprocessed stock data as input. The goal of the work was to forecast stock prices in the Chinese stock market. The proposed model was limited to four features: open, close, high, and low prices. The chief limitation was that the validation relied on a limited dataset.
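As an illustration of what a one-dimensional convolution does to a price series, the sketch below slides a small kernel along a list of closing prices. It is not code from Chen (2018) or CNNpred: the helper `conv1d`, the prices, and the hand-picked kernel are hypothetical, and in a trained network the kernel weights would be learned rather than fixed.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D sliding dot product (cross-correlation, no kernel flip),
    which is the operation deep learning convolutional layers compute."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

closes = [10.0, 11.0, 13.0, 12.0, 15.0]  # hypothetical closing prices
diff_kernel = [-1.0, 1.0]                # responds to day-over-day change

features = conv1d(closes, diff_kernel)   # [1.0, 2.0, -1.0, 3.0]
```

With this particular kernel the output is simply the day-over-day price change, which shows how a convolutional filter can act as a local feature extractor over time.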

Karathanasopoulos et al. (2019) proposed several optimization techniques to find the optimal neural network hierarchy to forecast 12 Exchange Traded Funds (ETFs). They considered three optimization approaches, namely the genetic algorithm, differential evolution, and the particle swarm optimizer. They also considered three types of network: multilayer perceptrons, recurrent neural networks, and radial basis function neural networks. Their results showed that differential evolution was the optimal method, with the highest forecasting accuracy.

The study most analogous to the present one is that of Jiao and Jakubowicz (2017), which predicted the daily direction of stock price movement. The authors treated the prediction of stock movement as a binary classification problem: when the daily return was positive (greater than zero), the price direction was classified as an uptrend, and when the daily return was negative (less than zero), it was classified as a downtrend. They studied 463 stocks, all constituents of the S&P 500, and 8 international indices. The international indices included three Asian indices (Nikkei 225, Hang Seng, and All Ords), two European indices (DAX, FTSE 100), and three US indices (NYSE Composite, Dow Jones Industrial Average, and S&P 500). They used a lag operator to extract more features from the stock indices, in addition to more than 200 technical indicators. Then, they employed genetic algorithm-based feature selection and used the selected features as input to a classifier. The authors compared the prediction performance of four classification algorithms: Random Forest, Gradient Boosted Trees, Artificial Neural Networks, and logistic regression.
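The labeling rule and the lag operator described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper names, the price series, and the choice of two lags are hypothetical, and a zero return (which the description does not cover) is arbitrarily assigned to the downtrend class here.

```python
def daily_returns(prices):
    """Simple return between consecutive closing prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def direction_labels(returns):
    """1 for a positive daily return (uptrend), 0 otherwise (downtrend).
    Treating a zero return as 0 is an assumption of this sketch."""
    return [1 if r > 0 else 0 for r in returns]

def lagged_rows(series, n_lags):
    """Row for day t holds series[t-1], ..., series[t-n_lags]."""
    return [[series[t - k] for k in range(1, n_lags + 1)]
            for t in range(n_lags, len(series))]

# Hypothetical closing prices for one stock and one international index.
stock = [100.0, 102.0, 101.0, 103.0, 104.0, 102.0]
index = [50.0, 51.0, 51.5, 51.0, 52.0, 52.5]

stock_ret = daily_returns(stock)
index_ret = daily_returns(index)

n_lags = 2
# Each feature row combines lagged stock returns with lagged index returns.
X = [s + i for s, i in zip(lagged_rows(stock_ret, n_lags),
                           lagged_rows(index_ret, n_lags))]
# The target is the direction of the stock on the day being predicted.
y = direction_labels(stock_ret)[n_lags:]   # [1, 1, 0]
```

Because every row uses only returns from earlier days, the features never include information from the day being predicted.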

Sable et al. (2017) provided a short-term prediction model that uses a genetic algorithm and evolutionary strategies to predict the price of eight scripts, with six attributes for each script (Opening Price, Closing Price, Highest Price, Lowest Price, Volume, and Adjusted Closing Price). The eight scripts reflect US-based companies.

Shen and Shafiq (2020) proposed a prediction model for the stock market price trend. The proposal relies on customized feature engineering combined with deep learning. Its pillars are various feature engineering techniques together with a fine-tuned system, rather than a deep learning model alone. The assessment relies only on the Chinese stock market.

Wanjawa and Muchemi (2014) showed that an Artificial Neural Network (ANN) with the configuration 5:21:21:1 can achieve very good prediction accuracy. The assessment used 1000 records, 80% of which were used for training, over 130 K training cycles. The network was used to predict three stocks on the New York Stock Exchange (NYSE), with Encog and Neuroph used for validation, and the prediction error ranged from 0.71% to 2.77%.
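To give a sense of scale for this setup, the short sketch below counts the trainable parameters of a fully connected network and the size of the training split. It assumes that 5:21:21:1 denotes layer sizes (5 inputs, two hidden layers of 21 neurons, and 1 output) with one bias per neuron; that reading and the helper `mlp_parameter_count` are assumptions of this sketch, not details confirmed by the text.

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

layers = [5, 21, 21, 1]                 # assumed meaning of 5:21:21:1
params = mlp_parameter_count(layers)    # 126 + 462 + 22 = 610

total_records = 1000
train_records = total_records * 80 // 100        # 800 records for training
held_out_records = total_records - train_records # 200 records held out
```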

Soni et al. (2022), Rouf et al. (2021), Rahul et al. (2020), and Tawarish and Satyanarayana (2019) provide surveys of the machine learning techniques used for stock market prediction. Almost all of these surveys show that a large percentage of proposals use the Support Vector Machine (SVM), fewer use the Genetic Algorithm, and fewer use the Random Forest.

This entry is adapted from the peer-reviewed paper 10.3390/jrfm15050188

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.