Machine Learning Methods for Stock Market Prediction

Version	Summary	Created by	Modification	Content Size	Created at
1		Ahmad Nikabadi	--	1589	2023-07-20 12:17:49
2	layout	Jessie Wu	Meta information modification	1589	2023-07-21 05:50:45

This entry is adapted from the peer-reviewed paper 10.3390/math11132950

Stock market prediction models are developed with different goals. The primary focus of stock market prediction has been on forecasting the price of a share for a specific future period. The price of a share is a numerical value, and its variation over time is often treated as a time series in various studies.

Keywords: stock market prediction; social network analysis; deep learning; user behavior networks

1. Introduction

Each stock market prediction model comprises two primary components: the prediction method, and the features used by the model. While some researchers aim to enhance prediction accuracy by employing more advanced techniques, others focus on obtaining informative feature sets from various information sources. There are studies that report advancements in both aspects.

2. Prediction Methods

Regarding the first category of related works that focus on improving the prediction method, there is a wide range of algorithms and tools available for stock market prediction, such as neural networks, deep learning, support vector machines (SVMs), and random forests.

Machine learning methods have predominantly been used for technical analysis in this field, and several studies have compared different types of algorithms. Ballings et al. ^[1] compared ensemble approaches such as random forests and AdaBoost with single classifier models such as neural networks and logistic regression, using data from 5767 businesses. Sharma and Juneja ^[2] proposed using LSboost to aggregate the predictions of an ensemble of trees in a random forest (referred to as LS-RF). Each prediction model takes a set of technical indicators as inputs. The performance of the suggested model was compared to that of the well-known support vector regression model. Picasso et al. ^[3] incorporated technical and fundamental analysis parameters to evaluate the performance of several machine learning methods, including SVM and random forest. Various other supervised approaches, such as support vector regression (SVR) ^[4], multiple linear regression (LMLR) ^[5], and the J48 algorithm ^[6], have been investigated within the field of stock market prediction.
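The technical indicators that such models take as inputs are usually simple functions of past prices. The following is a minimal sketch of how two common indicators could be turned into feature rows. The choice of indicators, the window lengths, the function names, and the prices are illustrative assumptions and are not taken from the cited studies.

```python
def simple_moving_average(prices, window):
    """Mean of the last `window` prices at each step; None until enough history exists."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def momentum(prices, lag):
    """Price change over `lag` steps; None until enough history exists."""
    return [None if i < lag else prices[i] - prices[i - lag] for i in range(len(prices))]


def indicator_rows(prices, window=3, lag=2):
    """Build one feature row [SMA, momentum] per day on which both indicators are defined."""
    sma = simple_moving_average(prices, window)
    mom = momentum(prices, lag)
    return [[s, m] for s, m in zip(sma, mom) if s is not None and m is not None]


closes = [10.0, 11.0, 12.0, 11.0, 13.0]  # made-up closing prices
features = indicator_rows(closes)
```

Each row of `features` could then be passed to any of the classifiers or regressors discussed above.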

According to several studies, artificial neural networks (ANNs) have emerged as popular tools for financial prediction ^[7]^[8]^[9]. Most of these studies have utilized historical market data as input features. Among ANNs, the multilayer perceptron (MLP) network is widely employed for stock forecasts. An MLP is a feed-forward network comprising an input layer, one or more hidden layers, and an output layer. The hidden layers apply non-linear activation functions, which give the network its non-linear learning capabilities. Previous studies ^[10]^[11]^[12] have proposed MLP networks for stock market prediction tasks. Deep ANNs are also extensively used in this field. To predict NASDAQ prices, Moghaddam et al. ^[13] tested ANNs with various structures using historical prices on four- and nine-day timeframes. Their findings indicated that deep ANNs outperformed shallow networks. Arévalo et al. ^[14] applied a deep ANN with five hidden layers to forecast Apple’s stock on the NASDAQ exchange, and they achieved approximately 65% directional accuracy. Chong et al. ^[15] explored different data representation methods, including auto-encoder, RBM, and PCA, using raw data with 380 variables. These representations were employed as input for a deep ANN in stock prediction tasks. The results indicated no significant superiority of one method over the others. Hoseinzade and Haratizadeh ^[16] employed combinations of historical prices, technical indicators, and macroeconomic data as features. They trained a CNN model on inputs arranged as 3D tensors. The proposed model was compared to a PCA+ANN technique, and the results showed that the CNN outperformed the other methods. Gao et al. ^[17] and Wang et al. ^[18] integrated attention layers with CNNs to forecast the following day’s index price based on past data. The findings suggested that the attention-based approach yielded the best results among the tested models.
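The layer structure of an MLP can be illustrated with a forward pass through a single hidden layer. This is only a sketch under stated assumptions: the weights are made up rather than trained, the tanh activation and the linear output are choices made for illustration, and training by backpropagation is omitted.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), computed row by row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one tanh hidden layer -> linear output layer."""
    hidden = dense(x, hidden_w, hidden_b, math.tanh)
    return dense(hidden, out_w, out_b, lambda z: z)


# Made-up weights for a network with 2 inputs, 2 hidden units, and 1 output.
hidden_w = [[0.5, -0.5], [0.0, 0.0]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 1.0]]
out_b = [0.1]

prediction = mlp_forward([1.0, 1.0], hidden_w, hidden_b, out_w, out_b)
```

A deeper network, as in the deep ANN studies above, would chain several `dense` calls with non-linear activations before the output layer.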

Recurrent neural networks (RNNs) incorporate an internal memory, thereby enabling them to capture historical information and generate predictions ^[19]^[20]. Among RNNs, LSTM is a widely used type that has also been applied to stock market prediction. Nelson et al. ^[21] fed technical indicators into an LSTM to forecast price trends in the Brazilian stock exchange. The results showed the superior performance of LSTM compared to MLP. In another work, Bernal et al. ^[22] introduced a recurrent network, the echo state network (ESN), to predict S&P 500 stocks. They used various stock market features, including price, volume, and the moving average. The ESN was applied to 50 stocks and achieved an error rate of 0.0027. Ding and Qin ^[23] proposed an LSTM-based network with multiple inputs and outputs, specifically the opening price, lowest price, and highest price of a stock. Their investigations revealed that the suggested model outperformed the LSTM network model and other deep recurrent neural networks in predicting multiple values simultaneously, with a prediction accuracy exceeding 95%. Jin et al. ^[24] incorporated investors’ sentiment into stock prediction and utilized empirical mode decomposition (EMD) to decompose the complex sequence of stock prices. They also employed an LSTM network with attention mechanisms to focus on the most relevant data. According to the study’s findings, the revised LSTM model not only improved prediction accuracy but also reduced time delay. Liu et al. ^[25] proposed a two-component multi-element hierarchical attention capsule network. The first component, multi-element hierarchical attention, assigned weights to valuable information from various news and social media sources. The capsule network component captured additional context information from events. Their model enhanced prediction accuracy by quantifying the diverse influences of events.
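The internal memory of an LSTM can be made concrete with a single cell update. The sketch below uses the standard LSTM gate equations with scalar inputs and states to keep it short. The parameter layout, names, and values are assumptions for illustration and do not reproduce any of the cited models.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    `params` maps each gate name to (w_x, w_h, b): input weight,
    recurrent weight, and bias.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    g = math.tanh(pre("candidate"))   # candidate cell content
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    c = f * c_prev + i * g            # updated internal memory
    h = o * math.tanh(c)              # hidden state passed to the next step
    return h, c


def run_lstm(sequence, params):
    """Feed a sequence through the cell, starting from zero states."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c


gates = ("forget", "input", "candidate", "output")
example_params = {gate: (0.5, 0.1, 0.0) for gate in gates}  # made-up values
final_h, final_c = run_lstm([0.2, -0.1, 0.4], example_params)
```

The cell state `c` carries information across time steps, which is what allows the network to use earlier observations when producing a later prediction.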

3. Feature-Based Methods

Market data, which encompasses the open/high/low/close (OHLC) prices of a share over a specific period, stands as the most prevalent feature employed in nearly all prediction algorithms in this field ^[26]^[27]^[28]^[29]. The sampling interval (ranging from seconds to months) and the number of past measurements used as inputs differ from model to model.
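One common way to feed market data to a model is to slide a fixed-length window over the OHLC series. The following sketch shows the idea, assuming a two-bar lookback and the next bar's close as the target. Both choices, the function name, and the prices are illustrative and not taken from the cited works.

```python
def make_windows(bars, lookback):
    """Turn a list of (open, high, low, close) bars into supervised samples.

    Each sample pairs the flattened OHLC values of `lookback` consecutive bars
    with the close of the bar that immediately follows them.
    """
    samples = []
    for end in range(lookback, len(bars)):
        window = bars[end - lookback:end]
        features = [value for bar in window for value in bar]
        target = bars[end][3]  # index 3 is the close
        samples.append((features, target))
    return samples


# Made-up bars; the interval could be seconds, days, or months.
bars = [
    (10.0, 11.0, 9.5, 10.5),
    (10.5, 12.0, 10.0, 11.5),
    (11.5, 11.8, 10.8, 11.0),
    (11.0, 12.5, 10.9, 12.2),
]
samples = make_windows(bars, lookback=2)
```

Changing `lookback` changes the number of measurements per input, and resampling `bars` changes the time scale, which are the two ways in which models differ as described above.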

In recent times, social networks have significantly influenced various aspects of human life, including financial markets. Social networks affect financial markets through user interactions, opinion sharing, discussions, and the following of trusted individuals. Social trading is a specific form of this phenomenon, where investors observe and replicate the strategies of experts or peer traders. Within common social networks, two vital sources of information are the messages posted by users and the social information related to users themselves, such as following relationships. These aspects are further explored below. Concerning social network textual messages, sentiment analysis is a prevalent tool used to extract users’ opinions about shares, with the aim of classifying the sentiment as positive, negative, or neutral ^[30]. Notably, the study conducted by Zhang et al. ^[31] represents one of the earliest attempts to forecast stock fluctuations using Twitter data.
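The three-way sentiment classification can be illustrated with a simple word-counting rule. This is a deliberately simplified stand-in: studies of this kind typically rely on trained classifiers or richer lexicons, and the word lists and function name below are hypothetical.

```python
POSITIVE = {"gain", "bullish", "up", "beat", "strong"}  # hypothetical word list
NEGATIVE = {"loss", "bearish", "down", "miss", "weak"}  # hypothetical word list


def classify_sentiment(message):
    """Label a message positive, negative, or neutral by counting lexicon hits."""
    words = [w.strip(".,!?$#").lower() for w in message.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


labels = [
    classify_sentiment("Strong earnings beat, feeling bullish!"),
    classify_sentiment("Weak guidance and a big loss."),
    classify_sentiment("Earnings call is at noon"),
]
```

Aggregating such labels per stock and per day gives a sentiment series that can be used alongside market data as a model input.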

To accurately assess stock market sentiment, the researchers evaluated a random subsample of tweets over six months and subsequently determined the correlation between this data and future stock market indicators. Baker and Wurgler ^[32]^[33] developed a sentiment index that captures changes in investors’ sentiments. They showed that fluctuations in this index impacted investors and stimulated changes in the overall stock market. Gilbert and Karahalios ^[34] suggested that an individual’s emotional state influences their decision making and confirmed that sentiment inferred from web content contained information that could forecast stock prices. While some studies have shown a correlation between emotional trends in internet comments and stock market movements, few have attempted to predict stock prices using sentiment analysis. For instance, Guo et al. ^[35] proposed a technique based on the hot optimization route, which examined the relationship between user mood and the stock market by analyzing user review data from a stock review website. Zhou et al. ^[36] achieved a stock market prediction accuracy of 64.15% using the SVM-ES model, wherein they incorporated social sentiments such as contempt, pleasure, melancholy, and fear. Picasso et al. ^[3] employed data science and machine learning tools to combine technical and fundamental assessments. The result was a predictive model capable of forecasting the trajectory of a portfolio comprising the twenty most-capitalized enterprises in the NASDAQ100 index. Bouktif et al. ^[37] employed improved sentiment analysis to assess the predictability of stock market directions. They examined various factors such as historical stock price, sentiment polarity, subjectivity, N-grams, custom text-based features, and feature delays. By employing advanced causality analysis, algorithmic feature selection, and machine learning techniques, including regularized model stacking, they collected and evaluated data from 10 major NASDAQ shares across diverse stock domains. Their method achieved 60% accuracy, which surpassed existing sentiment-based stock market prediction algorithms, including deep learning approaches. Alhamzeh et al. ^[38] analyzed innovative data sources, specifically StockTwits paired with financial news, and tackled the problem as a binary classification task. They adopted a hybrid approach that combines sentiment and event-based features. The findings indicated that StockTwits data outperformed price data in predicting the closing prices of eight NASDAQ100 companies.

Another valuable information source from social networks is user-related data, including the user’s influence within the network and the accuracy of their predictions. Kamkarhaghighi et al. ^[39] examined the relationship between a Twitter user’s influential power in stock market prediction and their social network information, including details about their followers. They identified several active users in the stock exchange as valuable users and calculated a score for the accuracy of each user’s predictions. By setting a threshold to distinguish valuable and non-valuable users, they trained and reported the accuracy of a naive Bayes model using attributes such as the number of followers and the number of related followers of users. Ultimately, they concluded that users’ profile information could provide insights into their influence on the stock market. Bujari et al. ^[40] explored the relationship between various social features, including the number of each user’s followers and the volume of tweets related to each stock, and that stock’s market data. The results revealed that predictive features for each stock differ, and there is no general model that applies to all stocks.
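The idea of scoring each user's prediction accuracy and applying a threshold can be sketched as follows. The data layout, the function names, and the 0.6 threshold are assumptions made for illustration. The cited study's actual scoring rule is not specified here, and its subsequent naive Bayes step is omitted.

```python
def user_accuracy(predictions, actual_moves):
    """Fraction of a user's (stock, day) calls that matched the realized move.

    Returns None when none of the user's calls can be checked.
    """
    scored = [(key, call) for key, call in predictions.items() if key in actual_moves]
    if not scored:
        return None
    hits = sum(1 for key, call in scored if actual_moves[key] == call)
    return hits / len(scored)


def label_users(all_predictions, actual_moves, threshold=0.6):
    """Split users into valuable and non-valuable by an accuracy threshold."""
    labels = {}
    for user, predictions in all_predictions.items():
        accuracy = user_accuracy(predictions, actual_moves)
        if accuracy is not None:
            labels[user] = "valuable" if accuracy >= threshold else "non-valuable"
    return labels


# Made-up realized moves and user calls, keyed by (ticker, day).
actual_moves = {("AAA", 1): "up", ("AAA", 2): "down", ("BBB", 1): "up"}
all_predictions = {
    "alice": {("AAA", 1): "up", ("AAA", 2): "down", ("BBB", 1): "up"},
    "bob": {("AAA", 1): "down", ("AAA", 2): "down", ("BBB", 1): "down"},
    "carol": {("CCC", 1): "up"},  # no realized move available, so not labeled
}
user_labels = label_users(all_predictions, actual_moves)
```

The resulting labels could serve as the target for a classifier trained on profile attributes such as follower counts.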

References

[1] Ballings, M.; Van den Poel, D.; Hespeels, N.; Gryp, R. Evaluating multiple classifiers for stock price direction prediction. Expert Syst. Appl. 2015, 42, 7046–7056.
[2] Sharma, N.; Juneja, A. Combining of random forest estimates using LSboost for stock market index prediction. In Proceedings of the 2nd International Conference for Convergence in Technology (I2CT), Mumbai, India, 7–9 April 2017; pp. 1199–1202.
[3] Picasso, A.; Merello, S.; Ma, Y.; Oneto, L.; Cambria, E. Technical analysis and sentiment embeddings for market trend prediction. Expert Syst. Appl. 2019, 135, 60–70.
[4] Izzah, A.; Sari, Y.A.; Widyastuti, R.; Cinderatama, T.A. Mobile app for stock prediction using Improved Multiple Linear Regression. In Proceedings of the 2017 International Conference on Sustainable Information Engineering and Technology (SIET), Batu, Indonesia, 24–25 November 2017; pp. 150–154.
[5] Ariyo, A.A.; Adewumi, A.O.; Ayo, C.K. Stock price prediction using the ARIMA model. In Proceedings of the 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, Cambridge, UK, 26–28 March 2014; pp. 106–112.
[6] Ouahilal, M.; El Mohajir, M.; Chahhou, M.; El Mohajir, B.E. Optimizing stock market price prediction using a hybrid approach based on HP filters and support vector regression. In Proceedings of the 2016 4th IEEE International Colloquium on Information Science and Technology (CiSt), Tangier, Morocco, 24–26 October 2016; pp. 290–294.
[7] Krollner, B.; Vanstone, B.J.; Finnie, G.R. Financial time series forecasting with machine learning techniques: A survey. In Proceedings of the 18th European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, 28–30 April 2010.
[8] Tollo, G.; Tanev, S.; Liotta, G.; De March, D. Using online textual data, principal component analysis and artificial neural networks to study business and innovation practices in technology-driven firms. Comput. Ind. 2015, 74, 16–28.
[9] Corazza, M.; De March, D.; Tollo, G. Design of adaptive Elman networks for credit risk assessment. Quant. Financ. 2021, 21, 323–340.
[10] Hu, H.; Tang, L.; Zhang, S.; Wang, H. Predicting the direction of stock markets using optimized neural networks with Google Trends. Neurocomputing 2018, 285, 188–195.
[11] Qiu, M.; Song, Y. Predicting the direction of stock market index movement using an optimized artificial neural network model. PLoS ONE 2016, 11, e0155133.
[12] Mingyue, Q.; Cheng, L.; Yu, S. Application of the Artificial Neural Network in predicting the direction of stock market index. In Proceedings of the 2016 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS), Fukuoka, Japan, 6–8 July 2016; pp. 219–223.
[13] Moghaddam, A.H.; Moghaddam, M.H.; Esfandyari, M. Stock market index prediction using artificial neural network. J. Econ. Financ. Adm. Sci. 2016, 21, 89–93.
[14] Arévalo, A.; Niño, J.; Hernández, G.; Sandoval, J. High-frequency trading strategy based on deep neural networks. In Proceedings of the International Conference on Intelligent Computing, Lanzhou, China, 2–5 August 2016; Springer: Cham, Switzerland, 2016; pp. 424–436.
[15] Chong, E.; Han, C.; Park, F. Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Syst. Appl. 2017, 83, 187–205.
[16] Hoseinzade, E.; Haratizadeh, S. CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst. Appl. 2019, 129, 273–285.
[17] Gao, P.; Zhang, R.; Yang, X. The application of stock index price prediction with neural networks. Math. Comput. Appl. 2020, 25, 53.
[18] Wang, Y.; Li, Q.; Huang, Z.; Li, J. EAN: Event attention network for stock price trend prediction based on sentimental embedding. In Proceedings of the 10th ACM Conference on Web Science, Boston, MA, USA, 30 June–3 July 2019; pp. 311–320.
[19] Moravvej, S.; Mousavirad, S.; Oliva, D.; Mohammadi, F. A Novel Plagiarism Detection Approach Combining BERT-based Word Embedding, Attention-based LSTMs and an Improved Differential Evolution Algorithm. arXiv 2023, arXiv:2305.02374.
[20] Yu, S.; Xia, F.; Li, S.; Hou, M.; Sheng, Q. Spatio-Temporal Graph Learning for Epidemic Prediction. ACM Trans. Intell. Syst. Technol. 2023, 14, 36.
[21] Nelson, D.M.; Pereira, A.C.; de Oliveira, R.A. Stock market’s price movement prediction with LSTM neural networks. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1419–1426.
[22] Bernal, A.; Fok, S.; Pidaparthi, R. Financial Market Time Series Prediction with Recurrent Neural Networks; Citeseer: State College, PA, USA, 2012.
[23] Ding, G.; Qin, L. Study on the prediction of stock price based on the associated network model of LSTM. Int. J. Mach. Learn. Cybern. 2020, 11, 1307–1317.
[24] Jin, Z.; Yang, Y.; Liu, Y. Stock closing price prediction based on sentiment analysis and LSTM. Neural Comput. Appl. 2019, 32, 9713–9729.
[25] Liu, J.; Lin, H.; Yang, L.; Xu, B.; Wen, D. Multi-Element Hierarchical Attention Capsule Network for Stock Prediction. IEEE Access 2020, 8, 143114–143123.
[26] Althelaya, K.A.; El-Alfy, E.S.M.; Mohammed, S. Evaluation of bidirectional LSTM for short-and long-term stock market prediction. In Proceedings of the 2018 9th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 3–5 April 2018; pp. 151–156.
[27] Baughman, M.; Haas, C.; Wolski, R.; Foster, I.; Chard, K. Predicting Amazon spot prices with LSTM networks. In Proceedings of the 9th Workshop on Scientific Cloud Computing, Tempe, AZ, USA, 11 June 2018; pp. 1–7.
[28] Lin, Y.F.; Huang, T.M.; Chung, W.H.; Ueng, Y.L. Forecasting fluctuations in the financial index using a recurrent neural network based on price features. IEEE Trans. Emerg. Top. Comput. Intell. 2020, 5, 780–791.
[29] Zhou, Z.; Zhao, J.; Xu, K. Can online emotions predict the stock market in China? In Proceedings of the International Conference on Web Information Systems Engineering, Shanghai, China, 8–10 November 2016; Springer: Cham, Switzerland, 2016; pp. 328–342.
[30] Taboada, M. Sentiment analysis: An overview from linguistics. Annu. Rev. Linguist. 2016, 2, 325–347.
[31] Zhang, X.; Fuehres, H.; Gloor, P.A. Predicting stock market indicators through twitter “I hope it is not as bad as I fear”. Procedia-Soc. Behav. Sci. 2011, 26, 55–62.
[32] Baker, M.; Wurgler, J. Investor sentiment and the cross-section of stock returns. J. Financ. 2006, 61, 1645–1680.
[33] Baker, M.; Wurgler, J. Investor sentiment in the stock market. J. Econ. Perspect. 2007, 21, 129–152.
[34] Gilbert, E.; Karahalios, K. Widespread worry and the stock market. In Proceedings of the International AAAI Conference on Web and Social Media, Washington, DC, USA, 23–26 May 2010; Volume 4.
[35] Guo, Z.; Ye, W.; Yang, J.; Zeng, Y. Financial index time series prediction based on bidirectional two dimensional locality preserving projection. In Proceedings of the 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA), Beijing, China, 10–12 March 2017; pp. 934–938.
[36] Zhou, S.; Zhou, L.; Mao, M.; Tai, H.M.; Wan, Y. An optimized heterogeneous structure LSTM network for electricity price forecasting. IEEE Access 2019, 7, 108161–108173.
[37] Bouktif, S.; Fiaz, A.; Awad, M.A. Augmented Textual Features-Based Stock Market Prediction. IEEE Access 2020, 8, 40269–40282.
[38] Alhamzeh, A.; Mukhopadhaya, S.; Hafid, S.; Bremard, A.; Egyed-Zsigmond, E.; Kosch, H.; Brunie, L. A Hybrid Approach for Stock Market Prediction Using Financial News and Stocktwits. In CLEF 2021: Experimental IR Meets Multilinguality, Multimodality, and Interaction; Lecture Notes in Computer Science; Candan, K.S., Ionescu, B., Goeuriot, L., Larsen, B., Müller, H., Joly, A., Maistro, M., Piroi, F., Faggioli, G., Ferro, N., Eds.; Springer: Cham, Switzerland, 2021; Volume 12880.
[39] Kamkarhaghighi, M.; Chepurna, I.; Aghababaei, S.; Makrehchi, M. Discovering credible Twitter users in the stock market domain. In Proceedings of the 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), Omaha, NE, USA, 13–16 October 2016; pp. 66–72.
[40] Bujari, A.; Furini, M.; Laina, N. On using cashtags to predict companies stock trends. In Proceedings of the 2017 14th IEEE Annual Consumer Communications Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2017; pp. 25–28.

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.

Information

Subjects: Computer Science, Artificial Intelligence

Contributors:

Pegah Eslamieh

Mehdi Shajari

Ahmad Nickabadi

Update Date: 21 Jul 2023
