Several epidemiological models are being used around the world to project the number of infected individuals and the mortality rates of the COVID-19 outbreak. Accurate prediction models are essential for taking proper action. Because essential data are lacking and uncertainty is high, the epidemiological models have struggled to deliver accurate long-term predictions. As an alternative to the susceptible-infected-resistant (SIR)-based models, this study proposes a hybrid machine learning approach to predict the COVID-19 outbreak, and we exemplify its potential using data from Hungary. The hybrid machine learning methods of adaptive network-based fuzzy inference system (ANFIS) and multi-layered perceptron-imperialist competitive algorithm (MLP-ICA) are proposed to predict the time series of infected individuals and the mortality rate. The models predict that by late May, the outbreak and the total mortality will drop substantially. The validation is performed for 9 days with promising results, which supports the accuracy of the model. The model is expected to maintain its accuracy as long as no significant interruption occurs. This paper provides an initial benchmark to demonstrate the potential of machine learning for future research.
1. Introduction
Although SIR-based models have been widely used for modeling the COVID-19 outbreak, they involve some degree of uncertainty. Several advancements are emerging to improve the quality of SIR-based models and make them suitable for the COVID-19 outbreak. As an alternative to the SIR-based models, this study proposes machine learning as a new trend in advancing outbreak models. The machine learning approach makes no assumptions about the pandemic or the spread of the infection. Instead, it predicts the time series of infected cases as well as total mortality cases. In this study, the hybrid machine learning models MLP-ICA and ANFIS are used to predict the COVID-19 outbreak in Hungary. The models predict that by late May, the outbreak and the total mortality will drop substantially. Given these promising results and the complexity of the COVID-19 outbreak, this study suggests machine learning as an alternative modeling strategy worth considering. However, further research is essential to validate the results and improve the quality of prediction.
2. Development
Two scenarios were proposed for sampling the data. Scenario 1 sampled the odd days, and Scenario 2 sampled the even days for training. Both machine learning models, ANFIS and MLP-ICA, were trained under each scenario. The choice of sampling scenario was found to have a minimal effect on model performance. A detailed investigation was carried out to find the most suitable number of neurons. Furthermore, the performance of the proposed algorithms is evaluated using both training and validation data. The training data are used to train each algorithm and to determine the best set of parameters for ANFIS and MLP-ICA. The best setup for each algorithm is then used to predict the outbreak on the validation samples. The validation is performed for 9 days with promising results, which supports the accuracy of the model. Because the available sample data were too limited to guard against overfitting, the training results were used to select and evaluate the better-performing model. In future research, as the COVID-19 outbreak progresses and more sample data become available, further testing and validation can be used to better evaluate the models.
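The two sampling scenarios can be illustrated with a minimal sketch. It assumes that days are numbered from 1 and that the 9 validation days are the last 9 days of the series; neither detail is stated above. The function name `split_by_day_parity` and the synthetic series are hypothetical and are not the Hungarian data.

```python
import numpy as np


def split_by_day_parity(series, n_validation=9):
    """Split a daily series into odd-day and even-day training sets plus a hold-out tail.

    Assumptions (not confirmed by the text): day numbering starts at 1 and the
    validation window is the final `n_validation` days.
    """
    series = np.asarray(series, dtype=float)
    if not 0 < n_validation < len(series):
        raise ValueError("n_validation must be between 1 and len(series) - 1")

    days = np.arange(1, len(series) + 1)
    train_days, train_vals = days[:-n_validation], series[:-n_validation]
    val_days, val_vals = days[-n_validation:], series[-n_validation:]

    odd = train_days % 2 == 1  # boolean mask selecting odd-numbered days
    return {
        "scenario_1": (train_days[odd], train_vals[odd]),    # odd days
        "scenario_2": (train_days[~odd], train_vals[~odd]),  # even days
        "validation": (val_days, val_vals),
    }


# Synthetic placeholder series of 30 days, for illustration only.
cumulative_cases = np.cumsum(np.arange(1, 31))
splits = split_by_day_parity(cumulative_cases)
```

Each entry of `splits` holds a pair of arrays (day numbers, observed values), so the same model-fitting routine could be run once per scenario and then scored on the `validation` pair.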
Both machine learning models, as an alternative to epidemiological models, showed promising results in predicting the time series of the COVID-19 outbreak and in estimating total mortality, without the assumptions that epidemiological models require. MLP-ICA nevertheless outperformed ANFIS, delivering more accurate results on the validation samples. Given the small amount of training data available, further investigation is essential to explore the true capability of the proposed hybrid model. The model is expected to maintain its accuracy as long as no major interruption occurs. For instance, if new outbreaks begin in other cities or the prevention regime changes, the model will naturally lose accuracy. For future studies, advancing deep learning and deep reinforcement learning models is strongly encouraged, to enable comparative studies of various ML models for individual countries.
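A comparison of two models on a validation window can be sketched as follows. The error metric is an assumption: this section does not name the metric behind the comparison, and root-mean-square error (RMSE) is used here only as a common choice. The names `model_a` and `model_b` and all numbers are hypothetical placeholders, not results of this study.

```python
import numpy as np


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def rank_models(actual, predictions):
    """Return (model name, RMSE) pairs sorted from lowest to highest error."""
    scores = {name: rmse(actual, pred) for name, pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder 9-day validation window; these values are purely illustrative.
actual = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 37.0, 45.0, 54.0]
predictions = {
    "model_a": [v + 1.0 for v in actual],  # constant error of 1 per day
    "model_b": [v + 3.0 for v in actual],  # constant error of 3 per day
}
ranking = rank_models(actual, predictions)
```

The first entry of `ranking` is the model with the lowest validation error. With only 9 validation days, such a ranking is sensitive to individual days and should be read with caution.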