COVID Mortality Prediction

Version	Summary	Created by	Modification	Content Size	Created at
1		Alberto Di Napoli	+ 1753 word(s)	1753	2021-09-13 10:40:58
2	format correct	Nora Tang	Meta information modification	1753	2021-10-27 04:32:45

This entry is adapted from the peer-reviewed paper 10.3390/jpm11090893

The considerations in this review may support further studies on predicting mortality in COVID patients, both adults and children, although children and young people remain at low risk of COVID mortality. Moreover, the suggestions collected in this study could also be useful for predicting outcomes other than mortality (e.g., intubation and length of hospital stay).

machine learning; deep learning; COVID; mortality prediction; imaging; computed tomography (CT)

1. Introduction

More than a year has passed since the first case of coronavirus disease 2019 (COVID) was reported, and many deaths continue to occur. Although different pharmaceutical companies have developed different vaccine formulations, many problems related to mass production and worldwide distribution persist. These problems are compounded by political and economic constraints that may further limit vaccine access ^[1]. For these reasons, containing the pandemic is a hard task, and the death toll keeps rising. At the time of writing, the worldwide SARS-CoV-2 figures reported by the World Health Organization (Geneva, Switzerland) ( https://covid19.who.int/ , 31 May 2021) were: 173,005,553 people infected with SARS-CoV-2, 3,727,605 deaths and 1,900,955,505 vaccine doses administered. The large number of hospitalizations caused by the rapid spread of the virus has required improved patient management throughout the healthcare system. In this context, it is important to minimize the time required for resource allocation and clinical decision making, such as triage, choice of ventilation modality and admission to the intensive care unit. Currently, baseline machine learning (ML) and deep learning (DL) techniques are widely accepted thanks to their ability to obtain information from the input data without “a priori” definitions ^[2]. These approaches can be efficiently tested in healthcare applications such as disease diagnosis, medical image analysis, big data collection, research and clinical trials, management of smart health records and outbreak prediction ^[3]. Consequently, DL models are capable of solving complex tasks in the intricate clinical field ^[4]. ML is playing an increasingly important role in predicting the outcome of COVID patients ^[3]^[5]^[6]^[7]. For instance, a mortality prediction model could rapidly and effectively support clinical decision-making for COVID patients at imminent risk of death.
Recent studies reviewed predictive models for SARS-CoV-2 diagnosis and severity, length of hospital stay, intensive care unit (ICU) admission, mechanical ventilation modality outcomes ^[8]^[9]^[10]^[11]^[12], highlighting pitfalls of the machine and deep learning methods based on imaging data ^[13]; however, systematic reviews focused on prediction of COVID mortality outcome with ML methods, including DL techniques, are lacking in the literature.

The aim of this review is to discuss the current state of the art of ML methods for predicting COVID mortality by: (1) summarizing the existing published literature on baseline ML- and DL-based COVID mortality prognosis systems based on medical evaluations, laboratory exams and computed tomography (CT); (2) presenting relevant information, including the type of data employed, the data splitting technique, the proposed ML methodology and the evaluation metrics; (3) providing possible explanations of the best results obtained; (4) discussing challenging aspects of current studies and providing suggestions for future developments.

2. Literature Review Methods

This systematic review considers the state of the art in ML and DL as applied to COVID mortality prediction. We performed a MEDLINE search on PubMed on 26 May 2021 using the terms “machine learning covid survival” (146 results), “machine learning covid mortality” (131 results), “deep learning covid survival” (49 results), “deep learning covid mortality” (45 results) and additional similar terms. The search results were filtered to remove duplicates, ML approaches for SARS-CoV-2 diagnosis or for prognoses other than mortality, preprints, abstract-only works and papers outside the scope of this review. We examined the distinguishing characteristics of these studies in terms of: (i) data source, (ii) data partitioning, (iii) class of features, (iv) implemented feature ranking method, (v) implemented ML technique, (vi) metrics evaluated for performance assessment.

We focused on the type of model validation that each study used to split data into train and test groups. In particular, we report the number of subjects in the train and test sets and the corresponding numbers of survived and non-survived subjects. Additionally, we categorized the validation types as internal, external, merged and prospective (the last either internal prospective or merged prospective). Internal validation refers to studies that subdivided a single-site database into train and test groups; external validation refers to studies that trained and tested the model on data from independent cohorts obtained from different sites. Merged validation refers to studies that combined data from different sites into a single database, which was then split into train and test groups, or that used multisite publicly available epidemiological datasets. Finally, prospective validation refers to studies that implemented a temporal validation, assessing temporal generalizability. In internal prospective validation, data from patients hospitalized in a first timeframe were used for training, and data from patients admitted to the same hospital at a different time were used for testing. In merged prospective validation, multisite data were used to train the model and multisite data collected in a subsequent timeframe were used for testing.
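
The four validation types can be illustrated with a minimal Python sketch. The record layout, site names, dates and function names below are hypothetical and chosen only for illustration; none of them come from the reviewed studies.

```python
import random
from datetime import date

# Hypothetical patient records: (site, admission_date, died), with died = 1 for non-survivors.
records = [
    ("hospital_A", date(2020, 3, 10), 0),
    ("hospital_A", date(2020, 4, 2), 1),
    ("hospital_A", date(2020, 6, 15), 0),
    ("hospital_B", date(2020, 3, 20), 1),
    ("hospital_B", date(2020, 7, 1), 0),
]

def internal_split(rows, site, n_train):
    """Internal validation: a single-site database split into train and test groups."""
    own = [r for r in rows if r[0] == site]
    return own[:n_train], own[n_train:]

def external_split(rows, train_site, test_site):
    """External validation: train on one site, test on an independent site."""
    train = [r for r in rows if r[0] == train_site]
    test = [r for r in rows if r[0] == test_site]
    return train, test

def merged_split(rows, n_train, seed=0):
    """Merged validation: pool all sites into one database, then split it."""
    pooled = list(rows)
    random.Random(seed).shuffle(pooled)
    return pooled[:n_train], pooled[n_train:]

def prospective_split(rows, cutoff, site=None):
    """Prospective (temporal) validation: train before the cutoff, test from it on.
    With a site given this is internal prospective; with site=None, merged prospective."""
    pool = [r for r in rows if site is None or r[0] == site]
    train = [r for r in pool if r[1] < cutoff]
    test = [r for r in pool if r[1] >= cutoff]
    return train, test

def class_counts(rows):
    """Number of survived and non-survived subjects in a group."""
    died = sum(r[2] for r in rows)
    return {"survived": len(rows) - died, "non_survived": died}

train, test = prospective_split(records, date(2020, 5, 1))
print(class_counts(train), class_counts(test))
```

Reporting `class_counts` for both groups mirrors the review's choice to list the survived and non-survived subjects in each train and test set.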

We expected to collect papers with both clinical and imaging features. Among the latter, we included hand-crafted features extracted with radiomic analysis and features learned by convolutional neural networks (CNN). Clinical features comprise demographics (e.g., age, sex, race), comorbidities (e.g., diabetes, heart disease), symptoms (e.g., cough, fever), vital signs (e.g., heart rate, oxygen saturation), laboratory values (e.g., glucose, creatinine, haemoglobin), disease treatment and clinical course (e.g., artificial ventilation, length of hospital stay, drugs). Clinical features can be classified as binary (yes/no: 0/1) or continuous (numerical values). We considered features binary when studies assigned them 0/1 values or dichotomized a continuous feature’s value into binary form by defining a numerical range and setting the feature to 1 if the value fell within that range and 0 otherwise. We considered features continuous when studies used predictors (features used for prediction tasks) as continuous variables or converted binary features into continuous ones.
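
The range-based dichotomization described above can be sketched in a few lines of Python. The function name, the oxygen saturation readings and the 0 to 93 range are illustrative assumptions, not clinical thresholds taken from the reviewed studies.

```python
def dichotomize(value, low, high):
    """Return 1 if value lies within [low, high], 0 otherwise."""
    return 1 if low <= value <= high else 0

# Hypothetical oxygen saturation readings (%); the range is an arbitrary example.
spo2 = [98, 91, 88, 95]
low_saturation = [dichotomize(v, 0, 93) for v in spo2]
print(low_saturation)  # [0, 1, 1, 0]
```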

To build a reliable classification model, the feature set should contain as much useful information as possible in as few features as possible. Irrelevant and redundant features should be filtered out by choosing a subset of relevant features, which avoids over-fitting and tackles the problem of dimensionality ^[14]. Feature ranking (or selection or reduction) techniques are a good approach for reducing the dimensionality of the feature space ^[15]. Feature ranking improves understanding of the features and reduces the computational cost, increasing the efficiency of the classification. Since Shapley Additive Explanation (SHAP) and the least absolute shrinkage and selection operator (LASSO) logistic regression algorithm are widely used for model interpretation and feature selection in survival studies ^[16]^[17]^[18]^[19], we highlighted whether the studies used these methods or others. In particular, SHAP explains individual predictions by computing the contribution of each feature to the prediction. LASSO is a regression method that penalizes the absolute size of the coefficients, shrinking some of them exactly to zero and thereby selecting a subset of features.
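
To make the idea of per-feature contributions concrete, the sketch below computes exact Shapley values for a single prediction of a toy model. It implements the underlying Shapley definition directly and is not the SHAP package used by the reviewed studies; the linear risk score, its coefficients and the patient and reference values are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.
    Features outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi

# Hypothetical linear risk score over (age, CRP, LDH); coefficients are made up.
def risk(z):
    return 0.03 * z[0] + 0.01 * z[1] + 0.002 * z[2]

patient = [80, 120, 500]    # hypothetical patient
reference = [60, 20, 250]   # hypothetical reference values
phi = shapley_values(risk, patient, reference)
print(phi)
```

For a linear model each contribution reduces to the coefficient times the difference from the reference value, and the contributions sum to the difference between the patient's score and the reference score. The exact computation grows exponentially with the number of features, which is why practical tools rely on approximations.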

3. Literature Review Results

A total of 19/24 studies adopted binary features ^[20]^[21]^[22]^[23]^[24]^[25]^[26]^[27]^[28]^[29]^[30]^[31]^[32]^[33]^[34]^[35]^[36]^[37]^[38]^[39]. One of the 24 studies dichotomized continuous feature values into binary form ^[40].

A total of 16/24 studies adopted continuous features ^[21]^[22]^[23]^[26]^[41]^[27]^[28]^[29]^[30]^[32]^[34]^[35]^[36]^[37]^[38]^[39]. A total of 2/24 studies converted binary features into continuous ones by associating a Charlson comorbidity score with the feature’s value ^[42]^[36].

Most studies used a high number of starting features ^[23]^[26]^[41]^[27]^[28]^[29]^[30]^[32]^[34]^[35]^[42].

We found 8/24 articles in which the SHAP method was used to optimize survival prediction in COVID ^[22]^[23]^[24]^[25]^[41]^[29]^[42]^[36]^[37]^[43]. Vaid et al. demonstrated that interactions between features contributed weakly to outcome prediction compared with the importance of each feature individually ^[23]. In contrast, Abdulaal et al. used SHAP analysis to assess the contribution of patient variables to the mortality prediction, with no feature reduction ^[24]^[25]. A similar approach was employed by other studies ^[22]^[41]^[29]. Subudhi et al. tested 18 models and applied the SHAP technique to temporally distinct patients to compare the important features selected on the different validation cohorts ^[36]. In other works, the most relevant features were selected with LASSO ^[23]^[30]^[31]^[32]. Ko et al. employed the analysis of variance (ANOVA) to identify significant differences between the two classes (survived and non-survived), selecting the features with p-values below 10⁻⁵ ^[26]. In the study by Di et al., the moDel Agnostic Language for Exploration and eXplanation (DALEX) package, usually adopted for predictive models, was used as a feature selection method. Booth et al. implemented a different ranking method based on a logistic regression (LR) classifier, using the regression coefficients as a measure of feature importance. An et al. compared different feature ranking models to determine whether different feature ranking procedures gave coherent results ^[31]. Hu et al. also used regression algorithms for feature reduction ^[34]. Li et al. used univariate analysis to compare distribution differences between COVID survivors and non-survivors ^[27]. Moreover, they compared a model with 83 features against a model with only the first five selected features.
Yan et al. performed feature ranking with a Multi-tree XGBoost ^[35]. With DL models, feature selection can be implemented by combining available features, as shown by Zhu et al. ^[28], to obtain the optimal number of features necessary for classification. Three articles did not apply any feature selection before the prediction algorithm ^[20]^[21]^[33].
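
As a rough illustration of ANOVA-based feature ranking, the sketch below computes the one-way ANOVA F statistic for each feature between two groups and ranks the features by it. It is a simplified stand-in, not the procedure of any reviewed study: it ranks by F instead of applying a p-value threshold, because converting F to a p-value requires the F distribution (available, for example, through `scipy.stats.f_oneway`). All feature names and values are invented.

```python
def anova_f(group_a, group_b):
    """One-way ANOVA F statistic for two groups.
    Raises ZeroDivisionError if there is no within-group variance."""
    groups = [group_a, group_b]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical laboratory values for survivors and non-survivors.
survivors = {"ldh": [200, 220, 210, 230], "glucose": [95, 110, 100, 105]}
non_survivors = {"ldh": [480, 520, 500, 510], "glucose": [100, 108, 97, 112]}

f_scores = {name: anova_f(survivors[name], non_survivors[name]) for name in survivors}
ranked = sorted(f_scores, key=f_scores.get, reverse=True)
print(ranked)  # features ordered from most to least discriminative
```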

4. Conclusions

This systematic review specifically considers the state of the art in ML and DL as applied to COVID mortality prediction. Both binary and multi-class features are considered throughout the review. We summarized the developed models considering data source, data partitioning, class of features, ML technique and evaluation metrics for performance assessment. Clinical features were used in all studies, while only one paper used CT image features. Most of the studies had an imbalanced number of survived and non-survived cases. We found some best practices that studies could follow to develop optimal ML models: (1) use a high-quality dataset with a large, balanced number of samples; (2) implement an ensemble of different ML methodologies; (3) include clinical features of different classes, including age, C-reactive protein (CRP) and lactate dehydrogenase (LDH) values; (4) report as many metrics as possible to give a complete view of model performance, including both the most common metrics, such as the area under the receiver operating characteristic curve (AUCROC) and accuracy (ACC), and other important metrics for assessing prediction performance, such as sensitivity (SENS), specificity (SPEC), positive predictive value (PPV) and negative predictive value (NPV).
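
The metrics named in point (4) can all be derived from the confusion matrix and the predicted scores. The following is a minimal sketch on invented, deliberately imbalanced data, with label 1 assumed to mean non-survived; it shows how accuracy alone can look good while sensitivity stays low.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    """ACC, SENS, SPEC, PPV and NPV; assumes every denominator is non-zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SENS": tp / (tp + fn),
        "SPEC": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }

def auc_roc(y_true, scores):
    """AUCROC as the probability that a random positive case scores higher
    than a random negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced cohort: 2 non-survivors (1) and 8 survivors (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # arbitrary 0.5 threshold

m = classification_metrics(y_true, y_pred)
print(m, auc_roc(y_true, scores))
```

Here accuracy is 0.8 while sensitivity is only 0.5, which is one reason to report the full set of metrics when survived and non-survived cases are imbalanced.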

The considerations in this review may support further studies on predicting mortality in COVID patients, both adults and children, although children and young people remain at low risk of COVID mortality ^[44]. Moreover, the suggestions collected in this study could also be useful for predicting outcomes other than mortality (e.g., intubation and length of hospital stay).

References

Forni, G.; Mantovani, A.; Moretta, L.; Rappuoli, R.; Rezza, G.; Bagnasco, A.; Barsacchi, G.; Bussolati, G.; et al. COVID-19 vaccines: Where we stand and challenges ahead. Cell Death Differ. 2021, 28, 626–639.
Bishop, C.M. Pattern Recognition and Machine Learning; Springer: Secaucus, NJ, USA, 2006; Available online: http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf (accessed on 26 May 2021).
Ma, C.; Yao, Z.; Zhang, Q.; Zou, X. Quantitative integration of radiomic and genomic data improves survival prediction of low-grade glioma patients. Math. Biosci. Eng. 2020, 18, 727–744.
Pasquini, L.; Napolitano, A.; Tagliente, E.; Dellepiane, F.; Lucignani, M.; Vidiri, A.; Ranazzi, G.; Stoppacciaro, A.; Moltoni, G.; Nicolai, M.; et al. Deep Learning Can Differentiate IDH-Mutant from IDH-Wild GBM. J. Pers. Med. 2021, 11, 290.
Wang, S.; Zha, Y.; Li, W.; Wu, Q.; Li, X.; Niu, M.; Wang, M.; Qiu, X.; Li, H.; Yu, H.; et al. A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis. Eur. Respir. J. 2020, 56.
Yue, H.; Yu, Q.; Liu, C.; Huang, Y.; Jiang, Z.; Shao, C.; Zhang, H.; Ma, B.; Wang, Y.; Xie, G.; et al. Machine learning-based CT radiomics method for predicting hospital stay in patients with pneumonia associated with SARS-CoV-2 infection: A multicenter study. Ann. Transl. Med. 2020, 8, 859.
Schalekamp, S.; Huisman, M.; van Dijk, R.A.; Boomsma, M.F.; Freire Jorge, P.J.; de Boer, W.S.; Herder, G.J.M.; Bonarius, M.; Groot, O.A.; Jong, E.; et al. Model-based prediction of critical illness in hospitalized patients with COVID-19. Radiology 2020, 298, E46–E54.
Wynants, L.; Van Calster, B.; Collins, G.S.; Riley, R.D.; Heinze, G.; Schuit, E.; Bonten, M.M.J.; Damen, J.A.A.; Debray, T.P.A.; De Vos, M.; et al. Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal. BMJ 2020, 369.
Alabool, H.; Alarabiat, D.; Habib, M.; Khasawneh, A.M.; Alshinwan, M.; Shehab, M. Artificial Intelligence Techniques for Containment COVID-19 Pandemic: A Systematic Review. Available online: https://www.researchsquare.com/article/rs-30432/v1 (accessed on 26 May 2021).
Albahri, O.S.; Zaidan, A.A.; Albahri, A.S.; Zaidan, B.B.; Abdulkareem, K.H.; Al-qaysi, Z.T.; Alamoodi, A.H.; Aleesa, A.M.; Chyad, M.A.; Alesa, R.M.; et al. Systematic review of artificial intelligence techniques in the detection and classification of COVID-19 medical images in terms of evaluation and benchmarking: Taxonomy analysis, challenges, future solutions and methodological aspects. J. Infect. Public Health 2020, 13, 1381–1396.
Shi, F.; Wang, J.; Shi, J.; Wu, Z.; Wang, Q.; Tang, Z.; He, K.; Shi, Y.; Shen, D. Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for COVID-19. IEEE Rev. Biomed. Eng. 2020, 14, 4–15.
Islam, M.M.; Karray, F.; Alhajj, R.; Zeng, J. A Review on Deep Learning Techniques for the Diagnosis of Novel Coronavirus (COVID-19). IEEE Access 2021, 9, 30551–30572.
Roberts, M.; Driggs, D.; Thorpe, M.; Gilbey, J.; Yeung, M.; Ursprung, S.; Aviles-Rivero, A.I.; Etmann, C.; Mccague, C.; Beer, L.; et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat. Mach. Intell. 2021, 3, 199–217.
Gheyas, I.A.; Smith, L.S. Feature subset selection in large dimensionality domains. Pattern Recognit. 2010, 43, 5–13.
Mehmood, T.; Sæbø, S.; Liland, K.H. Comparison of variable selection methods in partial least squares regression. J. Chemom. 2020, 34, 1–14.
Takahashi, S.; Asada, K.; Takasawa, K.; Shimoyama, R.; Sakai, A.; Bolatkan, A.; Shinkai, N.; Kobayashi, K.; Komatsu, M.; Kaneko, S.; et al. Predicting deep learning based multi-omics parallel integration survival subtypes in lung cancer using reverse phase protein array data. Biomolecules 2020, 10, 1460.
Moncada-Torres, A.; van Maaren, M.C.; Hendriks, M.P.; Siesling, S.; Geleijnse, G. Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival. Sci. Rep. 2021, 11, 1–13.
Kidd, A.C.; McGettrick, M.; Tsim, S.; Halligan, D.L.; Bylesjo, M.; Blyth, K.G. Survival prediction in mesothelioma using a scalable Lasso regression model: Instructions for use and initial performance using clinical predictors. BMJ Open Respir. Res. 2018, 5, e000240.
Tibshirani, R. Regression Shrinkage and Selection Via the Lasso. J. R. Stat. Soc. Ser. B 1996, 58, 267–288.
Li, Y.; Horowitz, M.A.; Liu, J.; Chew, A.; Lan, H.; Liu, Q.; Sha, D.; Yang, C. Individual-Level Fatality Prediction of COVID-19 Patients Using AI Methods. Front. Public Health 2020, 8, 566.
Ning, W.; Lei, S.; Yang, J.; Cao, Y.; Jiang, P.; Yang, Q.; Zhang, J.; Wang, X.; Chen, F.; Geng, Z.; et al. Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning. Nat. Biomed. Eng. 2020, 4, 1197–1207.
Bertsimas, D.; Lukin, G.; Mingardi, L.; Nohadani, O.; Orfanoudaki, A.; Stellato, B.; Wiberg, H.; Gonzalez-Garcia, S.; Parra-Calderón, C.L.; Robinson, K.; et al. COVID-19 mortality risk assessment: An international multi-center study. PLoS ONE 2020, 15, e0243262.
Vaid, A.; Somani, S.; Russak, A.J.; de Freitas, J.K.; Chaudhry, F.F.; Paranjpe, I.; Johnson, K.W.; Lee, S.J.; Miotto, R.; Richter, F.; et al. Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: Model development and validation. J. Med. Internet Res. 2020, 22, e24018.
Abdulaal, A.; Patel, A.; Charani, E.; Denny, S.; Mughal, N.; Moore, L. Prognostic modeling of COVID-19 using artificial intelligence in the United Kingdom: Model development and validation. J. Med. Internet Res. 2020, 22, e20259.
Abdulaal, A.; Patel, A.; Charani, E.; Denny, S.; Alqahtani, S.A.; Davies, G.W.; Mughal, N.; Moore, L.S.P. Comparison of deep learning with regression analysis in creating predictive models for SARS-CoV-2 outcomes. BMC Med. Inform. Decis. Mak. 2020, 20, 1–11.
Ko, H.; Chung, H.; Kang, W.S.; Park, C.; Kim, D.W.; Kim, S.E.; Chung, C.R.; Ko, R.E.; Lee, H.; Seo, J.H.; et al. An artificial intelligence model to predict the mortality of COVID-19 patients at hospital admission time using routine blood samples: Development and validation of an ensemble model. J. Med. Internet Res. 2020, 22, e25442.
Li, S.; Lin, Y.; Zhu, T.; Fan, M.; Xu, S.; Qiu, W.; Chen, C.; Li, L.; Wang, Y.; Yan, J.; et al. Development and external evaluation of predictions models for mortality of COVID-19 patients using machine learning method. Neural Comput. Appl. 2021, 1.
Zhu, J.S.; Ge, P.; Jiang, C.; Zhang, Y.; Li, X.; Zhao, Z.; Zhang, L.; Duong, T.Q. Deep-learning artificial intelligence analysis of clinical variables predicts mortality in COVID-19 patients. J. Am. Coll. Emerg. Physicians Open 2020, 1, 1364–1373.
Yu, L.; Halalau, A.; Dalal, B.; Abbas, A.E.; Ivascu, F.; Amin, M.; Nair, G.B. Machine learning methods to predict mechanical ventilation and mortality in patients with COVID-19. PLoS ONE 2021, 16, e0249285.
Gao, Y.; Cai, G.Y.; Fang, W.; Li, H.Y.; Wang, S.Y.; Chen, L.; Yu, Y.; Liu, D.; Xu, S.; Cui, P.F.; et al. Machine learning based early warning system enables accurate mortality risk prediction for COVID-19. Nat. Commun. 2020, 11, 1–10.
An, C.; Lim, H.; Kim, D.W.; Chang, J.H.; Choi, Y.J.; Kim, S.W. Machine learning prediction for mortality of patients diagnosed with COVID-19: A nationwide Korean cohort study. Sci. Rep. 2020, 10, 1–11.
Guan, X.; Zhang, B.; Fu, M.; Li, M.; Yuan, X.; Zhu, Y.; Peng, J.; Guo, H.; Lu, Y. Clinical and inflammatory features based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: Results from a retrospective cohort study. Ann. Med. 2021, 53, 257–266.
Vaid, A.; Jaladanki, S.K.; Xu, J.; Teng, S.; Kumar, A.; Lee, S.; Somani, S.; Paranjpe, I.; de Freitas, J.K.; Wanyan, T.; et al. Federated learning of electronic health records to improve mortality prediction in hospitalized patients with COVID-19: Machine learning approach. JMIR Med. Inform. 2021, 9, e24207.
Hu, C.; Liu, Z.; Jiang, Y.; Shi, O.; Zhang, X.; Xu, K.; Suo, C.; Wang, Q.; Song, Y.; Yu, K.; et al. Early prediction of mortality risk among patients with severe COVID-19, using machine learning. Int. J. Epidemiol. 2021, 49, 1918–1929.
Yan, L.; Zhang, H.-T.; Goncalves, J.; Xiao, Y.; Wang, M.; Guo, Y.; Sun, C.; Tang, X.; Jing, L.; Zhang, M.; et al. An interpretable mortality prediction model for COVID-19 patients. Nat. Mach. Intell. 2020, 2, 283–288.
Subudhi, S.; Verma, A.; Patel, A.B.; Hardin, C.C.; Khandekar, M.J.; Lee, H.; McEvoy, D.; Stylianopoulos, T.; Munn, L.L.; Dutta, S.; et al. Comparing machine learning algorithms for predicting ICU admission and mortality in COVID-19. NPJ Digit. Med. 2021, 4, 1–7.
Rozenbaum, D.; Shreve, J.; Radakovich, N.; Douggal, A.; Jehi, L.; Nazha, A. Personalized Prediction of Hospital Mortality in COVID-19 positive patients. Mayo Clin. Proc. Innov. Qual. Outcomes 2021, 5, 795–801.
Tezza, F.; Lorenzoni, G.; Azzolina, D.; Barbar, S.; Leone, L.A.C.; Gregori, D. Predicting in-hospital mortality of patients with covid-19 using machine learning techniques. J. Pers. Med. 2021, 11, 343.
Stachel, A.; Daniel, K.; Ding, D.; Francois, F.; Phillips, M.; Lighter, J. Development and validation of a machine learning model to predict mortality risk in patients with COVID-19. BMJ Health Care Inform. 2021, 28.
Di, A.; Bonaccio, M.; Costanzo, S. Since January 2020 Elsevier Has Created a COVID-19 Resource Centre with Free Information in English and Mandarin on the Novel Coronavirus COVID- 19. The COVID-19 Resource Centre is Hosted on Elsevier Connect, the Company’ s Public News and Information. 2020. Available online: https://www.binasss.sa.cr/agocovid/4.pdf (accessed on 26 May 2021).
Booth, A.L.; Abels, E.; McCaffrey, P. Development of a prognostic model for mortality in COVID-19 infection using machine learning. Mod. Pathol. 2021, 34, 522–531.
Ikemura, K.; Bellin, E.; Yagi, Y.; Billett, H.; Saada, M.; Simone, K.; Stahl, L.; Szymanski, J.; Goldstein, D.Y.; Gil, M.R. Using automated machine learning to predict the mortality of patients with COVID-19: Prediction model development study. J. Med. Internet Res. 2021, 23, e23458.
Li, Y.; Xia, L. Coronavirus disease 2019 (COVID-19): Role of chest CT in diagnosis and management. Am. J. Roentgenol. 2020, 214, 1280–1286.
Bhopal, S.S.; Bagaria, J.; Olabi, B.; Bhopal, R. Children and young people remain at low risk of COVID-19 mortality. Lancet Child Adolesc. Health 2021, 5, e12–e13.

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply. By using this site, you agree to the Terms and Conditions and Privacy Policy.

Information

Subjects: Others

Contributor: Alberto Di Napoli

Update Date: 27 Oct 2021
