This study presents a novel approach to clinical event classification in healthcare using federated learning (FL) and cross-device ensemble models based on vital signs data from the MIMIC-IV dataset. The FL structure allows training directly on each client's device, which keeps patient data local and protects patient privacy. The proposed method achieves an accuracy of 98.9%, highlighting the potential of FL and ensemble techniques for handling sensitive patient data in healthcare applications.
1. Introduction
Artificial intelligence (AI) techniques and technologies are used to improve various aspects of healthcare, including medical imaging, drug discovery, patient diagnosis, and treatment planning [1]. There is a growing body of research in this field, as AI can significantly improve the efficiency and accuracy of healthcare processes, ultimately leading to better patient outcomes. Examples of related work include using AI to diagnose diseases such as cancer, using machine learning to analyze patient data and predict potential health issues, and using natural language processing to improve the efficiency of electronic medical records.
Big data [1] has recently become a buzzword in many industries, and healthcare is no exception. The healthcare sector generates vast amounts of data daily, including electronic health records, claims data, and clinical trial results [2,3]. Such data can be analyzed to identify patterns, trends, and associations that can help improve patient care, reduce costs, and advance medical research. The use of big data in healthcare is still in its initial stages, but it has already shown promise in several areas. For example, big data has been used to improve population health management by identifying patterns in patient health data that help healthcare providers better understand the health needs of their patient population and develop strategies to address them. Big data has also been used to predict future patient needs and outcomes using predictive analytics and to develop clinical decision support systems that provide healthcare providers with real-time recommendations based on a patient’s medical history and current condition [4]. Despite these improvements, privacy remains the main concern with big data, especially in healthcare. In addition, enhanced machine learning techniques and advanced pre-processing offer a promising way to solve problems using big data.
Machine learning, a branch of artificial intelligence, entails training computer algorithms to identify patterns within data and using those patterns to make informed decisions. In healthcare, machine learning is used to analyze substantial amounts of data from various sources, such as electronic health records, medical imaging, and wearable devices, to identify patterns and trends that can help improve patient care [5]. Predictive analytics: machine learning algorithms can be used to analyze patient data to predict future health outcomes, such as the likelihood of developing a specific condition or needing medical intervention. This can help healthcare providers make more informed decisions about patient care and allocate resources more efficiently. Examples include understanding the geographical inequalities of healthcare resources with Bayesian analysis [6], clinical data prediction using random forest classification [7], and disease prediction with XGBoost classification [8]. Clinical decision support: machine learning can be used to develop clinical decision support systems, which provide healthcare providers with real-time recommendations based on a patient’s medical history and current condition [9]. Diagnosis and treatment: machine learning can analyze medical images, such as CT scans or X-rays, to assist in diagnosis and treatment planning. It can also analyze lab test results to identify potential health issues [10]. Personalized medicine: machine learning can be used to develop personalized treatment plans for individual patients, considering their genetics, lifestyle, and medical history [11].
Federated learning (FL) [12] trains machine learning models on decentralized data. Instead of centralizing data in a single location, federated learning allows data to remain on individual devices, such as smartphones or IoT devices. The model is trained across multiple devices by sending the current global model to each device and receiving updated parameters in return. This process is repeated until the global model reaches a satisfactory level of performance. This allows for training on much larger datasets than would be possible with a centralized approach and helps protect users’ privacy by keeping their data on their own devices. In healthcare, such data sources include electronic health records (EHRs), wearable devices (e.g., smartwatches and fitness trackers), and medical imaging devices. In federated learning, cross-device functionality allows each of these devices to contribute to the learning process by training its own local model on the data it holds and then sharing the model parameters with a central server. The server then aggregates these parameters to update the global model, which is sent back to each device.
Figure 1 shows the general architecture of federated learning in the healthcare system, including its components and the connections between them.
Figure 1. The general concept of federated learning in the healthcare system.
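The round-based procedure described above (local training on each device, parameter sharing, and server-side aggregation) can be illustrated with a minimal sketch. The text does not specify the aggregation rule, so this example assumes a weighted average of client parameters by local sample count, in the style of federated averaging. The toy linear model, the `CLIENTS` data, and all function names are hypothetical and chosen only for illustration; they are not the method used in the study.

```python
from typing import List, Tuple

Params = List[float]                 # [w, b] for the toy model y = w * x + b
Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the client

# Hypothetical local datasets, all drawn from y = 2x + 1 (illustration only).
CLIENTS: List[Dataset] = [
    [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)],
    [(1.5, 4.0), (2.0, 5.0)],
    [(0.25, 1.5), (0.75, 2.5), (1.25, 3.5), (1.75, 4.5)],
]


def local_update(params: Params, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Params:
    """Client step: start from the global parameters and run a few
    gradient-descent epochs on mean squared error using local data only."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def aggregate(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Server step (assumed rule): average the client parameters,
    weighting each client by its number of local samples."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[Dataset], rounds: int = 200) -> Params:
    """Repeat: broadcast global model, train locally, aggregate on the server."""
    global_params: Params = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_params, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_params = aggregate(updates, sizes)
    return global_params


if __name__ == "__main__":
    w, b = run_rounds(CLIENTS)
    print(f"global model after training: w={w:.4f}, b={b:.4f}")
```

Only the parameter lists cross the client boundary in this sketch; the `(x, y)` pairs are read exclusively inside `local_update`, which mirrors the privacy argument made in the text.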
Federated learning has the potential to be particularly useful in the healthcare industry, where data privacy and security are of paramount importance. With it, sensitive patient data can be kept on individual devices and hospital servers rather than centralized in a single location [13,14,15]. This can help to protect patient privacy and comply with regulations such as HIPAA [16]. In addition, federated learning can train more accurate models by drawing on data from a larger number of patients. This can be especially beneficial in rare disease research [17], where a centralized dataset may not have enough examples to train a dependable model.
2. Federated Learning for Clinical Event Classification Using Vital Signs Data
Clinical event classification using vital signs data [18][19] is critical in healthcare, as it allows for early detection and management of various medical conditions. Researchers worldwide have extensively explored computational techniques, including machine learning and predictive modeling, to develop accurate and reliable methods for such predictions. Effective classification can identify health risks or critical events earlier, allowing for timely intervention and potentially preventing severe outcomes. Moreover, automated classification systems based on machine learning models can quickly analyze a high volume of patient data, helping healthcare providers make faster and more accurate diagnoses.
Machine learning is a popular approach in this field, as it allows for the analysis of vast amounts of historical and current data from various sources in healthcare to make predictions [1][19][1,20]. Machine learning in medicine contributes significantly to reducing healthcare spending, by lowering the investment required in this field, and to renewing the relationship between doctor and patient [20][21]. For example, one system collects vital signs data wirelessly using radar technology and distinguishes healthy from infected people using five machine learning models [21][22]. In 2019, Juan-Jose Beunza et al. [22][23] compared several supervised classification machine learning algorithms for internal validity and accuracy in predicting clinical events. Using the Framingham open database and new methods in the data preparation process, they obtained an accuracy of 0.81 for women and 0.78 for men. However, this degree of accuracy is not considered sufficient, and the performance of these methods is often hindered by the lack of large, diverse, labeled datasets. Yuanyuan et al. [23][24] introduced a system that uses a convolutional neural network (CNN) with enhanced deep learning techniques to predict heart disease on an Internet of Medical Things (IoMT) platform. Here, “enhanced deep learning” refers to the use of advanced techniques such as transfer learning or ensemble methods to improve the performance of the CNN. The IoMT platform uses medical devices connected to the Internet to collect and transmit data for analysis.
Jie Xu et al. [12] conducted a survey of federated learning in the biomedical field, aiming to provide an overview of solutions to federated learning’s statistical, system, and privacy challenges. Another example highlighting the potential applications and impact of these technologies in healthcare is a study by Thanveer Shaik et al. [24][25], who proposed FedStack, a decentralized, privacy-protected system for monitoring in-patient activity in hospitals that uses sensors and AI models to classify twelve routine activities. FedStack uses stacked federated learning for personalized activity monitoring. Stacked federated learning refers to a technique in which multiple federated models are trained and then combined to form a final model. Similarly, Ittai Dayan et al. [25][26] used an FL model to predict the future oxygen requirements of symptomatic COVID-19 patients from vital signs, laboratory data, and chest X-rays. The study proposed federated learning for predicting clinical outcomes in patients with COVID-19, training models on data from different hospitals or clinics to improve the accuracy of predictions. The authors also claim that this approach can help make predictions in real time and improves the models’ performance by sharing knowledge across different institutions.
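The idea of stacking mentioned above, in which several trained models are combined into a final model, can be sketched generically as follows. This is not the FedStack implementation: the three base models are hypothetical fixed functions standing in for already-trained federated models, the heart-rate values and labels are invented toy data, and the meta-learner is assumed to be a small logistic regression fitted on the base models' outputs.

```python
import math
from typing import Callable, List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical base models: each maps a heart rate (bpm) to an event probability.
def model_a(hr: float) -> float:
    return sigmoid((hr - 100.0) / 10.0)


def model_b(hr: float) -> float:
    return sigmoid((hr - 110.0) / 5.0)


def model_c(hr: float) -> float:
    return 0.5  # deliberately uninformative


BASE_MODELS: List[Callable[[float], float]] = [model_a, model_b, model_c]
HEART_RATES = [60.0, 70.0, 80.0, 90.0, 120.0, 130.0, 140.0, 150.0]  # toy data
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]                                   # toy labels


def stack_features(models: Sequence[Callable[[float], float]],
                   x: float) -> List[float]:
    """The base models' predictions become the meta-learner's input features."""
    return [m(x) for m in models]


def fit_meta(features: List[List[float]], labels: List[int],
             lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit a logistic-regression meta-learner by batch gradient descent."""
    dim, n = len(features[0]), len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for f, y in zip(features, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias) - y
            for i, v in enumerate(f):
                grad_w[i] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(models: Sequence[Callable[[float], float]],
            weights: List[float], bias: float, x: float) -> float:
    """Final stacked prediction: meta-learner applied to base-model outputs."""
    f = stack_features(models, x)
    return sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)


if __name__ == "__main__":
    feats = [stack_features(BASE_MODELS, x) for x in HEART_RATES]
    weights, bias = fit_meta(feats, LABELS)
    for hr in (65.0, 145.0):
        print(hr, round(predict(BASE_MODELS, weights, bias, hr), 3))
```

In a federated setting, the base models would be trained on data held by different clients, and only their outputs or parameters would be used to fit the meta-learner; that division is assumed here rather than taken from the cited work.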