2. Artificial-Intelligence-Based Clinical Decision Support Systems in Primary Care
Clinical Decision Support Systems (CDSSs) aid physicians in tasks ranging from administrative automation and documentation to clinical management and patient safety
[5]. They become more advantageous when integrated with electronic health records (EHRs), as patients’ individual clinical profiles can be matched to the system’s knowledge base. This allows for customized recommendations and specific sets of administrative actions
[8].
Nevertheless, clinician satisfaction remains low due to several factors, such as excessive time consumption, workflow interruption, suboptimal EHR integration, irrelevant recommendations, and poor user-friendliness
[20][21]. A systematic review and meta-analysis by Meunier et al. found that many primary care physicians (PCPs) either perceived no need for CDSS assistance or disagreed with its recommendations
[11]. Additionally, CDSSs can disrupt physician workflow and increase cognitive load, so physicians spend more time completing tasks and less time with patients
[4]. Another significant concern is alert fatigue, which leads physicians to disregard up to 96% of the alerts offered by the CDSS, sometimes to the detriment of the patient’s well-being
[3][5][9].
As the prevalence of chronic conditions continues to rise, the demand for healthcare services and documentation also increases, generating ever larger volumes of data. This creates a vicious cycle: EHRs and CDSSs overload physicians, physicians enter incomplete, non-uniform data, and the result is physician burnout and poor patient management
[22][23]. In a study interviewing 1792 physicians (30% PCPs) about health information technology (HIT)-related burnout, 69.8% reported HIT-related stress, and 25.9% reported at least one symptom of burnout. Family medicine was the specialty with the highest prevalence of burnout symptoms and the third-highest prevalence of HIT-related stress
[24].
The overall burnout that primary care physicians face represents one of the most significant challenges in primary health care (PHC). Medication prescription errors are frequently reported among family physicians in the United States and other countries
[25]. On top of that, approximately 5% of adult patients in the US experience a diagnostic error in the outpatient setting every year, with 33% of these errors leading to severe permanent injury or immediate or inevitable death
[4].
In an attempt to diminish prescription errors, Herter et al.
[26] implemented a system that considered patients’ characteristics to increase the proportion of successful urinary tract infection (UTI) treatments and to avoid overmedication and the risk of resistance. It increased the treatment success rate by 8% and improved adherence to treatment guidelines. While not yet implemented in PHC, one study in Israel reported the use of a CDSS powered by machine learning (ML) that identifies and intercepts potential medication prescription errors based on the analysis of historical EHRs and the patient’s current clinical environment and temporal circumstances. This AI-CDSS reduced prescription errors without causing alert fatigue
[27].
The big data contained in EHRs may be a valuable resource for AI-CDSSs. Incorporating AI makes CDSSs more capable of clinical reasoning, as they can handle more information and approach it more holistically. With ML, algorithms can identify patterns, trends, and correlations in EHRs that may not be apparent to physicians
[15][16][19][28]. Likewise, they can learn from historical patient data to make predictions and recommendations for current patients
[29][30].
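As a purely illustrative sketch of this idea (not a description of any system cited here), the snippet below trains a standard scikit-learn classifier on synthetic stand-in "historical EHR" records and then scores a hypothetical current patient; all feature names, values, and the outcome label are invented for demonstration.

```python
# Minimal sketch: learning from historical EHR data to score current patients.
# Feature names, values, and the outcome label are illustrative assumptions only.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 2000

# Synthetic stand-in for historical EHR records.
historical = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "hba1c": rng.normal(6.5, 1.2, n),
    "systolic_bp": rng.normal(130, 18, n),
    "visits_last_year": rng.poisson(3, n),
})
# Synthetic outcome loosely correlated with the features.
risk = 0.03 * historical["age"] + 0.8 * historical["hba1c"] + 0.02 * historical["systolic_bp"]
historical["adverse_outcome"] = (risk + rng.normal(0, 2, n) > risk.mean()).astype(int)

X = historical.drop(columns="adverse_outcome")
y = historical["adverse_outcome"]
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print("held-out AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Score a "current" patient the way a CDSS might surface a risk estimate.
current_patient = pd.DataFrame([{"age": 67, "hba1c": 8.1, "systolic_bp": 142, "visits_last_year": 5}])
print("predicted risk:", model.predict_proba(current_patient)[0, 1])
```

In a real AI-CDSS, the features would come from the integrated EHR and the model would need prospective validation before any recommendation is shown to the physician.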
In the study conducted in China, the AI-CDSS was helpful for supporting physicians’ diagnoses and for counteracting bias when it disagreed with them. Additionally, it provided cases similar to the current patient and relevant literature in real time. Physicians perceived it as a tool for training their knowledge, facilitating information searches, and preventing adverse events
[31]. In Yao et al.
[32], the prediction capabilities of their AI-CDSS significantly increased the diagnosis of low ejection fraction within 90 days of the intervention. The intervention proved even more effective in the outpatient clinics.
With deep learning (DL), AI gives CDSSs the ability to offer personalized treatment recommendations based on a patient’s unique medical history, genetics, and treatment responses
[15][16][17][19][26][29][31][33][34][35]. Similarly, they can report abnormal test or clinical results in real time and suggest alternative treatment options
[29][31][32][36]. This immediacy can reduce the time needed to reach optimal treatment and increase the quality time physicians spend with their patients
[14][17].
One example identified is the AI-CDSS used in Seol et al.
[30], the Asthma-Guidance and Prediction System (A-GPS). Even though it did not show a significant difference in its primary objectives compared with the control, it reduced the time to follow-up care after asthma exacerbations and decreased healthcare costs. Additionally, it showed the potential to reduce clinicians’ burden by significantly reducing the median time for EHR review by 7.8 min.
When optimally developed, AI-CDSSs may be powerful tools in team-based care models, such as most PHC settings. They can assist physicians in delivering integrated services by organizing and ensuring that the entire patient-management process, from preventive care and coordination to full diagnostic workup, is effectively performed
[13][37]. In addition, they can automate the process of note writing, extracting relevant clinical information from previous encounters and assembling it into appropriate places in the note
[13][14][17]. This allows physicians to focus on human interaction, which is the hallmark of primary care.
With their AI-CDSS, physicians in Cruz et al. improved their adherence to clinical pathways in 8 of the 18 recommendations related to common diseases in PHC, 3 of which reached statistical significance
[36]. Moreover, in Romero-Brufau et al., physicians perceived that the use of their AI-CDSS helped increase patients’ preparedness to manage diabetes and helped coordinate care
[33].
Among the main findings is the scarcity of literature on AI-CDSSs in PHC in real clinical settings; moreover, not only the outcomes obtained but also the objectives of the studies were heterogeneous. Some focused on assessing the effectiveness of the systems
[26][30][32][36], while others focused on the physicians’ attitudes toward them
[31][33]. The effectiveness of the systems varied, with some proving to be more effective than their comparison group
[26][32], some proving to be only somewhat useful
[30][33][36], and others not being useful at all
[31].
CDSSs and EHRs represent a burden for many physicians, leading to negative prejudices and biases toward them
[4]. Additionally, there may be resistance and skepticism toward AI due to the increased workload that EHRs create
[17]. Furthermore, there is mistrust in AI and concerns that AI may replace physicians
[18][38].
Because of the latter, early research focuses on comparing and understanding physicians’ attitudes toward AI-CDSSs. This is the case of Romero-Brufau et al.
[33] and Wang et al.
[31]. In the former, the researchers found that physicians were less excited about AI and more likely to feel that AI did not understand their jobs, even after becoming familiar with it. Clinicians gave a median score of 11 on a 0–100 scale, where 0 indicated that the system was not helpful. Only 14% of the physicians would recommend the AI-CDSS to another clinic, and only 10% thought that the AI-CDSS should continue to be integrated into their clinic within the EHR. Thirty-four percent believed the system had the potential to be helpful. This could be because the physicians perceived the interventions recommended by the system as inadequate, not sufficiently personalized for each patient, or simply not useful
[33].
In the same way, Wang et al.
[31] reported that physicians felt the AI-CDSS “Brilliant Doctor” was not optimized for their local context, limiting or eliminating its use. Physicians reported that the confidence score of the diagnosis recommendations was too low, that alerts were not useful, that resource limitations were not considered, and that completing what the system asked in order to obtain recommendations took too long. These negative perceptions were not shared in N.P. Cruz et al.
[36] and Seol et al.
[30], where physicians were satisfied with the AI-CDSS.
Even though AI has demonstrated real improvement in several health fields, its general implementation faces some challenges. There are four major ethical challenges: informed consent for the use of personal data, safety and transparency, algorithmic fairness and bias, and data privacy
[34][39]. First, most common AI systems lack explainability, a problem known as the AI “black box.” This means that there is no way to be sure which elements lead the AI algorithm to its conclusion. This lack of explainability also represents a major legal concern and a reason why physicians distrust AI
[40]. There is no consensus on the extent to which patients should be informed about the AI that will be used, the biases it could have, or the risks it would pose, let alone about the inability to fully interpret the reasoning behind each recommendation.
Secondly, for AI algorithms to function appropriately, they must first be trained on an extensive dataset; for optimal training, at least 10 times as many samples as parameters in the network are needed. This is unfeasible for PHC because of data and dataset scarcity, as most settings do not have access to datasets of that size
[19][41]. On top of that, most healthcare organizations lack the data infrastructure to collect the data required to adequately train an algorithm tailored to the local population and practice patterns and to guarantee the absence of bias
[6][15][22][42].
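As a rough worked example of the "10 samples per parameter" heuristic mentioned above, the short sketch below counts the parameters of a small fully connected network; the layer sizes are arbitrary assumptions chosen only to make the arithmetic concrete.

```python
# Back-of-the-envelope check of the "at least 10 samples per parameter" heuristic
# for a small fully connected network; the layer sizes are illustrative assumptions.
layer_sizes = [50, 64, 32, 1]  # 50 EHR-derived inputs, two hidden layers, one output

params = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    params += n_in * n_out + n_out  # weights + biases per layer

print(f"trainable parameters: {params}")          # 5377
print(f"samples suggested (10x): {10 * params}")  # 53,770 labeled records
```

Even this toy network of roughly 5400 parameters would call for more than 50,000 labeled records under that rule of thumb, far more than a single practice typically generates.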
To address this problem, some ML models are trained on synthetic data, and others use datasets derived only from specific populations, leading to selection bias
[13][14][17][34][39][43]. The lack of real clinical backgrounds and racial diversity in such data leads to inaccurate recommendations, false diagnoses, ineffective treatments, the perpetuation of disparities, and even fatalities
Another phenomenon derived from data misalignment is dataset shift, in which systems underperform because of small changes between the data used for training and the actual population in which the algorithm is implemented
[44][45][46].
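One simple, generic way such a shift can be surfaced is to compare the distribution of an input feature in the original training data with the data observed at a new deployment site. The sketch below does this with a two-sample Kolmogorov–Smirnov test from SciPy; the data are synthetic and the significance threshold is an arbitrary assumption, not a recommendation from the cited studies.

```python
# Simple dataset-shift check: compare training-time and deployment-time feature
# distributions with a two-sample Kolmogorov-Smirnov test. Synthetic data only.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Age distribution the model was trained on vs. the population at a new site.
train_age = rng.normal(45, 12, 5000)
deploy_age = rng.normal(58, 14, 1200)  # older local population -> shifted input

stat, p_value = ks_2samp(train_age, deploy_age)
print(f"KS statistic={stat:.3f}, p={p_value:.3g}")
if p_value < 0.01:
    print("Warning: input distribution differs from training data; "
          "the model should be re-validated before relying on its output.")
```

When such a check flags a shift, the model's recommendations can no longer be assumed to hold for the local population.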
This raises questions about accountability
[16][34]. Who would be blamed in the case of an adverse event? Although forums and committees are currently trying to settle this issue, it remains unclear, which leaves AI developers free of responsibility, physicians uncomfortable using AI, and patients deprived of its potential benefits.
AI may have the capacity to enable equitable care across all populations, regardless of socioeconomic background. However, the cost of implementing these technologies is high, and most developing countries either do not have EHRs or have obsolete ones, hindering the implementation of efficient CDSSs
[4][11]. This may partly explain why the success of CDSSs in high-income countries has not translated to low-resource settings
[6]. This is reflected in the studies reviewed here, with five out of six AI-CDSSs being tested in high-income countries. Additionally, the AI used in the “Brilliant Doctor” CDSS was neither state-of-the-art nor optimally integrated into the local EHR, making it difficult to work with
[31].
Finally, the mistrust physicians and patients have towards AI is another critical challenge for its implementation
[18]. In a study analyzing physicians’ perceptions of AI, physicians felt it would make their jobs less satisfying, and almost everyone feared being replaced. They also believed AI would be unable to automate clinical reasoning because AI is too rigid, whereas clinical reasoning is fundamentally the opposite. There were several other concerns, such as the fear of unquestioningly following AI’s recommendations and the idea that AI would take control of their jobs
[38].
In another study, the main reason for patients’ resistance to AI was the belief that AI is too inflexible and would be unable to consider their individual characteristics or circumstances
[17]. There is also concern that increased interaction with the AI-CDSS would change the dynamics of the patient–provider relationship, rendering clinical practice less accurate
[14][47].
Recently, a vast effort has been put into the creation and implementation of explainable AI (XAI) models. These are described as “white-box” or “glass-box” models, which produce explainable results; however, they do not always achieve state-of-the-art performance due to the simplicity of their algorithms
[48][49]. To overcome this, there has been increasing interest in developing XAI techniques that make existing models interpretable. Interpretability techniques such as local interpretable model-agnostic explanations (LIME), Shapley additive explanations (SHAP), and Anchors can be applied to any “black-box” model to make its output more transparent
[48].
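For illustration only, the following sketch applies SHAP to a generic tree-based "black-box" classifier trained on synthetic data; it assumes the third-party `shap` package is installed, and the feature names and toy label are placeholders rather than anything drawn from the studies cited here.

```python
# Minimal post-hoc interpretability sketch using SHAP on a "black-box" tree model.
# Requires the third-party `shap` package; data and feature names are synthetic.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = pd.DataFrame({
    "age": rng.integers(18, 90, 500),
    "creatinine": rng.normal(1.0, 0.3, 500),
    "on_anticoagulant": rng.integers(0, 2, 500),
})
y = ((X["age"] > 65) & (X["on_anticoagulant"] == 1)).astype(int)  # toy label

model = RandomForestClassifier(random_state=0).fit(X, y)

# TreeExplainer yields per-feature contributions (SHAP values) for each prediction,
# which expose how much each input pushed the model toward its output.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:5])
print(shap_values)
```

Contributions like these can then be aggregated into summary plots or heatmap-style visualizations, which is the kind of display the next paragraph refers to.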
In healthcare, where the transparency of advice and therapeutic decisions is fundamental, approaches to explain the decisions of ML algorithms focus on visualizing the elements that contributed to each decision, such as heatmaps, which highlight the data that contributed the most to decision making
[49]. Although XAI is not yet a well-established field and few pipelines have been developed, the large volume of studies on interpretability methods highlights the benefits these models could bring to current AI utilization
[48][49].
Making AI models more transparent will not eradicate mistrust by itself, as issues such as accountability and personal beliefs remain unaddressed. AI implementation should be a collaborative effort between AI users, developers, legislators, the public, and disinterested third parties to ensure fairness
[50]. More emphasis on qualitative research testing the performance of AI systems would help reassure physicians that its use is backed by sound research and not merely by expert opinion. AI education is paramount for a thorough understanding of AI models and, with it, greater trust in using them. With this in mind, some medical schools are updating their curricula to include augmented medicine and improve digital health literacy
[16]. Furthermore, some guidelines imply that trust can be achieved through transparency, education, reliability, and accountability
[50].
The needs of both physicians and patients must be considered. According to Shortliffe and Sepulveda, there are six characteristics that an AI-CDSS must have to be accepted and integrated
[51]:
- There should be transparency in the logic of the recommendation.
- It should be time-efficient and able to blend into the workflow.
- It should be intuitive and easy to learn.
- It should understand the individual characteristics of the setting in which it is implemented.
- It should be made clear that it is designed to inform and assist, not to replace.
- It should have rigorous, peer-reviewed scientific evidence.
To address some validation concerns and ensure transparent reporting, Vasey et al. proposed the DECIDE-AI reporting guideline, which focuses on the evaluation stage of AI-CDSS
[44]. Additionally, there should be a specific contracting instrument to ensure that data sharing involves both the necessary protection and fair compensation for healthcare organizations and their patients
[22].
Co-development between developers and physicians is fundamental to achieving adequate satisfaction levels and establishing clear limitations for all parties
[43]. Moreover, physicians need to stop thinking of AI as a replacement and instead start thinking of it as a complement. In PHC, AI and AI-CDSSs could become pivotal points for improvement, especially since reportedly half of the care provided can be safely performed by non-physicians and nurses
[52]. Also, 77% of the time spent on preventive care and 47% of the time spent on chronic care could be delegated to non-physicians
[53]. With optimized AI-CDSSs, the time dedicated to healthcare could shift its focus from quantity to quality.