Classification of asthma phenotypes has a potentially relevant impact on the clinical management of the disease. Statistical classification methods that make no a priori assumptions (data-driven approaches) may improve the understanding of trait heterogeneity in disease phenotyping.
Asthma is a chronic inflammatory disease of the airways, characterized by at least partially reversible airway obstruction and bronchial hyper-responsiveness [1][2]. The Global Initiative for Asthma (GINA) currently defines asthma as a heterogeneous disease, with a history of respiratory symptoms that vary over time and in intensity, together with variable expiratory airflow limitation [2]. Because asthma is such a heterogeneous condition with complex pathophysiology, phenotypic classification is essential for investigating etiology and tailoring treatment [3].
Patients with asthma have been categorized into subgroups using theory- or data-driven approaches. In the classical theory-driven approach, patients with asthma are classified into categories defined a priori according to current knowledge (e.g., based on etiology, severity, and/or triggers) [4]. However, this approach generates asthma phenotypes that are not mutually exclusive and that may correlate poorly with therapeutic response and prognosis [5].
Several classes of data-driven algorithms have been applied to the problem of trait heterogeneity in disease phenotyping. The techniques most used to address phenotypic heterogeneity in health care data are distance-based (item-centered, e.g., clustering analysis) and model-based (patient-centered, e.g., latent class analysis) approaches, which are not mutually exclusive [6].
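As an illustration of the model-based family, the following is a minimal sketch of a two-class latent class analysis fitted by expectation-maximization on binary indicators. It is not the procedure used in any of the reviewed studies: the function name `fit_lca`, the four indicators, the eight toy subjects, and the starting values are all hypothetical and chosen only to make the mechanics visible.

```python
import math

def fit_lca(data, init_theta, n_iter=200, eps=1e-6):
    """Fit a latent class model for binary items by EM.

    data: list of 0/1 lists (one row per subject).
    init_theta: starting item-endorsement probabilities, one row per class
                (an assumption of this sketch; real software uses many random starts).
    Returns class sizes, item probabilities, and the class-membership
    probabilities from the final E-step.
    """
    n, m, k = len(data), len(data[0]), len(init_theta)
    theta = [row[:] for row in init_theta]
    pi = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each subject
        resp = []
        for x in data:
            joint = [
                pi[c] * math.prod(theta[c][j] if x[j] else 1 - theta[c][j]
                                  for j in range(m))
                for c in range(k)
            ]
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M-step: update class sizes and item probabilities
        for c in range(k):
            weight = sum(r[c] for r in resp)
            pi[c] = weight / n
            for j in range(m):
                p = sum(r[c] * x[j] for r, x in zip(resp, data)) / weight
                theta[c][j] = min(max(p, eps), 1 - eps)  # keep away from 0 and 1
    return pi, theta, resp

# Hypothetical indicators: wheeze, night symptoms, atopy, recent exacerbation
subjects = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1],  # many positives
    [0, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0], [1, 0, 0, 0],  # few positives
]
# Asymmetric start so class 1 begins as the "more positives" class
pi, theta, resp = fit_lca(subjects, init_theta=[[0.3] * 4, [0.7] * 4])
classes = [max(range(2), key=lambda c: r[c]) for r in resp]
```

Unlike a distance-based method, the model returns a membership probability for every class (`resp`), and a subject is assigned to the most probable class only as a final step.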
Distance-based approaches use the distances between observations in a data set to generate natural groupings of cases [3]. The most commonly used clustering analysis methods are hierarchical, partitioning (k-means or k-medoids), and two-step clustering, which can be roughly described as a combination of the first two. Hierarchical clustering analysis builds a hierarchy of groups that can be represented in a dendrogram, while partitioning methods divide the data into non-overlapping subsets, so that each subject is assigned to exactly one group [3].
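The two main distance-based families can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the reviewed studies: the function names, the farthest-point initialization of k-means, the choice of average linkage, and the six toy subjects (two standardized, made-up features each) are all hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, n_iter=100):
    """Partitioning: Lloyd's algorithm, each subject gets exactly one label."""
    # Deterministic farthest-point initialization (an assumption of this sketch)
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids)))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def agglomerative(points, n_clusters):
    """Hierarchical: repeatedly merge the two closest groups (average linkage).

    The returned merge history is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []

    def linkage(c1, c2):
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        a, b = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters, merges

# Made-up standardized values for six subjects (two features each)
subjects = [(-1.2, -1.0), (-1.0, -1.1), (-0.9, -0.8),
            (1.0, 1.1), (1.2, 0.9), (0.8, 1.0)]
labels, centroids = kmeans(subjects, k=2)
clusters, merges = agglomerative(subjects, n_clusters=2)
```

On this well-separated toy data both methods recover the same two groups. On real clinical data the result depends on variable selection, scaling, the distance measure, and the number of clusters requested.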
The included primary studies used a wide variety of methods for cluster analysis, with the most common method being hierarchical cluster analysis (n = 19), followed by k-means cluster analysis (n = 16) and two-step cluster analysis (n = 14). Latent class analysis was the most used model-based approach (n = 9) (Figure 1).
Figure 1. Data-driven method chosen for asthma phenotyping ordered by absolute frequency of use.
It was not possible to retrieve the variables used in two studies [7][8]. Among the remaining 66 studies, variables belonging to the lung function, clinical, and atopy domains were each used in more than half. Figure 2 shows the percentage of studies that used each domain of variables.
Figure 2. Proportion of each domain of variables in the 66 studies with retrievable chosen variables.
In hierarchical cluster analysis, the most frequent phenotype was atopic/allergic asthma. Atopic asthma was commonly associated with an early age of onset, while late-onset asthma was recurrently linked with severe disease. Atopic asthma was also the most frequent phenotype in two-step cluster analysis. In both k-means and k-medoids cluster analysis, severe asthma was the most frequent phenotype. Among model-based methods, latent class analysis studies mostly identified phenotypes related to symptoms. Factor analysis used severity of disease to classify asthma, while latent transition analysis used allergic status and symptoms. One study derived longitudinal trajectories of pulmonary function using latent mixture modeling.
This systematic review revealed a high degree of variability regarding the data-driven methods and variables applied in the models among the studies that identified data-driven asthma phenotypes in adults. There was a lack of consistency in the studies concerning the study setting, target population, choice of statistical method and variables, and ultimately, the label of the phenotype. Overall, the most frequent phenotypes were related to atopy, gender (female), and severe disease.
Among patients managed in primary care, three phenotypes were identified: “early-onset atopic asthma”, “obese, non-eosinophilic asthma”, and “benign asthma”. Among patients with refractory asthma managed in secondary care, four phenotypes were identified: “early-onset atopic asthma”, “obese, non-eosinophilic asthma”, “early-onset symptomatic asthma with minimal eosinophilic disease”, and “late-onset, eosinophilic asthma with few symptoms” [9]. These phenotypes persisted in later studies, with different variants [10][7][11][12].
Better characterization of asthma heterogeneity is an essential step in developing more personalized approaches to asthma management and therapy. Further population-based studies are needed that analyze the longitudinal consistency of data-driven phenotypes.
The most used dimensions were variables regarding personal, clinical, and functional data. However, other dimensions were used in several studies. For example, Lefaudeux et al. demonstrated that clustering based on clinicophysiologic parameters can produce stable and reproducible clusters [13]. Deccache et al. aimed to characterize treatment adherence with a multidimensional approach encompassing asthma control, attitude towards the disease, and compliance with treatment [14]. Finally, Labor et al. aimed to assess the association of specific asthma phenotypes with mood disorders. Five phenotypes were identified by cluster analysis of cross-sectional data in a sample of adult patients of a tertiary center: “allergic asthma”, “aspirin-exacerbated respiratory disease”, “late-onset asthma”, “obesity-associated asthma”, and “infection-associated asthma” [15].
In conclusion, data-driven methods are increasingly used to derive asthma phenotypes, and studies suggest that both clinical and statistical expertise are required. Further research should focus on population-based samples and on evaluating the longitudinal consistency of phenotypes.