Developing a prediction model for persistent airflow limitation in asthmatic children
Original Article

Shiqiu Xiong1,2, Xinyu Jia1, Wei Chen1, Chuanhe Liu1

1Department of Allergy, Center for Asthma Prevention and Lung Function Laboratory, Capital Center for Children’s Health, Capital Medical University, Beijing, China; 2Department of Pediatrics, Xi’an Children’s Hospital, Xi’an, China

Contributions: (I) Conception and design: S Xiong, C Liu; (II) Administrative support: C Liu; (III) Provision of study materials or patients: S Xiong, C Liu; (IV) Collection and assembly of data: S Xiong, W Chen, X Jia; (V) Data analysis and interpretation: S Xiong, W Chen, X Jia; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Chuanhe Liu, MD. Department of Allergy, Center for Asthma Prevention and Lung Function Laboratory, Capital Center for Children’s Health, Capital Medical University, No. 2 Yabao Road, Chaoyang District, Beijing 100020, China. Email: liuchcip@126.com.

Background: A small proportion of asthmatic children will develop persistent airflow limitation (PAL). The purpose of this study was to develop predictive models using classical and machine learning methods to identify asthmatic children at risk of PAL.

Methods: A total of 1,671 asthmatic children were enrolled between January 1, 2019, and December 31, 2020, to serve as training and internal validation sets. Temporal validation included 401 patients enrolled from January 1, 2021, to December 31, 2021. PAL was determined in the third year after enrollment and defined as a forced expiratory volume in 1 second (FEV1)/forced vital capacity (FVC) ratio below a fixed threshold of 0.75. Predictors included demographic and clinical data. Machine learning algorithms, including random forest (RF) and extreme gradient boost (XGBoost), along with classical logistic regression (LR), were used to develop prediction models. Discrimination was evaluated using the area under the curve (AUC), accuracy, sensitivity, and specificity, and calibration was assessed using calibration curves and Brier scores. Decision curve analysis was used to evaluate clinical value.

Results: In the internal validation, the RF model achieved an AUC of 0.857 (95% CI: 0.791–0.924), followed by LR with an AUC of 0.849 (95% CI: 0.780–0.908) and XGBoost with an AUC of 0.835 (95% CI: 0.761–0.909). In the temporal validation, the three prediction models exhibited similar performance. Specifically, RF attained an AUC of 0.853 (95% CI: 0.771–0.935), LR achieved an AUC of 0.836 (95% CI: 0.742–0.938), and XGBoost reached an AUC of 0.848 (95% CI: 0.757–0.940). The calibration curves and low Brier scores indicated good calibration of all prediction models, and decision curve analysis revealed desirable net benefits for all prediction models in both internal and temporal validation.

Conclusions: PAL in asthmatic children can be predicted with clinically meaningful accuracy using routinely available clinical data, and three prediction models (LR, RF, and XGBoost) demonstrated comparable performance in identifying high-risk patients.

Keywords: Persistent airflow limitation (PAL); asthma; children; prediction model; machine learning (ML)


Submitted Nov 22, 2023. Accepted for publication Aug 15, 2025. Published online Oct 29, 2025.

doi: 10.21037/jtd-23-1796


Highlight box

Key findings

• The prediction models incorporating demographic and clinical data achieved excellent performance in identifying asthmatic children at high risk of persistent airflow limitation.

What is known and what is new?

• A subset of asthmatic children will develop persistent airflow limitation, leading to poor prognosis.

• This study provided a clinical prediction model capable of accurately identifying children at a heightened risk of developing airflow limitation.

What is the implication, and what should change now?

• This research facilitates the early identification of asthmatic children susceptible to persistent airflow limitation development, offering valuable guidance for implementing proactive treatment strategies.


Introduction

Asthma is one of the most prevalent chronic respiratory diseases in children, and a defining characteristic of asthma is the presence of reversible airflow limitation (1). However, a small proportion of asthmatic children will develop persistent airflow limitation (PAL) (2). Research has shown that patients with PAL experience an accelerated decline in lung function (3). Furthermore, asthmatic children with persistently low lung function have a high risk of developing chronic obstructive pulmonary disease (COPD) in adulthood (4). Therefore, it is crucial to identify asthmatic patients at high risk of developing PAL early and to implement proactive treatment strategies. However, existing studies primarily focus on exploring characteristics and identifying risk factors for PAL without developing prediction models for clinical use (5,6).

Machine learning (ML), a branch of artificial intelligence, has garnered significant attention in the medical field due to its ability to process diverse types of data, such as laboratory data, images, invoice records, and free text, enabling applications in disease diagnosis, phenotype exploration, prognosis prediction, and clinical decision-making (7,8). Among the various ML approaches, supervised learning is the most commonly employed in medical research. This paradigm aims to develop predictive models that accurately map input features to corresponding outputs, enabling reliable predictions for both unlabeled samples and new observations (9).

In the field of asthma, numerous predictive models have been developed to identify individuals at high risk of asthma diagnosis, exacerbations, or poor outcomes (10,11). By leveraging various ML algorithms (such as boosting methods, random forest (RF), and neural networks), these models can integrate a broader spectrum of clinical, demographic, and even omics data, thereby enhancing predictive accuracy (12-14). However, ML-based prediction models specifically targeting PAL are currently lacking. In this study, we applied ML methods, including RF and extreme gradient boost (XGBoost), along with the classical logistic regression (LR) method, to develop clinical prediction models that identify asthmatic children at high risk of developing PAL. Furthermore, we investigated whether these ML-based prediction models outperformed the traditional LR model. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-23-1796/rc).


Methods

Study design

This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This retrospective study was approved by the Ethics Committee of Capital Center for Children’s Health, Capital Medical University (Approval No. SHERLL 2014040), and individual consent for this retrospective analysis was waived. The study included asthmatic children who visited the outpatient clinic of the Department of Allergy between January 1, 2019, and December 31, 2020, serving as the development cohort for training and internal validation of the prediction model. After a two-year follow-up period, the PAL status of these asthmatic patients was assessed. Asthmatic children who visited our clinic between January 1, 2021, and December 31, 2021, were enrolled for temporal validation to evaluate the generalizability of the prediction model.

Participants

The diagnosis of asthma was made in accordance with the guidelines provided by the Global Initiative for Asthma (GINA) (1). We included patients aged between 5 and 16 years who had visited our clinic for asthma on at least three occasions. Asthmatic children with incomplete medication data, missing spirometry records, or those diagnosed with other conditions such as chronic respiratory disease other than asthma, cardiovascular diseases, autoimmune disorders, or mental health issues were excluded from the cohort.

Predictor collection and definition

We collected a variety of predictors primarily from the electronic medical records of the study participants. These predictors included (I) demographic data: age, gender, and body mass index (BMI); (II) clinical characteristics: asthma severity, history of asthma exacerbation and pneumonia, atopy status, and allergic comorbidities; (III) laboratory tests: blood neutrophils and eosinophils, total serum immunoglobulin E (IgE) level, specific IgE values, skin prick test, fraction of exhaled nitric oxide (FeNO); (IV) baseline spirometry measurements (performed at the enrollment): forced expiratory volume in 1 second (FEV1)/forced vital capacity (FVC), percentage of predicted FEV1 (FEV1%pred), percentage of predicted FVC (FVC%pred), percentage of predicted peak expiratory flow (PEF%pred), and percentage of predicted maximal mid-expiratory flow (MMEF%pred); and (V) asthma medication use: daily dose of inhaled corticosteroids (ICS), systemic steroid use, medication adherence, and immunotherapy.

Asthma severity was determined based on the daily dose of ICS following the GINA guidelines (1). Asthma exacerbation was defined according to the criteria recommended in the Official American Thoracic Society/European Respiratory Society Statement (15). Atopy status was defined as having either a positive Phadiatop test (≥0.35 KU/L) or a positive skin prick test for airborne allergens. A more detailed description of predictors is provided in Table S1.

Outcome definition

The definition of PAL in asthma lacks consensus. In COPD, persistent airflow limitation is defined as a post-bronchodilator (BD) FEV1/FVC below 0.70 (16). Alternatively, some define it as an FEV1/FVC below the lower limit of normal after optimal treatment. However, a fixed FEV1/FVC threshold of 0.75 has shown improved sensitivity and specificity in asthma patients (17). In this study, PAL status was assessed in the third year after enrollment and defined as FEV1/FVC <0.75 in all completed spirometry measurements conducted within that one-year period. Within PAL, we further distinguished irreversible PAL (IPAL) from reversible PAL (RPAL). IPAL was characterized by FEV1/FVC <0.75 for both pre-BD and post-BD measurements, whereas RPAL was characterized by FEV1/FVC <0.75 for the pre-BD measurement but ≥0.75 for the post-BD measurement (5).
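The outcome rule above can be illustrated with a minimal Python sketch. It is not the authors' code. The function name, the representation of each visit as a (pre-BD, post-BD) pair of ratios on a 0 to 1 scale, and the way several visits are combined (PAL requires every pre-BD ratio to be below 0.75; IPAL requires every available post-BD ratio to be below 0.75 as well) are assumptions made for illustration only.

```python
from typing import Optional, Sequence, Tuple

PAL_THRESHOLD = 0.75  # fixed FEV1/FVC cut-off stated in the text (ratio, not percent)

# One visit = (pre-BD FEV1/FVC, post-BD FEV1/FVC or None if no BD test was done).
Visit = Tuple[float, Optional[float]]


def classify_pal(visits: Sequence[Visit]) -> str:
    """Return "NPAL", "IPAL", "RPAL", or "PAL-unclassified" for one child.

    Assumption of this sketch: "all completed measurements" refers to the
    pre-BD ratios recorded during the third year after enrollment.
    """
    if not visits:
        raise ValueError("at least one spirometry visit is required")

    # Any pre-BD ratio at or above the threshold means the limitation is not persistent.
    if any(pre >= PAL_THRESHOLD for pre, _ in visits):
        return "NPAL"

    post_values = [post for _, post in visits if post is not None]
    if not post_values:
        # Hypothetical label: the text does not say how this case was handled.
        return "PAL-unclassified"

    # Irreversible: still below the threshold after bronchodilation.
    if all(post < PAL_THRESHOLD for post in post_values):
        return "IPAL"
    return "RPAL"
```

For example, a child whose only third-year visit shows a pre-BD ratio of 0.72 and a post-BD ratio of 0.78 would be labeled RPAL under these assumptions.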

Prediction model development

We employed a stratified random sampling method to partition the dataset into two groups with a ratio of 7:3, designating the former as the training set and the latter as the internal validation set. Prediction models were constructed using LR, RF, and XGBoost, respectively. Initially, predictors with a missing rate exceeding 25% were excluded from the analysis. Subsequently, multiple imputation techniques were applied to address missing values in the remaining predictors. For the LR model, feature selection was carried out using stepwise regression based on Bayesian information criterion values. We did not perform feature selection for the RF and XGBoost models, as these algorithms are adept at handling diverse data types and can effectively manage high-dimensional predictor sets (18).
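As a rough illustration of the first two preprocessing steps described above, the following Python sketch filters predictors by missing rate and performs a stratified 7:3 split. The study itself used R, so this is not the authors' implementation; the function names, the use of `None` to mark missing values, and the seed are hypothetical. Multiple imputation and stepwise selection are not shown.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple


def drop_sparse_predictors(
    rows: Sequence[Dict[str, Optional[float]]], max_missing: float = 0.25
) -> List[str]:
    """Return the predictor names whose missing rate does not exceed max_missing."""
    kept = []
    for name in rows[0].keys():
        missing_rate = sum(1 for r in rows if r[name] is None) / len(rows)
        if missing_rate <= max_missing:
            kept.append(name)
    return kept


def stratified_split(
    labels: Sequence[int], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split row indices so each class keeps roughly the same proportion in both sets."""
    rng = random.Random(seed)
    train_idx: List[int] = []
    valid_idx: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)  # random assignment within the class
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)
```

Splitting within each class matters here because PAL is rare: an unstratified split could leave the validation set with too few PAL cases to estimate sensitivity reliably.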

To address the class imbalance in the training dataset due to the low incidence of PAL, we applied multiple resampling techniques (oversampling, undersampling, and mixed sampling approaches) exclusively to the training set while maintaining original distributions in validation sets for unbiased evaluation. The model developed using the oversampled dataset demonstrated superior performance and was selected as the final prediction model. To prevent overfitting, hyperparameter tuning was conducted through five-fold cross-validation on the training dataset.
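The oversampling step can be sketched as follows. The text does not state which oversampling algorithm was used, so plain random oversampling with replacement is assumed here; the function name and seed are hypothetical. In line with the description above, it should be applied to the training rows only, never to the validation sets.

```python
import random
from typing import Dict, List, Sequence, Tuple


def random_oversample(
    X: Sequence[Sequence[float]], y: Sequence[int], seed: int = 0
) -> Tuple[List[Sequence[float]], List[int]]:
    """Duplicate minority-class rows at random until every class matches the largest one."""
    rng = random.Random(seed)

    # Group the training rows by class label.
    by_class: Dict[int, List[Sequence[float]]] = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)

    target = max(len(rows) for rows in by_class.values())

    X_out: List[Sequence[float]] = []
    y_out: List[int] = []
    for label, rows in by_class.items():
        # Sample with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out
```

Because the duplicated rows are exact copies, cross-validation folds drawn after oversampling can contain the same child in both the fitting and the held-out part, which is one way oversampling may inflate apparent performance.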

Performance evaluation

Receiver operating characteristic (ROC) curves were generated, and the corresponding area under the ROC curve (AUC) was calculated. The classification threshold for predicted probabilities was set at the value that maximized the Youden index, and accuracy, sensitivity, and specificity were computed at this threshold. For temporal validation, we enrolled a separate cohort between January 1, 2021, and December 31, 2021, maintaining consistency in outcome definitions and predictor variables with the training dataset. Performance parameters were calculated using bootstrap sampling from this cohort. Calibration curves were generated to evaluate the agreement between predicted probabilities and observed outcomes. The accuracy of probabilistic prediction was quantified using the Brier score. Furthermore, decision curve analysis (DCA) was conducted to estimate the clinical value of each model.
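For readers less familiar with these metrics, the sketch below computes each of them from a vector of true labels (1 = PAL) and predicted probabilities using the standard textbook definitions: AUC as the probability that a random positive case receives a higher score than a random negative case, the Youden index J = sensitivity + specificity - 1, the Brier score as the mean squared difference between predicted probability and outcome, and the net benefit used in decision curve analysis. It is written in Python for illustration and is not the authors' R code; all function names are hypothetical, and both classes are assumed to be present.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) when predicting positive for p >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def youden_threshold(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Return the candidate threshold with the highest Youden index (first one on ties)."""
    best_t, best_j = min(y_prob), -1.0
    for t in sorted(set(y_prob)):
        tp, fp, tn, fn = confusion(y_true, y_prob, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error of the predicted probabilities (0 is perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit at one threshold probability: TP/n - FP/n * pt / (1 - pt)."""
    tp, fp, _, _ = confusion(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of threshold probabilities and comparing the model against the "treat all" and "treat none" strategies.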

Permutation importance is a technique used to calculate feature importance by measuring the performance drop when a single feature is randomly shuffled. In this study, we applied permutation importance to estimate the importance of predictors using the DALEX package in R (19). All statistical analyses and data visualization were performed using R (Version 4.0.2) and Prism (Version 9.5.1).
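The idea behind permutation importance can be sketched in a few lines of Python. This is a generic illustration, not the DALEX API and not the authors' code: the function names are hypothetical, the model is represented as any callable that maps rows to risk scores, and the performance measure is assumed to be the AUC.

```python
import random
from typing import Callable, List, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(
    predict: Callable[[List[List[float]]], List[float]],
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean drop in AUC after shuffling each feature column, one feature at a time."""
    rng = random.Random(seed)
    baseline = auc(y, predict([list(row) for row in X]))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = []
            for row, value in zip(X, column):
                new_row = list(row)
                new_row[j] = value
                X_perm.append(new_row)
            drops.append(baseline - auc(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model ignores gets an importance of exactly zero, because shuffling it leaves every prediction unchanged, whereas shuffling a feature the model relies on lowers the AUC.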

Statistical analysis

Continuous variables were expressed as either the mean ± standard deviation (SD) or the median with interquartile range (IQR), depending on the data distribution. Categorical variables were presented as frequencies and percentages. To compare continuous variables between two independent groups, either the Student’s t-test or the Mann-Whitney U test was applied, based on the distribution of the data. Categorical variables were analyzed using the chi-squared test or Fisher’s exact test, as appropriate. A P value of less than 0.05 was considered statistically significant. All statistical analyses were conducted using R software (Version 4.0.2).


Results

Characteristics of participants

Our cohort comprised 1,671 eligible participants, with 6.0% (n=101) developing PAL. Of these 101 PAL patients, 59 had IPAL, and 42 had RPAL. As presented in Table 1, patients with PAL had a lower percentage of females than non-PAL (NPAL) patients (16.8% vs. 29.7%, P=0.008). Patients with PAL were also older [9.0 (7.0, 11.0) vs. 7.0 (5.0, 9.0) years, P<0.001] and had a higher BMI [19.33 (15.80, 21.99) vs. 16.52 (15.02, 19.39) kg/m2, P<0.001] than NPAL patients. Significant differences were also observed in asthma onset, asthma exacerbation, blood neutrophil percentage, FeNO level, and all baseline spirometry measurements (Table 1).

Table 1

Comparison of characteristics of participants with or without persistent airflow limitation

| Characteristics | NPAL: total, n | NPAL: value | PAL: total, n | PAL: value | P |
|---|---|---|---|---|---|
| IPAL/RPAL | 1,570 | | 101 | 42/59 | |
| Female, n (%) | 1,570 | 466 (29.7) | 101 | 17 (16.8) | 0.008 |
| Age (years), median (IQR) | 1,570 | 7.0 (5.0, 9.0) | 101 | 9.0 (7.0, 11.0) | <0.001 |
| BMI (kg/m2), median (IQR) | 1,525 | 16.52 (15.02, 19.39) | 96 | 19.33 (15.80, 21.99) | <0.001 |
| Onset, n (%) | 1,570 | 508 (32.4) | 101 | 20 (19.8) | 0.01 |
| Severe asthma, n (%) | 1,570 | 82 (5.2) | 101 | 5 (5.0) | >0.99 |
| Poor adherence, n (%) | 1,570 | 608 (38.7) | 101 | 43 (42.6) | 0.50 |
| Systemic steroid use, n (%) | 1,570 | 112 (7.1) | 101 | 11 (10.9) | 0.22 |
| Exacerbation, n (%) | 1,570 | 159 (10.1) | 101 | 17 (16.8) | 0.049 |
| Allergic rhinitis, n (%) | 1,570 | 1,546 (98.5) | 101 | 99 (98.0) | 0.66 |
| Allergic conjunctivitis, n (%) | 1,570 | 200 (12.7) | 101 | 10 (9.9) | 0.49 |
| Food allergy, n (%) | 1,570 | 25 (1.6) | 101 | 1 (0.1) | >0.99 |
| Eczema or dermatitis, n (%) | 1,570 | 379 (24.1) | 101 | 16 (15.8) | 0.07 |
| Atopy, n (%) | 1,570 | 191 (13.5) | 101 | 76 (89.4) | 0.54 |
| Pneumonia, n (%) | 1,570 | 93 (5.9) | 101 | 10 (9.9) | 0.16 |
| Neutrophil percentage, median (IQR) | 1,265 | 0.47 (0.40, 0.54) | 75 | 0.51 (0.44, 0.58) | 0.002 |
| Eosinophil percentage, median (IQR) | 1,265 | 0.048 (0.026, 0.070) | 75 | 0.050 (0.024, 0.080) | 0.82 |
| FeNO (ppb), median (IQR) | 1,455 | 18.0 (10.0, 35.0) | 90 | 28.0 (13.2, 50.0) | <0.001 |
| Total IgE (KU/L), median (IQR) | 1,013 | 228.0 (108.0, 489.0) | 53 | 273.0 (187.0, 473.0) | 0.18 |
| FEV1%pred, median (IQR) | 1,506 | 97.90 (89.60, 106.38) | 96 | 94.05 (86.15, 102.62) | 0.002 |
| FVC%pred, median (IQR) | 1,506 | 96.8 (88.5, 104.88) | 96 | 101.8 (92.1, 109.4) | 0.002 |
| FEV1/FVC (%), median (IQR) | 1,506 | 85.86 (81.74, 90.13) | 96 | 77.50 (74.35, 81.06) | <0.001 |
| PEF%pred, median (IQR) | 1,506 | 95.00 (85.30, 104.70) | 96 | 88.1 (78.83, 100.95) | <0.001 |
| MMEF%pred, median (IQR) | 1,506 | 80.38 (65.90, 92.88) | 96 | 62.10 (52.23, 71.15) | <0.001 |

BMI, body mass index; FeNO, fraction of exhaled nitric oxide; FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity; IPAL, irreversible persistent airflow limitation; IQR, interquartile range; MMEF%pred, percentage of predicted maximal mid-expiratory flow; NPAL, non-persistent airflow limitation; PAL, persistent airflow limitation; PEF%pred, percentage of predicted peak expiratory flow; RPAL, reversible persistent airflow limitation.

Performance of prediction models

The AUC value of the RF model on the internal validation dataset was 0.857 (95% CI: 0.791–0.924), accompanied by an accuracy of 0.730 (95% CI: 0.691–0.768), a sensitivity of 0.723 (95% CI: 0.682–0.761), and a specificity of 0.875 (95% CI: 0.714–1.000). The LR model achieved an AUC of 0.849 (95% CI: 0.780–0.908), an accuracy of 0.732 (95% CI: 0.695–0.774), a sensitivity of 0.727 (95% CI: 0.687–0.769), and a specificity of 0.833 (95% CI: 0.682–0.964). XGBoost demonstrated an AUC of 0.835 (95% CI: 0.761–0.909), an accuracy of 0.631 (95% CI: 0.587–0.671), a sensitivity of 0.618 (95% CI: 0.576–0.660), and a specificity of 0.875 (95% CI: 0.714–1.000) (Table 2, Figure S1).

Table 2

Performance of the prediction models on the internal validation dataset

| Model | AUC (95% CI) | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|
| LR | 0.849 (0.780, 0.908) | 0.732 (0.695, 0.774) | 0.727 (0.687, 0.769) | 0.833 (0.682, 0.964) |
| RF | 0.857 (0.791, 0.924) | 0.730 (0.691, 0.768) | 0.723 (0.682, 0.761) | 0.875 (0.714, 1.000) |
| XGBoost | 0.835 (0.761, 0.909) | 0.631 (0.587, 0.671) | 0.618 (0.576, 0.660) | 0.875 (0.714, 1.000) |

AUC, area under the curve; CI, confidence interval; LR, logistic regression; RF, random forest; XGBoost, extreme gradient boost.

The calibration curves (Figure 1) for the three models showed agreement between predicted probabilities and the observed proportion of positive cases (children with PAL). Furthermore, all three models attained low Brier scores, indicating accurate probabilistic predictions. DCA for the three models revealed that they could achieve a high net benefit, suggesting desirable clinical value (Figure S1). Overall, the three prediction models based on different algorithms demonstrated similar performance.

Figure 1 The calibration curves of prediction models on the internal validation dataset. (A) Logistic regression. (B) Random forest. (C) Extreme gradient boost.

Feature importance

The feature importance is presented in Figure 2. Among the top ten features, baseline FEV1/FVC emerged as the most important, followed by MMEF%pred, age, FeNO level, PEF%pred, FVC%pred, BMI, blood neutrophil percentage, FEV1%pred, and asthma onset. The seven most important features are illustrated in Figure S2. Baseline FEV1/FVC and MMEF%pred were negatively correlated with PAL development. Asthmatic children older than 11 years and those with a BMI exceeding 19 kg/m2 were more prone to PAL development. Although FeNO level, PEF%pred, and FVC%pred did not show a direct correlation with PAL development, patients with FeNO levels above 30 ppb and PEF%pred below 80% had a higher likelihood of developing PAL.

Figure 2 Importance of predictors for the risk of developing persistent airflow limitation. BMI, body mass index; FeNO, fraction of exhaled nitric oxide; FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity; MMEF, maximal mid-expiratory flow; PEF, peak expiratory flow.

Temporal validation of prediction models

We further assessed the generalization ability of the three models through temporal validation. The dataset used for temporal validation contained 401 asthmatic children, and only 4.49% of patients (n=18) developed PAL. The characteristics of these patients are presented in Table S2.

As illustrated in Figure 3, the RF model achieved an AUC of 0.853 (95% CI: 0.771–0.935). The XGBoost model followed closely with an AUC of 0.848 (95% CI: 0.757–0.940), and the LR model had an AUC of 0.836 (95% CI: 0.742–0.938). The calibration curves and low Brier scores indicated good calibration of all three models (Figure S3). Furthermore, the DCA for the prediction models showed favorable net benefits (Figure S4).

Figure 3 The ROC curves of the three models on the temporal validation dataset. CI, confidence interval; LR, logistic regression; RF, random forest; ROC, receiver operating characteristic; XGBoost, extreme gradient boost.

Discussion

Our study demonstrated that PAL in asthmatic children can be effectively predicted using clinical prediction models. All three models (RF, XGBoost, and LR) exhibited comparable discriminatory power during internal validation, and their consistent performance in temporal validation highlights the robustness and generalizability of the prediction models. Moreover, baseline spirometry measurements and demographic data (age and BMI) were identified as important predictors.

ML has been employed in the field of medicine since the 1970s. In recent years, its popularity has surged due to advancements in computing power and its capability to process vast and complex datasets (20). In the domain of asthma research, ML has also been widely employed. For example, ML-based prediction models have utilized environmental data, medical records, as well as genomic or microbiome data from early life to predict the development of asthma (21-23). Researchers have also effectively utilized natural language processing techniques on electronic health and medical records to diagnose asthma accurately (24). Furthermore, ML has played a crucial role in the development of tools for asthma monitoring and predicting asthma exacerbations (25,26). Although ML has been extensively applied in various aspects of asthma research, limited studies have focused on utilizing ML to identify asthmatic children at high risk of developing PAL.

In this study, demographic and clinical data were used as predictors, and classical LR, along with ML algorithms (RF and XGBoost), was employed to develop clinical prediction models. All models demonstrated strong discrimination ability, with an AUC range of 0.835–0.857 in internal validation and 0.836–0.853 in temporal validation. This enables early detection of asthmatic children at risk of PAL development. Proactive treatment strategies, such as omalizumab or dupilumab, may enhance lung function and reduce asthma exacerbations, which are a consistently demonstrated predictor of PAL progression (27-29). Future studies should investigate whether implementing these proactive treatment strategies in identified high-risk asthmatic children prevents PAL development.

An ongoing debate surrounds whether ML algorithms surpass classical methods. Recent systematic reviews have suggested that ML algorithms consistently outperform classical regression models, such as LR (30,31). Our previous systematic review of asthma exacerbation prediction models similarly highlighted the superior performance of boosting models compared to LR (13). However, Christodoulou’s systematic review of clinical prediction models found no conclusive evidence supporting ML’s superiority over LR (32). Similarly, a study by Gravesteijn, focusing on traumatic brain injury prognosis, revealed that ML did not outperform LR (33). In many instances, the predictive performance of ML and LR is comparable, especially when using structured clinical data and when datasets are not extremely large or complex (34,35). In this study, we also found that the RF and XGBoost algorithms did not significantly outperform LR.

In this study, important features were also identified. Several baseline spirometry parameters, including FEV1/FVC and MMEF%pred, were negatively related to the likelihood of PAL development, suggesting that PAL develops through a protracted chronic process, possibly evolving over several years. Sousa’s prospective study on asthmatic children highlighted age and BMI as factors associated with PAL (36). Consistent with this, our study revealed that older age and higher BMI were linked to an increased likelihood of PAL development. While prior studies have consistently identified asthma exacerbations as a significant predictor of PAL (2,29), and our univariate comparison similarly showed higher exacerbation rates in the PAL group, this factor did not emerge as an important predictor in our models. This discrepancy could be attributed to developing the models on an oversampled dataset, which may have altered the distribution of patients with asthma exacerbation. These insights collectively enhance our understanding of PAL development and underscore the importance of considering specific demographic and clinical factors in predictive modeling.

The key strength of our study lies in the use of readily accessible clinical predictors, such as spirometry parameters and demographic characteristics, which enhances the practicality of our model for implementation in resource-limited settings. Through permutation importance analysis, we addressed the “black-box” nature of ML, identifying FEV1/FVC, MMEF%pred, and age as the most influential predictors, thus making the model more interpretable. Our model demonstrated excellent discrimination and calibration, empowering clinicians to identify high-risk children early and initiate personalized interventions. However, several limitations of our study should be acknowledged. First, the data were markedly imbalanced due to the low prevalence of PAL, which may have influenced the performance of the binary prediction model. Although we employed class rebalancing techniques, caution is warranted because oversampling can lead to overfitting and biased results. Second, our validation was limited to temporal validation, which assesses the models’ generalization ability in a later cohort from the same center. Additional geographic validation would enhance the robustness of our findings by evaluating the models’ performance in different populations or settings. To facilitate clinical translation, future work will focus on developing user-friendly web-based tools or software incorporating these prediction models to broaden their clinical applicability.


Conclusions

PAL in asthmatic children can be predicted with clinically meaningful accuracy using routinely available clinical data, and three prediction models (LR, RF, and XGBoost) demonstrated comparable performance in identifying high-risk patients. Future research should focus on geographical validation to further confirm the reliability of our prediction models. Exploring the impact of proactive interventions on reducing PAL development in high-risk asthmatic children is also warranted.


Acknowledgments

We would like to thank all the clinical technicians who conducted the spirometry measurements and skin prick tests.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-23-1796/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-23-1796/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-23-1796/prf

Funding: The study was supported by the Capital Health Research and Development of Special (2022-1G-4241).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-23-1796/coif). All authors report that this study was supported by the Capital Health Research and Development of Special (2022-1G-4241). The authors have no other conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This study was approved by the Ethics Committee of the Capital Center for Children’s Health, Capital Medical University, Beijing, China (Approval No. SHERLL 2014040) and individual consent for this retrospective analysis was waived.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Global Initiative for Asthma. Global Strategy for Asthma Management and Prevention, 2024. Updated May 2024. Available online: https://ginasthma.org/
  2. Xiong S, Tian C, Shao M, et al. Persistent Airflow Limitation Prediction and Risk Factor Analysis Among Asthmatic Children: A Retrospective Cohort Study. Pediatr Pulmonol 2025;60:e27381.
  3. Contoli M, Baraldo S, Marku B, et al. Fixed airflow obstruction due to asthma or chronic obstructive pulmonary disease: 5-year follow-up. J Allergy Clin Immunol 2010;125:830-7.
  4. Qin C, Gao J, Sang X, et al. Childhood respiratory risk profiles associate with lung function and COPD among the old population. Ann Med 2025;57:2470954.
  5. Wang G, Kull I, Bergström A, et al. Early-life risk factors for reversible and irreversible airflow limitation in young adults: findings from the BAMSE birth cohort. Thorax 2021;76:503-7.
  6. Xiong SQ, Tian CY, Chen W, et al. Clinical characteristics analysis of persistent airflow limitation in asthmatic children. Zhonghua Jie He He Hu Xi Za Zhi 2024;47:807-14.
  7. Asif S, Wenhui Y. Advancements and Prospects of Machine Learning in Medical Diagnostics: Unveiling the Future of Diagnostic Precision. Arch Computat Methods Eng 2025;32:853-83.
  8. An Q, Rahman S, Zhou J, et al. A Comprehensive Review on Machine Learning in Healthcare Industry: Classification, Restrictions, Opportunities and Challenges. Sensors (Basel) 2023;23:4178.
  9. Uddin S, Khan A, Hossain ME, et al. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak 2019;19:281.
  10. Votto M, De Silvestri A, Postiglione L, et al. Predicting paediatric asthma exacerbations with machine learning: a systematic review with meta-analysis. Eur Respir Rev 2024;33:240118.
  11. Louis G, Schleich F, Guillaume M, et al. Development and validation of a predictive model combining patient-reported outcome measures, spirometry and exhaled nitric oxide fraction for asthma diagnosis. ERJ Open Res 2023;9:00451-2022.
  12. Turcatel G, Xiao Y, Caveney S, et al. Predicting Asthma Exacerbations Using Machine Learning Models. Adv Ther 2025;42:362-74.
  13. Xiong S, Chen W, Jia X, et al. Machine learning for prediction of asthma exacerbations among asthmatic patients: a systematic review and meta-analysis. BMC Pulm Med 2023;23:278.
  14. Wei K, Qian F, Li Y, et al. Integrating multi-omics data of childhood asthma using a deep association model. Fundam Res 2024;4:738-51.
  15. Reddel HK, Taylor DR, Bateman ED, et al. An official American Thoracic Society/European Respiratory Society statement: asthma control and exacerbations: standardizing endpoints for clinical asthma trials and clinical practice. Am J Respir Crit Care Med 2009;180:59-99.
  16. Global Strategy for Prevention, Diagnosis and Management of COPD, 2024. Updated December 2023. Available online: https://goldcopd.org/2024-gold-report/
  17. Cerveri I, Corsico AG, Accordini S, et al. Underestimation of airflow obstruction among young adults using FEV1/FVC <70% as a fixed cut-off: a longitudinal evaluation of clinical and functional outcomes. Thorax 2008;63:1040-5. [Crossref] [PubMed]
  18. Hu J, Szymczak S. A review on longitudinal data analysis with random forest. Brief Bioinform 2023;24:bbad002. [Crossref] [PubMed]
  19. Biecek P. DALEX: Explainers for Complex Predictive Models in R. Journal of Machine Learning Research 2018;19:1-5.
  20. Handelman GS, Kok HK, Chandra RV, et al. eDoctor: machine learning and the future of medicine. J Intern Med 2018;284:603-19. [Crossref] [PubMed]
  21. He P, Moraes TJ, Dai D, et al. Early prediction of pediatric asthma in the Canadian Healthy Infant Longitudinal Development (CHILD) birth cohort using machine learning. Pediatr Res 2024;95:1818-25. [Crossref] [PubMed]
  22. Dessie EY, Gautam Y, Ding L, et al. Development and validation of asthma risk prediction models using co-expression gene modules and machine learning methods. Sci Rep 2023;13:11279. [Crossref] [PubMed]
  23. Wang XW, Wang T, Schaub DP, et al. Benchmarking omics-based prediction of asthma development in children. Respir Res 2023;24:63. [Crossref] [PubMed]
  24. Wu ST, Sohn S, Ravikumar KE, et al. Automated chart review for asthma cohort identification using natural language processing: an exploratory study. Ann Allergy Asthma Immunol 2013;111:364-9. [Crossref] [PubMed]
  25. Zhou C, Shuai L, Hu H, et al. Applications of machine learning approaches for pediatric asthma exacerbation management: a systematic review. BMC Med Inform Decis Mak 2025;25:170. [Crossref] [PubMed]
  26. Bhat GS, Shankar N, Kim D, et al. Machine Learning-Based Asthma Risk Prediction Using IoT and Smartphone Applications. IEEE Access 2021;9:118708-15.
  27. Bousquet J, Humbert M, Gibson PG, et al. Real-World Effectiveness of Omalizumab in Severe Allergic Asthma: A Meta-Analysis of Observational Studies. J Allergy Clin Immunol Pract 2021;9:2702-14. [Crossref] [PubMed]
  28. Hanania NA, Castro M, Bateman E, et al. Efficacy of dupilumab in patients with moderate-to-severe asthma and persistent airflow obstruction. Ann Allergy Asthma Immunol 2023;130:206-214.e2. [Crossref] [PubMed]
  29. Mindus S, Gislason T, Benediktsdottir B, et al. Respiratory symptoms, exacerbations and sleep disturbances are more common among participants with asthma and chronic airflow limitation: an epidemiological study in Estonia, Iceland and Sweden. BMJ Open Respir Res 2024;11:e002063. [Crossref] [PubMed]
  30. Tiruneh SA, Vu TTT, Rolnik DL, et al. Machine Learning Algorithms Versus Classical Regression Models in Pre-Eclampsia Prediction: A Systematic Review. Curr Hypertens Rep 2024;26:309-23. [Crossref] [PubMed]
  31. Javanmard Z, Zarean Shahraki S, Safari K, et al. Artificial intelligence in breast cancer survival prediction: a comprehensive systematic review and meta-analysis. Front Oncol 2024;14:1420328. [Crossref] [PubMed]
  32. Christodoulou E, Ma J, Collins GS, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol 2019;110:12-22. [Crossref] [PubMed]
  33. Gravesteijn BY, Nieboer D, Ercole A, et al. Machine learning algorithms performed no better than regression models for prognostication in traumatic brain injury. J Clin Epidemiol 2020;122:95-107. [Crossref] [PubMed]
  34. Romano L, Manno A, Rossi F, et al. Statistical models versus machine learning approach for competing risks in proctological surgery. Updates Surg 2025;77:333-41. [Crossref] [PubMed]
  35. Dong XX, Liu JH, Zhang TY, et al. Comparison of Logistic Regression and Machine Learning Approaches in Predicting Depressive Symptoms: A National-Based Study. Psychiatry Investig 2025;22:267-78. [Crossref] [PubMed]
  36. Sousa AW, Barros Cabral AL, Arruda Martins M, et al. Risk factors for fixed airflow obstruction in children and adolescents with asthma: 4-Year follow-up. Pediatr Pulmonol 2020;55:591-8. [Crossref] [PubMed]
Cite this article as: Xiong S, Jia X, Chen W, Liu C. Developing a prediction model for persistent airflow limitation in asthmatic children. J Thorac Dis 2025;17(10):8735-8744. doi: 10.21037/jtd-23-1796
