
Machine learning-based early prediction of asthma in preschoolers: The COCOA birth cohort study

  • Chang Hoon Han
  • Seok Jae Heo
  • Haerin Jang
  • So Yeon Lee
  • Ji Soo Park
  • Dong In Suh
  • Youn Ho Shin
  • Jihyun Kim
  • Kangmo Ahn
  • Myung Hyun Sohn
  • Eom Ji Choi
  • Sun Hee Choi
  • Hey Sung Baek
  • Soo Jong Hong
  • Kyung Won Kim
  • Inkyung Jung
  • Soo Yeon Kim

  • Yonsei University
  • Seoul National University
  • Kyung Hee University
  • CHA Gangnam Medical Center
  • Hallym University
  • National Medical Center

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Background: Early prediction of asthma in preschoolers, which is crucial for timely intervention, remains challenging. This study aimed to develop a machine learning (ML)-based model and a questionnaire-based scoring tool for the prediction of asthma at age 3 years.

Methods: Data from the COhort for Childhood Origin of Asthma and allergic diseases (COCOA), a comprehensive prospective birth cohort in South Korea, were used. Children with complete 3-year follow-up (n = 2007) were divided into development (n = 1472) and validation (n = 535) cohorts based on birth year. Asthma diagnosis at age 3 years was based on physician diagnosis, recurrent wheezing episodes, asthma treatment, or parental reports. Random Forest-based predictive models were developed using data collected until the age of 2 years, with features first selected via least absolute shrinkage and selection operator (LASSO) regression. A questionnaire-based scoring tool was also developed and compared with multiple ML algorithms.

Results: The performance of the ML-based prediction models improved as data accumulated. The 6-month, 1-year, and 2-year models had area under the receiver operating characteristic curve (AUROC) values of 0.614, 0.726, and 0.774, respectively, in the validation cohort. The performance of the questionnaire-based scoring tool (AUROC, 0.790) was comparable to that of the ML-based model. Important predictors included paternal total IgE levels, maternal iron supplementation during pregnancy, parental asthma history, nut allergy history, and recent lower respiratory infections.

Conclusions: Our study successfully developed robust predictive models for early asthma that demonstrated high performance. The questionnaire-based scoring tool offers particular value because of its clinical applicability. Further validation in diverse populations and investigation of the causative pathways of the identified predictors are necessary to enhance clinical utility.
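
For readers less familiar with the AUROC metric reported in the abstract, the following is a minimal sketch of how an additive questionnaire score could be evaluated with it. It is not the study's scoring tool: the item names are borrowed from the predictors listed above, but the weights in `WEIGHTS`, the helper names `questionnaire_score` and `auroc`, and the six toy records are all hypothetical and chosen only for illustration.

```python
from typing import Mapping, Sequence

# Hypothetical point weights; the abstract does not report the study's actual weights.
WEIGHTS = {
    "parental_asthma": 2,
    "nut_allergy": 2,
    "recent_lower_respiratory_infection": 1,
}


def questionnaire_score(answers: Mapping[str, bool]) -> int:
    """Sum the points of every item answered 'yes'; missing items count as 'no'."""
    return sum(w for item, w in WEIGHTS.items() if answers.get(item, False))


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy records (synthetic, not COCOA data): questionnaire answers and asthma label.
records = [
    ({"parental_asthma": True, "nut_allergy": True}, 1),
    ({"recent_lower_respiratory_infection": True}, 1),
    ({"parental_asthma": True}, 0),
    ({}, 0),
    ({"recent_lower_respiratory_infection": True}, 0),
    ({"nut_allergy": True, "recent_lower_respiratory_infection": True}, 1),
]
labels = [y for _, y in records]
scores = [questionnaire_score(a) for a, _ in records]
print(f"AUROC on toy data: {auroc(labels, scores):.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the validation-cohort values of 0.614 to 0.790 quoted above. The pairwise loop is quadratic in cohort size; for larger data a rank-based implementation or a library routine would be the usual choice.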

Original language: English
Article number: e70223
Journal: Pediatric Allergy and Immunology
Volume: 36
Issue number: 10
State: Published - Oct 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • asthma
  • birth cohort
  • child
  • machine learning
  • preschool
