Background
Predicting child and youth mental health (CYMH) emergency department (ED) revisits (RVs) is critical for improving patient outcomes and optimizing resource use. Previous CYMH ED RV studies have applied statistical methods to research cohorts and produced varying results. Our aims were to develop a predictive algorithm that applies machine learning (ML) to electronic health record (EHR) data and to validate it against a clinician-driven algorithm in a proof-of-concept project.
Methods
Data were retrospectively collected from a tertiary care pediatric hospital’s EHR from November 2017 to November 2023, yielding 12,700 ED encounters from 8,696 patients aged 8–18 years. The feature set comprised patient demographics, visit-level variables, laboratory results, procedure codes, and medication records. A mapping of 230 International Classification of Diseases (ICD)-10 codes into 28 Diagnostic and Statistical Manual (DSM)-5 categories was performed, and a logistic regression (LR) ML model was developed. Both tasks used clinical expert input. Seven clinical experts then independently assigned weights to 191 variables using a custom-designed application, creating a structured clinician-weighted baseline for comparison with the ML algorithm. Both models were evaluated using the area under the receiver operating characteristic curve (AUROC) and F1 score as primary metrics, with precision and recall as secondary metrics. LR coefficients and odds ratios were the primary interpretability outputs, while SHapley Additive exPlanations (SHAP) were used for supplementary visualization across four age strata.
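The evaluation metrics and the odds-ratio interpretation described above can be illustrated with a minimal sketch. It is not the study's code: the function names, the feature names, the 0.5 decision threshold, and the toy labels, scores, and coefficients are all hypothetical and chosen only to show how AUROC, precision, recall, F1, and odds ratios are computed.

```python
import math


def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_f1(y_true, scores, threshold=0.5):
    """Precision, recall, and F1 after thresholding predicted probabilities.
    The 0.5 default is an assumption, not a value reported in the study."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def odds_ratios(coefficients):
    """Convert LR coefficients (log-odds) to odds ratios via exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


# Toy data (hypothetical): 1 = revisit occurred, 0 = no revisit.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]

# Hypothetical coefficients; the feature names are illustrative only.
toy_coefficients = {"past_ed_rv_count": math.log(2), "unrelated_feature": 0.0}

print("AUROC:", auroc(y_true, scores))
print("precision, recall, F1:", precision_recall_f1(y_true, scores))
print("odds ratios:", odds_ratios(toy_coefficients))
```

An odds ratio above 1 indicates that a one-unit increase in the feature is associated with higher odds of a revisit, holding the other features fixed; an odds ratio of 1 indicates no association.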
Results
The LR model achieved an AUROC of 0.78, outperforming the structured clinician-weighted baseline (AUROC range: 0.54–0.64). Detailed analysis showed that past ED RV count, psychotherapeutic medication history, substance use history, and prior outpatient MH visits were consistently influential predictors.
Conclusions
This proof-of-concept project demonstrates that ML can provide complementary, clinically interpretable predictions of CYMH ED RV. Alignment between model-derived predictors and clinician-weighted features supports interpretability and lays a foundation for further development. Future steps include enhancing sensitivity, expanding feature sets, and conducting prospective silent-mode validation to refine performance.
Researchers
- Jeff Gilchrist, Associate Scientist, CHEO Research Institute
- Paula Cloutier, Investigator, CHEO Research Institute
- Allison Kennedy, Investigator, CHEO Research Institute
- Kathleen Pajer, Senior Scientist, CHEO Research Institute



