
Concordance-Based Validation of Electronic Health Records and Modality Log Files to Improve MRI Exam Duration Prediction and Scheduling Performance.

June 8, 2026

Authors

Li L, Mastrangelo C, Mossa-Basha M, Mabotuwana T

Affiliations (5)

  • Department of Industrial & Systems Engineering, University of Washington, Seattle, USA.
  • Department of Industrial & Systems Engineering, University of Washington, Seattle, USA.
  • Department of Radiology, University of Washington, Seattle, USA.
  • Department of Radiology, University of Alabama at Birmingham, Birmingham, USA.
  • Philips Healthcare, Bothell, USA.

Abstract

Integrating multi-source healthcare data for predictive modeling requires rigorous data quality validation, yet concordance between data systems is rarely assessed prior to model development. This study evaluated inter-system concordance between magnetic resonance imaging (MRI) exam duration data derived from modality log files (MLFs) and electronic health records (EHRs), developed a concordance-based data cleaning framework, and determined whether machine learning models trained on validated multi-source data improve MRI scheduling accuracy compared with template-based scheduling. Exam duration data from February 2022 through February 2024 were extracted from MLFs and EHRs. After fuzzy merging, concordance was assessed using Bland-Altman analysis and concordance correlation coefficients. Outliers were removed based on inter-system agreement thresholds. A Random Forest regression model was trained on the cleaned dataset to predict exam durations and compared with current template-based scheduling across MRI procedure codes. A total of 52,112 records were extracted from MLFs and 46,570 from EHRs. After exclusions and fuzzy merging, 30,275 records were retained; restricting to procedure codes with ≥ 400 records yielded 22,737 records, and 16,297 remained after concordance-based outlier removal. Bland-Altman analysis revealed discordance between MLF and EHR duration measurements, and the concordance-based filtering improved the concordance correlation coefficient from 0.33 to 0.87. The Random Forest model outperformed template-based scheduling for 11 of 12 procedure codes, with mean absolute error reductions ranging from 2.0% to 57.0%. For high-variability procedures, the proportion of exams completed within ± 10 min of the scheduled duration increased from as low as 29% to over 79%.
These findings demonstrate that concordance-based validation is critical when integrating multi-source healthcare data, and that machine learning models trained on validated data substantially improved MRI scheduling accuracy, particularly for procedures with high intrinsic variability.
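
As an illustration of the two agreement measures named in the abstract, the sketch below computes Bland-Altman bias with 95% limits of agreement and Lin's concordance correlation coefficient for paired durations, then applies a simple agreement filter. It is not the authors' implementation: the function names, the sample durations, and the fixed absolute-difference cutoff are assumptions made for this example, since the abstract does not state the thresholds or the exact filtering rule.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def lin_ccc(a, b):
    """Lin's concordance correlation coefficient (population-variance form)."""
    n = len(a)
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


def filter_by_agreement(a, b, max_abs_diff):
    """Keep pairs whose absolute difference is within max_abs_diff.

    The fixed cutoff is a hypothetical rule chosen for illustration only.
    """
    kept = [(x, y) for x, y in zip(a, b) if abs(x - y) <= max_abs_diff]
    return [x for x, _ in kept], [y for _, y in kept]


# Made-up exam durations in minutes; not data from the study.
mlf_minutes = [30.0, 45.0, 25.0, 60.0, 40.0]
ehr_minutes = [32.0, 44.0, 55.0, 61.0, 38.0]

bias, lower, upper = bland_altman(mlf_minutes, ehr_minutes)
ccc_before = lin_ccc(mlf_minutes, ehr_minutes)
mlf_kept, ehr_kept = filter_by_agreement(mlf_minutes, ehr_minutes, max_abs_diff=5.0)
ccc_after = lin_ccc(mlf_kept, ehr_kept)
```

In this toy data, one pair disagrees by 30 minutes. Dropping it raises the coefficient, which mirrors in miniature the direction of the improvement reported above.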

Topics

Journal Article
