Automated Echocardiographic Detection of Congenital Heart Disease Using Artificial Intelligence
Authors
Affiliations (1)
- Boston Children's Hospital
Abstract
Background: Delayed or missed diagnosis of congenital heart disease (CHD) contributes to excess pediatric mortality worldwide. Echocardiography (echo) is central to diagnosing and triaging CHD, yet expert interpretation remains a scarce and maldistributed global resource. Artificial intelligence (AI) offers the potential to democratize diagnostics and extend expert-level interpretation beyond large academic centers, but its application in CHD remains underexplored.
Methods: We developed EchoFocus-CHD, an AI-enabled model for automated detection of 12 critical and 8 non-critical CHD lesions, individually and as composites. The composite critical CHD outcome was the primary endpoint. The model extends a multi-task, view-agnostic architecture (PanEcho) with a transformer encoder to improve focus on relevant echo views. The model was trained (80%) and tested (20%) on the first echo per patient from Boston Children's Hospital (BCH), with external validation on US and international studies from patients referred to BCH.
Results: The internal cohort included 3.4 million videos from 54,727 echos (median age at echo 7.1 [IQR, 0.2-15.0] years; 5.8% critical CHD; 23.6% non-critical CHD), and the external cohort included 167,484 videos from 3,356 echos (median age at echo 2.5 [IQR, 0.3-9.4] years; 29.4% critical CHD; 45.6% non-critical CHD). On internal testing, EchoFocus-CHD detected the composite critical CHD outcome (AUROC 0.94, LR+ 7.50, LR- 0.14) and individual critical lesions (AUROC 0.83-1.00), as well as composite non-critical CHD (AUROC 0.90, LR+ 5.00, LR- 0.23) and individual non-critical lesions (AUROC 0.70-0.96). Performance for detecting critical CHD declined on external validation (AUROC 0.77), coinciding with greater expert disagreement on external cases (κ=0.72 versus 0.82 for internal cases). Explainability analyses demonstrated that the model prioritized the same clinically relevant views (parasternal long-axis, parasternal short-axis, and subxiphoid long-axis) across internal and external cohorts, while UMAP analysis revealed a domain shift between cohorts. Retraining on all available US patients attenuated domain shift, improving international critical CHD detection (AUROC 0.87) and calibration.
Conclusions: EchoFocus-CHD shows promise for automated CHD detection and highlights the need to address domain shift for real-world deployment. By identifying high-risk CHD lesions, this approach could support triage, prioritize expert review, and optimize resource allocation, advancing more equitable global cardiovascular care.
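To make the reported likelihood ratios easier to interpret, the following minimal sketch converts a pre-test probability into a post-test probability using the standard odds form of Bayes' rule (post-test odds = pre-test odds × likelihood ratio). It is an illustration, not part of the study's analysis: the function name `post_test_probability` is hypothetical, and using the internal cohort's 5.8% critical CHD prevalence as the pre-test probability is an assumption made purely for the example.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds.

    Hypothetical helper for illustration; not taken from the study's code.
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Values quoted in the abstract for the composite critical CHD outcome
# on the internal test set.
LR_POSITIVE = 7.50
LR_NEGATIVE = 0.14

# Assumption: the internal cohort prevalence stands in for the pre-test
# probability. A real deployment population may differ substantially.
PRE_TEST_PROBABILITY = 0.058

p_after_positive = post_test_probability(PRE_TEST_PROBABILITY, LR_POSITIVE)
p_after_negative = post_test_probability(PRE_TEST_PROBABILITY, LR_NEGATIVE)

print(f"Probability of critical CHD after a positive result: {p_after_positive:.3f}")
print(f"Probability of critical CHD after a negative result: {p_after_negative:.4f}")
```

Under these assumptions, a positive result raises the probability of critical CHD from 5.8% to roughly 32%, and a negative result lowers it to under 1%. The same likelihood ratios would give different post-test probabilities in a population with a different baseline prevalence, such as the external cohort (29.4% critical CHD).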