Deep learning for non-invasive detection of steatosis and fibrosis in MASLD: a multicenter study with a new fibroscan-labelled ultrasound dataset.
Authors
Affiliations (2)
- Postgraduate Institute of Medical Education and Research, Chandigarh, India.
- Postgraduate Institute of Medical Education and Research, Chandigarh, India. [email protected].
Abstract
This study aimed to develop and validate deep learning models for non-invasive assessment of hepatic steatosis and fibrosis from conventional B-mode ultrasound images, with Fibroscan-derived measurements as the reference standard. We used a multi-source approach comprising three distinct ultrasound datasets of patients with MASLD: a private clinical dataset (DS1, n = 111 patients, 1131 images) and two subsets of the public BEHSOF repository, namely the Behbood Clinic subset (DS2, n = 95 patients, 1328 images) for training/validation and the Taleghani Hospital subset (DS3, n = 18 patients, 185 images) for external testing. We additionally evaluated the models on a temporally independent test set (DS4, n = 23 patients, 155 images). Deep learning models, including EfficientNet-B4 and Vision Transformers, were trained to classify steatosis (S0-S1 vs. S2-S3) and fibrosis (F0-F1 vs. F2-F4), respectively, using 5-fold cross-validation. Performance was assessed and compared with that of expert radiologists. For steatosis classification, our models achieved an AUROC of 0.83 (95% CI: 0.78-0.88) in cross-validation and 0.70 (95% CI: 0.43-0.93) on the external test set. For fibrosis, the AUROC reached 0.86 (95% CI: 0.81-0.91) in cross-validation and 0.88 (95% CI: 0.71-1.00) on the external test set. On the temporally independent test set, performance was higher than on the external test set, with AUROCs of 0.82 (95% CI: 0.66-0.98) for steatosis and 0.91 (95% CI: 0.77-1.00) for fibrosis. The AI models consistently outperformed expert radiologists, whose interobserver agreement was moderate (κ = 0.58-0.65). This study demonstrates that deep learning models can accurately identify steatosis and fibrosis from ultrasound images across diverse populations and clinical settings.
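The abstract reports each AUROC with a 95% confidence interval but does not state how the intervals were computed. As an illustration only, the sketch below shows one common approach: a percentile bootstrap over a binary-classification AUROC. The function names `auroc` and `bootstrap_auroc_ci`, the number of resamples, and the choice of a percentile bootstrap are all assumptions made for this example, not details taken from the study.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap (1 - alpha) interval.

    Hypothetical helper: resamples cases with replacement and skips
    resamples that contain only one class, where AUROC is undefined.
    """
    point = auroc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lo, hi


if __name__ == "__main__":
    # Toy data only: 1 = higher grade (e.g. F2-F4), 0 = lower grade (F0-F1).
    y = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
    p = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.90, 0.60, 0.30]
    print(bootstrap_auroc_ci(y, p))
```

With small test sets such as DS3 (18 patients) or DS4 (23 patients), intervals from this kind of resampling tend to be wide, which is consistent with the broad ranges quoted above. Because each patient contributes several images, resampling whole patients rather than individual images would likely be the more appropriate variant; the sketch resamples individual cases for simplicity.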