Machine and deep learning for MRI-based quantification of liver iron overload: a systematic review and meta-analysis.
Authors
Affiliations (3)
- Department of Medical Physics, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.
- Department of Medical Physics, School of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.
- Dean, College of Applied Medical Sciences, Cardiovascular Magnetic Resonance Imaging, University of Hail, Hai'l, Saudi Arabia.
Abstract
Liver iron overload, associated with conditions such as hereditary hemochromatosis and β‑thalassemia major, requires accurate quantification of liver iron concentration (LIC) to guide timely interventions and prevent complications. Magnetic resonance imaging (MRI) is the gold standard for noninvasive LIC assessment, but challenges in protocol variability and diagnostic consistency persist. Machine learning (ML) and deep learning (DL) offer potential to enhance MRI-based LIC quantification, yet their efficacy remains underexplored. This systematic review and meta-analysis evaluates the diagnostic accuracy, algorithmic performance, and clinical applicability of ML and DL techniques for MRI-based LIC quantification in liver iron overload, adhering to PRISMA guidelines. A comprehensive search across PubMed, Embase, Scopus, Web of Science, Cochrane Library, and IEEE Xplore identified studies applying ML/DL to MRI-based LIC quantification. Eligible studies were assessed for diagnostic accuracy (sensitivity, specificity, AUC), LIC quantification precision (correlation, mean absolute error), and clinical applicability (automation, processing time). Methodological quality was evaluated using the QUADAS‑2 tool, with qualitative synthesis and meta-analysis where feasible. Eight studies were included, employing algorithms such as convolutional neural networks (CNNs), radiomics, and fuzzy C‑means clustering on T2*-weighted and multiparametric MRI. Pooled diagnostic accuracy from three studies showed a sensitivity of 0.79 (95% CI: 0.66-0.88) and specificity of 0.77 (95% CI: 0.64-0.86), with an AUC of 0.84. DL methods demonstrated high precision (e.g., Pearson's r = 0.999) and automation, reducing processing times to as low as 0.1 s/slice. Limitations included heterogeneity, limited generalizability, and small external validation sets. Both ML and DL enhance MRI-based LIC quantification, offering high accuracy and efficiency. Standardized protocols and multicenter validation are needed to ensure clinical scalability and equitable access.
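
As an illustration of how a pooled sensitivity or specificity with a 95% CI can be obtained from per-study 2x2 counts, the following is a minimal sketch using fixed-effect inverse-variance pooling on the logit scale. It is not the method used in this review, which the abstract does not specify; diagnostic meta-analyses often use bivariate random-effects models instead. The function name `pool_proportion` and all counts are hypothetical and chosen only for illustration.

```python
import math


def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals, z=1.96):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.

    events[i] / totals[i] is the per-study proportion (e.g. TP / (TP + FN)
    for sensitivity). Returns (pooled estimate, lower CI bound, upper CI bound).
    """
    estimates, weights = [], []
    for e, n in zip(events, totals):
        # Assumption: a 0.5 continuity correction is applied to every study
        # so the logit stays finite when e == 0 or e == n.
        e_c = e + 0.5
        f_c = (n - e) + 0.5
        estimates.append(math.log(e_c / f_c))          # study logit
        weights.append(1.0 / (1.0 / e_c + 1.0 / f_c))  # 1 / variance of logit
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_w
    se = math.sqrt(1.0 / total_w)
    return inv_logit(pooled), inv_logit(pooled - z * se), inv_logit(pooled + z * se)


# Hypothetical 2x2 counts (TP, FN, FP, TN) for three studies; not data from the review.
studies = [(40, 10, 12, 38), (30, 9, 8, 33), (25, 6, 7, 22)]

sens = pool_proportion([tp for tp, fn, fp, tn in studies],
                       [tp + fn for tp, fn, fp, tn in studies])
spec = pool_proportion([tn for tp, fn, fp, tn in studies],
                       [tn + fp for tp, fn, fp, tn in studies])

print("sensitivity: %.2f (95%% CI: %.2f-%.2f)" % sens)
print("specificity: %.2f (95%% CI: %.2f-%.2f)" % spec)
```

Pooling on the logit scale keeps the confidence limits inside (0, 1), which is why the interval is asymmetric around the point estimate, as in the intervals reported above.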