Latest Papers on Radiology AI. Sources: medrxiv, Tags: Segmentation.

Deep Learning for Breast Mass Discrimination: Integration of B-Mode Ultrasound & Nakagami Imaging with Automatic Lesion Segmentation

Hassan, M. W., Hossain, M. M.

•preprint•Sep 15 2025

Objective: This study aims to enhance breast cancer diagnosis by developing an automated deep learning framework for real-time, quantitative ultrasound imaging. Breast cancer is the second leading cause of cancer-related deaths among women, and early detection is crucial for improving survival rates. Conventional ultrasound, valued for its non-invasive nature and real-time capability, is limited by qualitative assessments and inter-observer variability. Quantitative ultrasound (QUS) methods, including Nakagami imaging (which models the statistical distribution of backscattered signals and lesion morphology), present an opportunity for more objective analysis. Methods: The proposed framework integrates three convolutional neural networks (CNNs): (1) NakaSynthNet, synthesizing quantitative Nakagami parameter images from B-mode ultrasound; (2) SegmentNet, enabling automated lesion segmentation; and (3) FeatureNet, which combines anatomical and statistical features for classifying lesions as benign or malignant. Training utilized a diverse dataset of 110,247 images, comprising clinical B-mode scans and various simulated examples (fruit, mammographic lesions, digital phantoms). Quantitative performance was evaluated using mean squared error (MSE), structural similarity index (SSIM), segmentation accuracy, sensitivity, specificity, and area under the curve (AUC). Results: NakaSynthNet achieved real-time synthesis at 21 frames/s, with MSE of 0.09% and SSIM of 98%. SegmentNet reached 98.4% accuracy, and FeatureNet delivered 96.7% overall classification accuracy, 93% sensitivity, 98% specificity, and an AUC of 98%. Conclusion: The proposed multi-parametric deep learning pipeline enables accurate, real-time breast cancer diagnosis from ultrasound data using objective quantitative imaging. Significance: This framework advances the clinical utility of ultrasound by reducing subjectivity and providing robust, multi-parametric information for improved breast cancer detection.
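For orientation, Nakagami parameters of the kind NakaSynthNet learns to synthesize are classically estimated from the envelope of the backscattered signal by the method of moments. The sketch below shows that conventional estimator, not the authors' network; the function name `nakagami_moments` and the sample values are illustrative assumptions.

```python
import math

def nakagami_moments(envelope):
    """Method-of-moments estimate of the Nakagami shape (m) and scale (omega).

    `envelope` is a sequence of non-negative envelope amplitudes R taken
    from one local window of the ultrasound image (hypothetical input).
    """
    if len(envelope) < 2:
        raise ValueError("need at least two envelope samples")
    power = [r * r for r in envelope]             # squared envelope R^2
    omega = sum(power) / len(power)               # scale: E[R^2]
    variance = sum((p - omega) ** 2 for p in power) / len(power)  # Var(R^2)
    if variance == 0:
        raise ValueError("constant envelope: shape parameter is undefined")
    m = omega ** 2 / variance                     # shape: E[R^2]^2 / Var(R^2)
    return m, omega

# Toy window with squared amplitudes of about 1 and 3 (made-up values).
m, omega = nakagami_moments([1.0, math.sqrt(3.0)])
print(m, omega)
```

In this parameterization, m = 1 corresponds to Rayleigh-distributed envelopes, while values below or above 1 are usually read as pre-Rayleigh or post-Rayleigh scattering.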

Ultrasound Segmentation Breast Methodology In Silico Academic Lab

Toward Reliable Thalamic Segmentation: a rigorous evaluation of automated methods for structural MRI

Argyropoulos, G. P. D., Butler, C. R., Saranathan, M.

•preprint•Sep 12 2025

Automated thalamic nuclear segmentation has contributed towards a shift in neuroimaging analyses from treating the thalamus as a homogeneous, passive relay to treating it as a set of individual nuclei embedded within distinct brain-wide circuits. However, many studies continue to rely widely on FreeSurfer's segmentation of T1-weighted structural MRIs, despite their poor intrathalamic nuclear contrast. Meanwhile, a convolutional neural network tool has been developed for FreeSurfer, using information from both diffusion and T1-weighted MRIs. Another popular thalamic nuclear segmentation technique is HIPS-THOMAS, a multi-atlas-based method that leverages white-matter-like contrast synthesized from T1-weighted MRIs. However, rigorous comparisons amongst methods remain scant, and the thalamic atlases against which these methods have been assessed have their own limitations. These issues may compromise the quality of cross-species comparisons, structural and functional connectivity studies in health and disease, as well as the efficacy of neuromodulatory interventions targeting the thalamus. Here, we report, for the first time, comparisons amongst HIPS-THOMAS, the standard FreeSurfer segmentation, and its more recent development, against two thalamic atlases as silver-standard ground truths. We used two cohorts of healthy adults, and one cohort of patients in the chronic phase of autoimmune limbic encephalitis. In healthy adults, HIPS-THOMAS surpassed not only the standard FreeSurfer segmentation but also its more recent, diffusion-based update. The improvements made with the latter relative to the former were limited to a few nuclei. Finally, the standard FreeSurfer method underperformed, relative to the other two, in distinguishing between patients and healthy controls based on the affected anteroventral and pulvinar nuclei. In light of the above findings, we provide recommendations on the use of automated segmentation methods of the human thalamus using structural brain imaging.

MRI Segmentation Neurological Retrospective Clinical In Silico

Deep learning-based precision phenotyping of spine curvature identifies novel genetic risk loci for scoliosis in the UK Biobank

Zeosk, M., Kun, E., Reddy, S., Pandey, D., Xu, L., Wang, J. Y., Li, C., Gray, R. S., Wise, C. A., Otomo, N., Narasimhan, V. M.

•preprint•Sep 5 2025

Scoliosis is the most common developmental spinal deformity, but its genetic underpinnings remain only partially understood. To enhance the identification of scoliosis-related loci, we utilized whole body dual energy X-ray absorptiometry (DXA) scans from 57,887 individuals in the UK Biobank (UKB), and quantified spine curvature by applying deep learning models to segment and then landmark vertebrae in order to measure the cumulative horizontal displacement of the spine from a central axis. On a subset of 120 individuals, our automated image-derived curvature measurements showed a correlation of 0.92 with clinical Cobb angle assessments, supporting their validity as a proxy for scoliosis severity. To connect spinal curvature with its genetic basis, we conducted a genome-wide association study (GWAS). Our quantitative imaging phenotype allowed us to identify 2 novel loci associated with scoliosis in a European population not seen in previous GWAS. These loci are in the gene SEM1/SHFM1 as well as on a lncRNA on chr 3 that is downstream of EDEM1 and upstream of GRM7. Genetic correlation analysis revealed significant overlap between our image-based GWAS and ICD-10 based GWAS in both the UKB and Biobank of Japan. We also showed that our quantitative GWAS had more statistical power to identify new loci than a case-control dataset with an order of magnitude larger sample size. Increased spine curvature was also associated with increased leg length discrepancy, reduced muscle strength and decreased bone density, and increased incidence of knee but not hip osteoarthritis. Our results illustrate the potential of using quantitative imaging phenotypes to uncover genetic associations that are challenging to capture with medical records alone and identify new loci for functional follow-up.
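The curvature phenotype described above (cumulative horizontal displacement of the spine from a central axis) can be illustrated with a small geometric sketch. The abstract does not specify how the central axis is defined, so this example assumes it is the straight line joining the first and last vertebral landmarks; the function name `cumulative_displacement` and the coordinates are hypothetical.

```python
import math

def cumulative_displacement(landmarks):
    """Sum of perpendicular distances of the interior landmarks from the
    line through the first and last landmark (assumed central axis).

    `landmarks` is an ordered list of (x, y) vertebra centres, top to bottom.
    """
    if len(landmarks) < 3:
        raise ValueError("need at least three landmarks")
    (x0, y0), (x1, y1) = landmarks[0], landmarks[-1]
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("first and last landmark coincide")
    total = 0.0
    for x, y in landmarks[1:-1]:
        # Point-to-line distance via the 2D cross product.
        total += abs(dx * (y - y0) - dy * (x - x0)) / length
    return total

straight = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
curved = [(0.0, 0.0), (2.0, 1.0), (0.0, 2.0)]
print(cumulative_displacement(straight), cumulative_displacement(curved))
```

A perfectly straight column scores zero under this definition, and the score grows as vertebrae drift laterally, which is the property that would let such a measure act as a continuous proxy for severity.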

X-Ray Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab Open Dataset

Decoding Fibrosis: Transcriptomic and Clinical Insights via AI-Derived Collagen Deposition Phenotypes in MASLD

Wojciechowska, M. K., Thing, M., Hu, Y., Mazzoni, G., Harder, L. M., Werge, M. P., Kimer, N., Das, V., Moreno Martinez, J., Prada-Medina, C. A., Vyberg, M., Goldin, R., Serizawa, R., Tomlinson, J., Douglas Gaalsgard, E., Woodcock, D. J., Hvid, H., Pfister, D. R., Jurtz, V. I., Gluud, L.-L., Rittscher, J.

•preprint•Sep 2 2025

Histological assessment is foundational to multi-omics studies of liver disease, yet conventional fibrosis staging lacks resolution, and quantitative metrics like collagen proportionate area (CPA) fail to capture tissue architecture. While recent AI-driven approaches offer improved precision, they are proprietary and not accessible to academic research. Here, we present a novel, interpretable AI-based framework for characterising liver fibrosis from picrosirius red (PSR)-stained slides. By identifying distinct data-driven collagen deposition phenotypes (CDPs) which capture distinct morphologies, our method substantially improves the sensitivity and specificity of downstream transcriptomic and proteomic analyses compared to CPA and traditional fibrosis scores. Pathway analysis reveals that CDPs 4 and 5 are associated with active extracellular matrix remodelling, while phenotype correlates highlight links to liver functional status. Importantly, we demonstrate that selected CDPs can predict clinical outcomes with similar accuracy to established fibrosis metrics. All models and tools are made freely available to support transparent and reproducible multi-omics pathology research. Highlights: (1) We present a set of data-driven collagen deposition phenotypes for analysing PSR-stained liver biopsies, offering a spatially informed alternative to conventional fibrosis staging and CPA, available as open-source code. (2) The identified collagen deposition phenotypes enhance transcriptomic and proteomic signal detection, revealing active ECM remodelling and distinct functional tissue states. (3) Selected phenotypes predict clinical outcomes with performance comparable to fibrosis stage and CPA, highlighting their potential as candidate quantitative indicators of fibrosis severity.

Mixed Modality Segmentation Abdominal Methodology In Silico Academic Lab Open Code Open Dataset

Optimizing and Evaluating Robustness of AI for Brain Metastasis Detection and Segmentation via Loss Functions and Multi-dataset Training

Han, Y., Pathak, P., Award, O., Mohamed, A. S. R., Ugarte, V., Zhou, B., Hamstra, D. A., Echeverria, A. E., Mekdash, H. A., Siddiqui, Z. A., Sun, B.

•preprint•Sep 2 2025

Purpose: Accurate detection and segmentation of brain metastases (BM) from MRI are critical for the appropriate management of cancer patients. This study investigates strategies to enhance the robustness of artificial intelligence (AI)-based BM detection and segmentation models. Method: A DeepMedic-based network with a loss function tunable via a sensitivity/specificity trade-off weighting factor α was trained on T1 post-contrast MRI datasets from two institutions (514 patients, 4520 lesions). Robustness was evaluated on an external dataset from a third institution (91 patients, 397 lesions), featuring ground truth annotations from two physicians. We investigated the impact of the loss function weighting factor α and of training dataset combinations. Detection performance (sensitivity, precision, F1 score) and segmentation accuracy (Dice similarity and 95% Hausdorff distance (HD95)) were evaluated using one physician's contours as the reference standard. The optimal AI model was then directly compared to the performance of the second physician. Results: Varying α demonstrated a trade-off between sensitivity (higher α) and precision (lower α), with α=0.5 yielding the best F1 score (0.80 ± 0.04 vs. 0.78 ± 0.04 for α=0.95 and 0.72 ± 0.03 for α=0.99) on the external dataset. The optimally trained model achieved detection performance comparable to the physician (F1: AI=0.83 ± 0.04, Physician=0.83 ± 0.04), but slightly underperformed in segmentation (Dice: 0.79 ± 0.04 vs. AI=0.74 ± 0.03; HD95: 2.8 ± 0.14 mm vs. AI=3.18 ± 0.16 mm, p<0.05). Conclusion: The derived optimal model achieves detection and segmentation performance comparable to an expert physician in a parallel comparison.
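The detection and overlap metrics named above follow standard definitions, sketched below. This is a minimal illustration, not the authors' evaluation code: the function names, the voxel coordinates, and the lesion counts are all made up, and HD95 is omitted because it needs surface distances.

```python
def dice(pred_voxels, ref_voxels):
    """Dice similarity between two segmentations given as voxel-index collections."""
    pred, ref = set(pred_voxels), set(ref_voxels)
    if not pred and not ref:
        return 1.0  # convention used here: two empty masks agree perfectly
    return 2 * len(pred & ref) / (len(pred) + len(ref))

def detection_scores(tp, fp, fn):
    """Lesion-level sensitivity, precision and F1 from detection counts."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return sensitivity, precision, f1

# Hypothetical masks sharing two of three voxels each.
pred = [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
ref = [(0, 0, 1), (0, 1, 0), (1, 1, 1)]
print(dice(pred, ref))

# Hypothetical counts: 8 lesions found, 2 false alarms, 2 lesions missed.
print(detection_scores(tp=8, fp=2, fn=2))
```

Because F1 is the harmonic mean of sensitivity and precision, a loss weighting that raises one at the expense of the other can lower F1 even as sensitivity improves, which is the trade-off the abstract reports.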

MRI Segmentation Neurological Retrospective Clinical In Silico Academic Lab

Ultra-low-field MRI for imaging of severe multiple sclerosis: a case-controlled study

Bergsland, N., Burnham, A., Dwyer, M. G., Bartnik, A., Schweser, F., Kennedy, C., Tranquille, A., Semy, M., Schnee, E., Young-Hong, D., Eckert, S., Hojnacki, D., Reilly, C., Benedict, R. H., Weinstock-Guttman, B., Zivadinov, R.

•preprint•Aug 27 2025

Background: Severe multiple sclerosis (MS) presents challenges for clinical research due to mobility constraints and specialized care needs. Traditional MRI studies often exclude this population, limiting understanding of severe MS progression. Portable, ultra-low-field MRI enables bedside imaging. Objectives: To (i) assess the feasibility of portable MRI in severe MS, and (ii) compare measurement approaches for automated tissue volumetry from ultra-low-field MRI. Methods: This prospective study enrolled 40 progressive MS patients (24 severely disabled, 16 less severe) from academic and skilled nursing settings. Participants underwent 0.064T MRI for tissue volumetry using conventional and artificial intelligence (AI)-driven segmentation. Clinical assessments included physical disability and cognition. Group comparisons and MRI-clinical associations were assessed. Results: MRI passed rigorous quality control, reflecting complete brain coverage and lack of motion artifact, in 38/40 participants. In terms of severe versus less severe disease, the largest effect sizes were obtained with conventionally calculated gray matter (GM) volume (partial η²=0.360), cortical GM volume (partial η²=0.349), and whole brain volume (partial η²=0.290), while an AI-based approach yielded the highest effect size for white matter volume (partial η²=0.209). For clinical outcomes, the most consistent associations were found using conventional processing, while AI-based methods were dependent on algorithm and input image, especially for cortical GM volume. Conclusion: Portable, ultra-low-field MRI is a feasible bedside tool that can provide insights into late-stage neurodegeneration in individuals living with severe MS. However, careful consideration is required in implementing tissue volumetry pipelines as findings are heavily dependent on the choice of algorithm and input.
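For readers less familiar with the effect-size measure reported above, the standard definition of partial eta squared is given here, together with its equivalent form in terms of an F statistic. The symbols are generic textbook notation, not taken from the preprint.

```latex
\eta_p^2 \;=\; \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
\;=\; \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
```

Here $SS_{\text{effect}}$ and $SS_{\text{error}}$ are the sums of squares for the group effect and the residual, and $df_{\text{effect}}$, $df_{\text{error}}$ are the corresponding degrees of freedom. The value is the share of variance attributable to the group factor after other model terms are set aside.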

MRI Segmentation Neurological Prospective Clinical Pilot Academic Lab

Whole-genome sequencing analysis of left ventricular structure and sphericity in 80,000 people

Pirruccello, J.

•preprint•Aug 26 2025

Background: Sphericity is a measurement of how closely an object approximates a globe. The sphericity of the blood pool of the left ventricle (LV) is an emerging measure linked to myocardial dysfunction. Methods: Video-based deep learning models were trained for semantic segmentation (pixel labeling) in cardiac magnetic resonance imaging in 84,327 UK Biobank participants. These labeled pixels were co-oriented in 3D and used to construct surface meshes. LV ejection fraction, mass, volume, surface area, and sphericity were calculated. Epidemiologic and genetic analyses were conducted. Polygenic score validation was performed in All of Us. Results: 3D LV sphericity was found to be more strongly associated (HR 10.3 per SD, 95% CI 6.1-17.3) than LV ejection fraction (HR 2.9 per SD reduction, 95% CI 2.4-3.6) with dilated cardiomyopathy (DCM). Paired with whole genome sequencing, these measurements linked LV structure and function to 366 distinct common and low-frequency genetic loci (and 17 genes with rare variant burden) spanning a 25-fold range of effect size. The discoveries included 22 out of the 26 loci that were recently associated with DCM. LV genome-wide polygenic scores were equivalent to, or outperformed, dedicated hypertrophic cardiomyopathy (HCM) and DCM polygenic scores for disease prediction. In All of Us, those in the polygenic extreme 1% had an estimated 6.6% risk of DCM by age 80, compared to 33% for carriers of rare truncating variants in the gene TTN. Conclusions: 3D sphericity is a distinct, heritable LV measurement that is intricately linked to risk for HCM and DCM. The genetic findings from this study raise the possibility that the majority of common genetic loci that will be discovered in future large-scale DCM analyses are present in the current results.
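The abstract does not give the formula used for 3D sphericity. One common definition that needs only a mesh's volume and surface area is Wadell's sphericity, sketched below as an assumption rather than as the author's method; the function name and test shapes are illustrative.

```python
import math

def wadell_sphericity(volume, surface_area):
    """Wadell sphericity: surface area of the equal-volume sphere divided by
    the object's actual surface area. Equals 1 for a perfect sphere."""
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface area must be positive")
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area

# Unit sphere: V = 4/3 * pi, A = 4 * pi.
sphere = wadell_sphericity(4.0 / 3.0 * math.pi, 4.0 * math.pi)

# Unit cube: V = 1, A = 6; less sphere-like, so the value is lower.
cube = wadell_sphericity(1.0, 6.0)
print(sphere, cube)
```

Because the measure is a ratio of areas, it is dimensionless and independent of overall heart size, so it captures shape rather than volume.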

MRI Segmentation Cardiac Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Deep Learning-Assisted Skeletal Muscle Radiation Attenuation at C3 Predicts Survival in Head and Neck Cancer

Barajas Ordonez, F., Xie, K., Ferreira, A., Siepmann, R., Chargi, N., Nebelung, S., Truhn, D., Berge, S., Bruners, P., Egger, J., Hölzle, F., Wirth, M., Kuhl, C., Puladi, B.

•preprint•Aug 21 2025

Background: Head and neck cancer (HNC) patients face an increased risk of malnutrition due to lifestyle, tumor localization, and treatment effects. While skeletal muscle area (SMA) and radiation attenuation (SM-RA) at the third lumbar vertebra (L3) are established prognostic markers, L3 is not routinely available in head and neck imaging. The prognostic value of SM-RA at the third cervical vertebra (C3) remains unclear. This study assesses whether SMA and SM-RA at C3 predict locoregional control (LRC) and overall survival (OS) in HNC. Methods: We analyzed 904 HNC cases with head and neck CT scans. A deep learning pipeline identified C3, and SMA/SM-RA were quantified via automated segmentation with manual verification. Cox proportional hazards models assessed associations with LRC and OS, adjusting for clinical factors. Results: Median SMA and SM-RA were 36.64 cm² (IQR: 30.12-42.44) and 50.77 HU (IQR: 43.04-57.39). In multivariate analysis, lower SMA (HR 1.62, 95% CI: 1.02-2.58, p = 0.04), lower SM-RA (HR 1.89, 95% CI: 1.30-2.79, p < 0.001), and advanced T stage (HR 1.50, 95% CI: 1.06-2.12, p = 0.02) were prognostic for LRC. OS predictors included advanced T stage (HR 2.17, 95% CI: 1.64-2.87, p < 0.001), age ≥70 years (HR 1.40, 95% CI: 1.00-1.96, p = 0.05), male sex (HR 1.64, 95% CI: 1.02-2.63, p = 0.04), and lower SM-RA (HR 2.15, 95% CI: 1.56-2.96, p < 0.001). Conclusion: Deep learning-assisted SM-RA assessment at C3 outperforms SMA for LRC and OS in HNC, supporting its use as a routine biomarker and L3 alternative.
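The hazard ratios and confidence intervals above come from Cox models, where a hazard ratio is the exponential of a fitted coefficient. The sketch below shows that relationship and how a coefficient and its standard error can be approximately recovered from a reported HR and 95% CI. It assumes a symmetric Wald interval on the log scale, which the abstract does not state; the function names are illustrative.

```python
import math

def hazard_ratio(beta, se, z=1.96):
    """Hazard ratio and Wald confidence interval from a Cox coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coef_from_reported(hr, lower, upper, z=1.96):
    """Approximate log-hazard coefficient and standard error from a
    reported HR and confidence interval (assumes a Wald interval)."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Reported values for lower SM-RA and locoregional control.
beta, se = coef_from_reported(1.89, 1.30, 2.79)
print(beta, se)
print(hazard_ratio(beta, se))
```

Reported intervals are rounded, so the recovered values are approximate and the round trip will not reproduce the published HR and bounds exactly.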

CT Segmentation Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Automated Deep Learning Pipeline for Callosal Angle Quantification

Shirzadeh Barough, S., Bilgel, M., Ventura, C., Moghekar, A., Albert, M., Miller, M. I., Moghekar, A.

•preprint•Aug 21 2025

BACKGROUND AND PURPOSE: Normal pressure hydrocephalus (NPH) is a potentially treatable neurodegenerative disorder that remains underdiagnosed due to its clinical overlap with other conditions and the labor-intensive nature of manual imaging analyses. Imaging biomarkers, such as the callosal angle (CA), Evans Index (EI), and Disproportionately Enlarged Subarachnoid Space Hydrocephalus (DESH), play a crucial role in NPH diagnosis but are often limited by subjective interpretations. To address these challenges, we developed a fully automated and robust deep learning framework for measuring the CA directly from raw T1 MPRAGE and non-MPRAGE MRI scans. MATERIALS AND METHODS: Our method integrates two complementary modules. First, a BrainSignsNET model is employed to accurately detect key anatomical landmarks, notably the anterior commissure (AC) and posterior commissure (PC). Preprocessed 3D MRI scans, reoriented to the Right Anterior Superior (RAS) system and resized to standardized cubes while preserving aspect ratios, serve as input for landmark localization. After detecting these landmarks, a coronal slice, perpendicular to the AC-PC line at the PC level, is extracted for subsequent analysis. Second, a UNet-based segmentation network, featuring a pretrained EfficientNetB0 encoder, generates multiclass masks of the lateral ventricles from the coronal slices, which are then used to calculate the callosal angle. RESULTS: Training and internal validation were performed using datasets from the Baltimore Longitudinal Study of Aging (BLSA) and BIOCARD, while external validation utilized 216 clinical MRI scans from Johns Hopkins Bayview Hospital. Our framework achieved high concordance with manual measurements, demonstrating a strong correlation (r = 0.98, p < 0.001) and a mean absolute error (MAE) of 2.95 (SD 1.58) degrees. Moreover, error analysis confirmed that CA measurement performance was independent of patient age, gender, and EI, underscoring the broad applicability of this method. CONCLUSIONS: These results indicate that our fully automated CA measurement framework is a reliable and reproducible alternative to manual methods, outperforms reported interobserver variability in assessing the callosal angle, and offers significant potential to enhance early detection and diagnosis of NPH in both research and clinical settings.
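The final geometric step of such a pipeline, turning ventricle outlines on the coronal slice into an angle, can be sketched as the angle at an apex between two rays. How the authors derive the rays from the multiclass masks is not described in the abstract, so the apex and the two wall points below are hypothetical inputs, as is the function name.

```python
import math

def callosal_angle(apex, left_point, right_point):
    """Angle in degrees at `apex` between the rays towards `left_point`
    and `right_point` (2D coordinates on the coronal slice)."""
    ax, ay = apex
    v1 = (left_point[0] - ax, left_point[1] - ay)
    v2 = (right_point[0] - ax, right_point[1] - ay)
    if v1 == (0, 0) or v2 == (0, 0):
        raise ValueError("wall points must differ from the apex")
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # atan2 of |cross| and dot stays numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))

# Hypothetical points: rays running down and outwards from the apex.
print(callosal_angle((0.0, 0.0), (-1.0, -1.0), (1.0, -1.0)))
```

Using `atan2` rather than `acos` avoids having to clamp the cosine against rounding error when the two rays are nearly parallel.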

MRI Segmentation Neurological Retrospective Clinical In Silico Academic Lab

A Cardiac-specific CT Foundation Model for Heart Transplantation

Xu, H., Woicik, A., Asadian, S., Shen, J., Zhang, Z., Nabipoor, A., Musi, J. P., Keenan, J., Khorsandi, M., Al-Alao, B., Dimarakis, I., Chalian, H., Lin, Y., Fishbein, D., Pal, J., Wang, S., Lin, S.

•preprint•Aug 19 2025

Heart failure is a major cause of morbidity and mortality, with the most severe forms requiring heart transplantation. Heart size matching between the donor and recipient is a critical step in ensuring a successful transplantation. Currently, a set of equations based on population measures of height, weight, sex and age, viz. predicted heart mass (PHM), is used, but these can be improved upon with personalized information from recipient and donor chest CT images. Here, we developed GigaHeart, the first heart-specific foundation model pretrained on 180,897 chest CT volumes from 56,607 patients. The key idea of GigaHeart is to direct the foundation model's attention towards the heart by contrasting the heart region and the entire chest, thereby encouraging the model to capture fine-grained cardiac features. GigaHeart achieves the best performance on 8 cardiac-specific classification tasks and, further, exhibits superior performance on cross-modal tasks by jointly modeling CT images and reports. We similarly developed a thorax-specific foundation model and observed promising performance on 9 thorax-specific tasks, indicating the potential to extend GigaHeart to other organ-specific foundation models. More importantly, GigaHeart addresses the heart sizing problem. It avoids oversizing by correctly segmenting the sizes of hearts of donors and recipients. In regressions against actual heart masses, our AI-segmented total cardiac volumes (TCVs) show a 33.3% improvement in R² when compared to PHM. Meanwhile, GigaHeart also solves the undersizing problem by adding a regression layer to the model. Specifically, GigaHeart reduces the mean squared error by 57% against PHM. In total, we show that GigaHeart increases the acceptable range of donor heart sizes and matches more accurately than the widely used PHM equations. In all, GigaHeart is a state-of-the-art, cardiac-specific foundation model with the key innovation of directing the model's attention to the heart. GigaHeart can be finetuned to accomplish a number of tasks accurately, of which AI-assisted heart sizing is a novel example.

CT Segmentation Cardiac Methodology In Silico Academic Lab Benchmark SOTA
