
Explainable multimodal deep learning for predicting thyroid cancer lateral lymph node metastasis using ultrasound imaging.

Shen P, Yang Z, Sun J, Wang Y, Qiu C, Wang Y, Ren Y, Liu S, Cai W, Lu H, Yao S

PubMed · Aug 1 2025
Preoperative prediction of lateral lymph node metastasis is clinically crucial for guiding surgical strategy and prognosis assessment, yet precise prediction methods are lacking. We therefore develop the Lateral Lymph Node Metastasis Network (LLNM-Net), a bidirectional-attention deep-learning model that fuses multimodal data (preoperative ultrasound images, radiology reports, pathological findings, and demographics) from 29,615 patients and 9836 surgical cases across seven centers. Integrating nodule morphology and position with clinical text, LLNM-Net achieves an Area Under the Curve (AUC) of 0.944 and 84.7% accuracy in multicenter testing, outperforming human experts (64.3% accuracy) and surpassing previous models by 7.4%. Here we show that tumors within 0.25 cm of the thyroid capsule carry >72% metastasis risk, with the middle and upper lobes as high-risk regions. Leveraging location, shape, echogenicity, margins, demographics, and clinician inputs, LLNM-Net further attains an AUC of 0.983 for identifying high-risk patients. The model is thus a promising tool for preoperative screening and risk stratification.
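The abstract does not include an implementation, but the bidirectional-attention fusion it describes can be sketched in a few lines of PyTorch: image tokens attend to text tokens and vice versa before a pooled classification head. All module names, dimensions, and the sigmoid head below are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of bidirectional cross-attention fusion between
# ultrasound image features and report-text features. Names and
# dimensions are illustrative, not the authors' implementation.
import torch
import torch.nn as nn

class BidirectionalFusion(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # image tokens attend to text tokens, and vice versa
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, img_tokens, txt_tokens):
        # cross-attend in both directions, then mean-pool and classify
        img_ctx, _ = self.img2txt(img_tokens, txt_tokens, txt_tokens)
        txt_ctx, _ = self.txt2img(txt_tokens, img_tokens, img_tokens)
        fused = torch.cat([img_ctx.mean(dim=1), txt_ctx.mean(dim=1)], dim=-1)
        return torch.sigmoid(self.head(fused))  # metastasis probability

# toy usage: 1 patient, 49 image patch tokens and 32 text tokens of width 256
model = BidirectionalFusion()
prob = model(torch.randn(1, 49, 256), torch.randn(1, 32, 256))
```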

Transparent brain tumor detection using DenseNet169 and LIME.

Abraham LA, Palanisamy G, Veerapu G

PubMed · Aug 1 2025
A crucial area of research in medical imaging is brain tumor classification, which greatly aids diagnosis and facilitates treatment planning. This paper proposes DenseNet169-LIME-TumorNet, a deep learning model that combines DenseNet169 with LIME to boost both the performance and the interpretability of brain tumor classification. The model was trained and evaluated on the publicly available Brain Tumor MRI Dataset containing 2,870 images spanning three tumor types. DenseNet169-LIME-TumorNet achieves a classification accuracy of 98.78%, outperforming widely used architectures including Inception V3, ResNet50, MobileNet V2, EfficientNet variants, and other DenseNet configurations. The integration of LIME provides visual explanations that enhance transparency and reliability in clinical decision-making. Furthermore, the model demonstrates minimal computational overhead, enabling faster inference and deployment in resource-constrained clinical environments, thereby highlighting its practical utility for real-time diagnostic support. Future work should aim at better generalization through multi-modal learning, hybrid deep learning architectures, and real-time applications for AI-assisted diagnosis.
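For readers who want to reproduce the explanation step, the lime package exposes exactly this workflow; the sketch below pairs a torchvision DenseNet169 with lime_image, with the preprocessing, the three-class head, and the stand-in image as assumptions.

```python
# Sketch of explaining a DenseNet169 tumor classifier with LIME
# (lime_image). Preprocessing and class count are assumptions.
import numpy as np
import torch
import torchvision.transforms as T
from torchvision.models import densenet169
from lime import lime_image

model = densenet169(num_classes=3)  # e.g. three tumor types
model.eval()
preprocess = T.Compose([T.ToTensor(), T.Resize((224, 224))])

def classifier_fn(images):
    # LIME passes a batch of HxWx3 arrays; return class probabilities
    batch = torch.stack([preprocess(img.astype(np.uint8)) for img in images])
    with torch.no_grad():
        return torch.softmax(model(batch), dim=1).numpy()

explainer = lime_image.LimeImageExplainer()
mri = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)  # stand-in slice
explanation = explainer.explain_instance(mri, classifier_fn,
                                         top_labels=1, num_samples=1000)
img, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                           positive_only=True, num_features=5)
```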

Development and Validation of a Brain Aging Biomarker in Middle-Aged and Older Adults: Deep Learning Approach.

Li Z, Li J, Li J, Wang M, Xu A, Huang Y, Yu Q, Zhang L, Li Y, Li Z, Wu X, Bu J, Li W

PubMed · Aug 1 2025
Precise assessment of brain aging is crucial for early detection of neurodegenerative disorders and aiding clinical practice. Existing magnetic resonance imaging (MRI)-based methods excel at this task, but they still have room for improvement in capturing local morphological variations across brain regions and preserving the inherent neurobiological topological structures. We aimed to develop and validate a deep learning framework incorporating both connectivity and complexity for accurate brain aging estimation, facilitating early identification of neurodegenerative diseases. We used 5889 T1-weighted MRI scans from the Alzheimer's Disease Neuroimaging Initiative dataset. We proposed a novel brain vision graph neural network (BVGN), incorporating neurobiologically informed feature extraction modules and global association mechanisms to provide a sensitive deep learning-based imaging biomarker. Model performance was evaluated using mean absolute error (MAE) against benchmark models, while generalization capability was further validated on an external UK Biobank dataset. We calculated the brain age gap across distinct cognitive states and conducted multiple logistic regressions to compare its discriminative capacity against conventional cognition-related variables in distinguishing cognitively normal (CN) and mild cognitive impairment (MCI) states. Longitudinal tracking, Cox regression, and Kaplan-Meier plots were used to investigate the longitudinal performance of the brain age gap. The BVGN model achieved an MAE of 2.39 years, surpassing current state-of-the-art approaches while providing an interpretable saliency map and graph-theoretic measures supported by medical evidence. Furthermore, its performance was validated on the UK Biobank cohort (N=34,352) with an MAE of 2.49 years. The brain age gap derived from BVGN differed significantly across cognitive states (CN vs MCI vs Alzheimer disease; P<.001) and showed higher discriminative capacity between CN and MCI than general cognitive assessments, brain volume features, and apolipoprotein E4 carriage (area under the receiver operating characteristic curve [AUC] of 0.885 vs AUCs ranging from 0.646 to 0.815). The brain age gap demonstrated clinical feasibility when combined with the Functional Activities Questionnaire, with better discriminative capacity in models achieving lower MAEs (AUC of 0.945 vs 0.923 and 0.911; AUC of 0.935 vs 0.900 and 0.881). An increasing brain age gap identified by BVGN may indicate underlying pathological changes in the CN-to-MCI progression, with each unit increase linked to a 55% (hazard ratio=1.55, 95% CI 1.13-2.13; P=.006) higher risk of cognitive decline in CN individuals and a 29% (hazard ratio=1.29, 95% CI 1.09-1.51; P=.002) increase in individuals with MCI. BVGN offers a precise framework for brain aging assessment, demonstrates strong generalization on an external large-scale dataset, and proposes novel interpretability strategies to elucidate multiregional cooperative aging patterns. The brain age gap derived from BVGN is validated as a sensitive biomarker for early identification of MCI and prediction of cognitive decline, offering substantial potential for clinical applications.
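As a toy illustration of the biomarker logic: the brain age gap is simply predicted minus chronological age, and its CN-vs-MCI discriminative capacity can be scored with ROC AUC. The sketch below uses simulated numbers, not ADNI data.

```python
# Toy sketch of the brain-age-gap analysis: the gap is predicted age
# minus chronological age; its CN-vs-MCI separation is scored with AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
age = rng.uniform(55, 85, 200)                       # chronological age
is_mci = rng.integers(0, 2, 200)                     # 0 = CN, 1 = MCI
predicted_age = age + rng.normal(0, 2.4, 200) + 3.0 * is_mci  # model output

brain_age_gap = predicted_age - age                  # the biomarker
auc = roc_auc_score(is_mci, brain_age_gap)
print(f"CN-vs-MCI AUC of the brain age gap: {auc:.3f}")
```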

Your other Left! Vision-Language Models Fail to Identify Relative Positions in Medical Images

Daniel Wolf, Heiko Hillenhagen, Billurvan Taskin, Alex Bäuerle, Meinrad Beer, Michael Götz, Timo Ropinski

arXiv preprint · Aug 1 2025
Clinical decision-making relies heavily on understanding the relative positions of anatomical structures and anomalies. Therefore, for Vision-Language Models (VLMs) to be applicable in clinical practice, the ability to accurately determine relative positions in medical images is a fundamental prerequisite. Despite its importance, this capability remains highly underexplored. To address this gap, we evaluate the ability of state-of-the-art VLMs (GPT-4o, Llama3.2, Pixtral, and JanusPro) and find that all models fail at this fundamental task. Inspired by successful approaches in computer vision, we investigate whether visual prompts, such as alphanumeric or colored markers placed on anatomical structures, can enhance performance. While these markers provide moderate improvements, results remain significantly lower on medical images than on natural images. Our evaluations suggest that, in medical imaging, VLMs rely more on prior anatomical knowledge than on actual image content when answering relative-position questions, often leading to incorrect conclusions. To facilitate further research in this area, we introduce the MIRP (Medical Imaging Relative Positioning) benchmark dataset, designed to systematically evaluate the capability to identify relative positions in medical images.
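The visual-prompting idea, placing alphanumeric or colored markers on anatomical structures before querying the model, is easy to sketch with PIL; the coordinates, labels, and downstream VLM call below are hypothetical.

```python
# Sketch of visual prompting: stamp labeled markers on the image,
# then send the prompted image to a VLM. Positions are hypothetical.
from PIL import Image, ImageDraw

def add_markers(img, markers):
    """Stamp labeled circles onto a copy of img; markers = [((x, y), label)]."""
    img = img.copy()
    draw = ImageDraw.Draw(img)
    for (x, y), label in markers:
        draw.ellipse([x - 12, y - 12, x + 12, y + 12], outline="red", width=3)
        draw.text((x + 15, y - 8), label, fill="red")
    return img

# stand-in image; in practice this would be the radiograph, and the
# prompted image is sent to the VLM with e.g. "Is structure A left of B?"
scan = Image.new("RGB", (512, 400), "black")
prompted = add_markers(scan, [((120, 200), "A"), ((340, 210), "B")])
```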

Minimum Data, Maximum Impact: 20 annotated samples for explainable lung nodule classification

Luisa Gallée, Catharina Silvia Lisson, Christoph Gerhard Lisson, Daniela Drees, Felix Weig, Daniel Vogele, Meinrad Beer, Michael Götz

arXiv preprint · Aug 1 2025
Classification models that provide human-interpretable explanations enhance clinicians' trust and usability in medical image diagnosis. One research focus is the integration and prediction of pathology-related visual attributes used by radiologists alongside the diagnosis, aligning AI decision-making with clinical reasoning. Radiologists use attributes like shape and texture as established diagnostic criteria and mirroring these in AI decision-making both enhances transparency and enables explicit validation of model outputs. However, the adoption of such models is limited by the scarcity of large-scale medical image datasets annotated with these attributes. To address this challenge, we propose synthesizing attribute-annotated data using a generative model. We enhance the Diffusion Model with attribute conditioning and train it using only 20 attribute-labeled lung nodule samples from the LIDC-IDRI dataset. Incorporating its generated images into the training of an explainable model boosts performance, increasing attribute prediction accuracy by 13.4% and target prediction accuracy by 1.8% compared to training with only the small real attribute-annotated dataset. This work highlights the potential of synthetic data to overcome dataset limitations, enhancing the applicability of explainable models in medical image analysis.
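The attribute-conditioning mechanism is not spelled out in the abstract; one common way to realize it, sketched below under that assumption, is to embed the attribute vector alongside the timestep embedding and let their sum modulate the denoiser (FiLM-style). Every name and shape here is a placeholder, not the paper's model.

```python
# Minimal sketch of attribute conditioning for a diffusion denoiser:
# visual attributes (e.g. shape, margin, texture scores) are embedded
# and added to the timestep embedding that modulates the network.
import torch
import torch.nn as nn

class AttributeConditionedDenoiser(nn.Module):
    def __init__(self, n_attrs=8, dim=128):
        super().__init__()
        self.t_embed = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.a_embed = nn.Sequential(nn.Linear(n_attrs, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.backbone = nn.Conv2d(1, 1, 3, padding=1)   # stand-in for a U-Net
        self.scale = nn.Linear(dim, 1)

    def forward(self, x_noisy, t, attrs):
        cond = self.t_embed(t) + self.a_embed(attrs)    # joint conditioning vector
        # modulate the feature map with the conditioning signal (FiLM-style)
        return self.backbone(x_noisy) * (1 + self.scale(cond)[..., None, None])

model = AttributeConditionedDenoiser()
eps_hat = model(torch.randn(4, 1, 64, 64), torch.rand(4, 1), torch.rand(4, 8))
```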

LesiOnTime -- Joint Temporal and Clinical Modeling for Small Breast Lesion Segmentation in Longitudinal DCE-MRI

Mohammed Kamran, Maria Bernathova, Raoul Varga, Christian Singer, Zsuzsanna Bago-Horvath, Thomas Helbich, Georg Langs, Philipp Seeböck

arXiv preprint · Aug 1 2025
Accurate segmentation of small lesions in Breast Dynamic Contrast-Enhanced MRI (DCE-MRI) is critical for early cancer detection, especially in high-risk patients. While recent deep learning methods have advanced lesion segmentation, they primarily target large lesions and neglect valuable longitudinal and clinical information routinely used by radiologists. In real-world screening, detecting subtle or emerging lesions requires radiologists to compare across timepoints and consider previous radiology assessments, such as the BI-RADS score. We propose LesiOnTime, a novel 3D segmentation approach that mimics clinical diagnostic workflows by jointly leveraging longitudinal imaging and BI-RADS scores. The key components are: (1) a Temporal Prior Attention (TPA) block that dynamically integrates information from previous and current scans; and (2) a BI-RADS Consistency Regularization (BCR) loss that enforces latent-space alignment for scans with similar radiological assessments, thus embedding domain knowledge into the training process. Evaluated on a curated in-house longitudinal dataset of high-risk patients with DCE-MRI, our approach outperforms state-of-the-art single-timepoint and longitudinal baselines by 5% in terms of Dice. Ablation studies demonstrate that both TPA and BCR contribute complementary performance gains. These results highlight the importance of incorporating temporal and clinical context for reliable early lesion segmentation in real-world breast cancer screening. Our code is publicly available at https://github.com/cirmuw/LesiOnTime
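The BCR loss is described only at a high level; a plausible contrastive-style reading, sketched below, pulls latent embeddings of scans with the same BI-RADS assessment together and pushes dissimilar ones apart up to a margin. The distance, margin, and pairing rule are assumptions, not the paper's formulation.

```python
# Hedged sketch of a BI-RADS consistency regularization: attract
# latents of scans with matching BI-RADS scores, repel the rest.
import torch
import torch.nn.functional as F

def birads_consistency_loss(latents, birads, margin=1.0):
    """latents: (N, D) scan embeddings; birads: (N,) BI-RADS scores."""
    dist = torch.cdist(latents, latents)                   # pairwise distances
    same = (birads[:, None] == birads[None, :]).float()    # similar assessments
    pull = same * dist.pow(2)                              # attract similar pairs
    push = (1 - same) * F.relu(margin - dist).pow(2)       # repel dissimilar pairs
    off_diag = 1 - torch.eye(len(latents), device=latents.device)
    return ((pull + push) * off_diag).sum() / off_diag.sum()

loss = birads_consistency_loss(torch.randn(8, 64), torch.randint(2, 6, (8,)))
```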

Weakly Supervised Intracranial Aneurysm Detection and Segmentation in MR angiography via Multi-task UNet with Vesselness Prior

Erin Rainville, Amirhossein Rasoulian, Hassan Rivaz, Yiming Xiao

arXiv preprint · Aug 1 2025
Intracranial aneurysms (IAs) are abnormal dilations of cerebral blood vessels that, if ruptured, can lead to life-threatening consequences. However, their small size and soft contrast in radiological scans often make accurate and efficient detection and morphological analysis difficult, both of which are critical in the clinical care of the disorder. Furthermore, the lack of large public datasets with voxel-wise expert annotations poses challenges for developing deep learning algorithms to address these issues. Therefore, we propose a novel weakly supervised 3D multi-task UNet that integrates vesselness priors to jointly perform aneurysm detection and segmentation in time-of-flight MR angiography (TOF-MRA). Specifically, to robustly guide IA detection and segmentation, we employ the popular Frangi vesselness filter to derive soft cerebrovascular priors that serve both as network input and as an attention block, with segmentation conducted from the decoder and detection from an auxiliary branch. We train our model on the Lausanne dataset with coarse ground-truth segmentations, and evaluate it on the test set with refined labels from the same database. To further assess our model's generalizability, we also validate it externally on the ADAM dataset. Our results demonstrate the superior performance of the proposed technique over state-of-the-art methods for aneurysm segmentation (Dice = 0.614, 95% HD = 1.38 mm) and detection (false positive rate = 1.47, sensitivity = 92.9%).
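Deriving the soft cerebrovascular prior is straightforward to illustrate with the Frangi filter available in scikit-image; the stand-in volume, the scales, and the two-channel input convention below are assumptions.

```python
# Sketch of deriving a soft vesselness prior with Frangi's filter and
# stacking it as an extra input channel for the multi-task UNet.
import numpy as np
from skimage.filters import frangi

tof_mra = np.random.rand(64, 128, 128).astype(np.float32)  # stand-in volume

# vessels are bright tubular structures in TOF-MRA -> black_ridges=False
vesselness = frangi(tof_mra, sigmas=(1, 2, 3), black_ridges=False)
vesselness = (vesselness - vesselness.min()) / (vesselness.max() - vesselness.min() + 1e-8)

# two-channel network input: raw intensities + soft cerebrovascular prior
net_input = np.stack([tof_mra, vesselness], axis=0)  # (2, D, H, W)
```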

Cerebral Amyloid Deposition With ¹⁸F-Florbetapir PET Mediates Retinal Vascular Density and Cognitive Impairment in Alzheimer's Disease.

Chen Z, He HL, Qi Z, Bi S, Yang H, Chen X, Xu T, Jin ZB, Yan S, Lu J

PubMed · Aug 1 2025
Alzheimer's disease (AD) is accompanied by alterations in retinal vascular density (VD), but the underlying mechanisms remain unclear. This study investigated the relationship among cerebral amyloid-β (Aβ) deposition, VD, and cognitive decline. We enrolled 92 participants, including 47 AD patients and 45 healthy control (HC) participants. VD across retinal subregions was quantified using deep learning-based fundus photography, and cerebral Aβ deposition was measured with ¹⁸F-florbetapir (¹⁸F-AV45) PET/MRI. Using the minimum bounding circle of the optic disc as the diameter (papilla-diameter, PD), VD (total, 0.5-1.0 PD, 1.0-1.5 PD, 1.5-2.0 PD, 2.0-2.5 PD) was calculated. The standardized uptake value ratio (SUVR) for Aβ deposition was computed for global and regional cortical areas, using the cerebellar cortex as the reference region. Cognitive performance was assessed with the Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA). Pearson correlation, multiple linear regression, and mediation analyses were used to explore the relationships among Aβ deposition, VD, and cognition. AD patients exhibited significantly lower VD in all subregions compared to HC (p < 0.05). Reduced VD correlated with higher SUVR in the global cortex and a decline in cognitive abilities (p < 0.05). Mediation analysis indicated that VD influenced MMSE and MoCA through SUVR in the global cortex, with the most pronounced effects observed in the 1.0-1.5 PD range. Retinal VD is thus associated with cognitive decline, a relationship primarily mediated by cerebral Aβ deposition measured via ¹⁸F-AV45 PET. These findings highlight the potential of retinal VD as a biomarker for early detection in AD.
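For readers unfamiliar with mediation analysis, the logic (VD → SUVR → cognition) can be sketched on simulated data: the indirect effect is the product of the VD→SUVR path and the SUVR→cognition path controlling for VD. This toy example uses statsmodels and invented coefficients, not the study's data.

```python
# Toy mediation sketch: indirect effect = a * b, where
# a: VD -> SUVR, b: SUVR -> MMSE (controlling for VD).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
vd = rng.normal(0.5, 0.1, 92)                      # retinal vascular density
suvr = 1.2 - 1.5 * vd + rng.normal(0, 0.1, 92)     # mediator: global SUVR
mmse = 30 - 8 * suvr + rng.normal(0, 1.5, 92)      # outcome: cognition

a = sm.OLS(suvr, sm.add_constant(vd)).fit().params[1]             # VD -> SUVR
b = sm.OLS(mmse, sm.add_constant(np.column_stack([suvr, vd]))).fit().params[1]
print(f"indirect (mediated) effect a*b = {a * b:.2f}")
```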

Rapid review: Growing usage of Multimodal Large Language Models in healthcare.

Gupta P, Zhang Z, Song M, Michalowski M, Hu X, Stiglic G, Topaz M

PubMed · Aug 1 2025
Recent advancements in large language models (LLMs) have led to multimodal LLMs (MLLMs), which integrate multiple data modalities beyond text. Although MLLMs show promise, there is a gap in the literature that empirically demonstrates their impact in healthcare. This paper summarizes the applications of MLLMs in healthcare, highlighting their potential to transform health practices. A rapid literature review was conducted in August 2024 using World Health Organization (WHO) rapid-review methodology and PRISMA standards, with searches across four databases (Scopus, Medline, PubMed, and ACM Digital Library) and top-tier conferences, including NeurIPS, ICML, AAAI, MICCAI, CVPR, ACL, and EMNLP. Articles on MLLM healthcare applications were included for analysis based on inclusion and exclusion criteria. The search yielded 115 articles, 39 of which were included in the final analysis. Of these, 77% appeared online (as preprints or publications) in 2024, reflecting the recent emergence of MLLMs. 80% of studies came from Asia and North America (mainly China and the US), with Europe lagging. Studies were split evenly between evaluations of pre-built MLLMs (60% focused on GPT versions) and development of custom MLLMs/frameworks with task-specific customizations. About 81% of studies examined MLLMs for diagnosis and reporting in radiology, pathology, and ophthalmology, with additional applications in education, surgery, and mental health. Prompting strategies, used in 80% of studies, improved performance in nearly half. However, evaluation practices were inconsistent: 67% reported accuracy, error analysis was mostly anecdotal, and only 18% categorized failure types. Only 13% validated explainability through clinician feedback. Clinical deployment was demonstrated in just 3% of studies, and workflow integration, governance, and safety were rarely addressed. MLLMs offer substantial potential for healthcare transformation through multimodal data integration. Yet methodological inconsistencies, limited validation, and underdeveloped deployment strategies highlight the need for standardized evaluation metrics, structured error analysis, and human-centered design to support safe, scalable, and trustworthy clinical adoption.

Optimization strategy for fat-suppressed T2-weighted images in liver imaging: The combined application of AI-assisted compressed sensing and respiratory triggering.

Feng M, Li S, Song X, Mao W, Liu Y, Yuan Z

PubMed · Aug 1 2025
This study aimed to optimize the imaging time and image quality of fat-suppressed T2-weighted imaging (T2WI-FS) through the integration of Artificial Intelligence-Assisted Compressed Sensing (ACS) and respiratory triggering (RT). A prospective cohort study was conducted on one hundred thirty-four patients (99 males, 35 females; average age: 57.93 ± 9.40 years) undergoing liver MRI between March and July 2024. All patients were scanned using both breath-hold ACS-assisted T2WI (BH-ACS-T2WI) and respiratory-triggered ACS-assisted T2WI (RT-ACS-T2WI) sequences. Two experienced radiologists retrospectively analyzed regions of interest (ROIs), recorded primary lesions, and assessed key metrics including signal intensity (SI), standard deviation (SD), signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), motion artifacts, hepatic vessel clarity, liver edge sharpness, lesion conspicuity, and overall image quality. Statistical comparisons were conducted using the Mann-Whitney U test, the Wilcoxon signed-rank test, and the intraclass correlation coefficient (ICC). Compared to BH-ACS-T2WI, RT-ACS-T2WI significantly reduced average imaging time from 38 s to 22.91 ± 3.36 s, a 40% reduction in scan duration. Additionally, RT-ACS-T2WI demonstrated superior performance across multiple parameters, including SI, SD, SNR, CNR, motion artifact reduction, hepatic vessel clarity, liver edge sharpness, lesion conspicuity (≤5 mm), and overall image quality (P < 0.05). Notably, the lesion detection rate was slightly higher with RT-ACS-T2WI (94%) than with BH-ACS-T2WI (90%). The RT-ACS-T2WI sequence not only enhanced image quality but also reduced imaging time to approximately 23 s, making it particularly beneficial for patients unable to perform prolonged breath-holds. This approach represents a promising advance in optimizing liver MRI protocols.
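The ROI-based quality metrics compared here are simple ratios: SNR as mean signal over noise SD, CNR as the lesion-liver signal difference over noise SD. A toy sketch, with hypothetical ROIs rather than the study's measurement protocol, is:

```python
# Illustrative ROI-based SNR/CNR computation on a stand-in slice.
import numpy as np

def snr(roi):
    return roi.mean() / roi.std()

def cnr(lesion_roi, liver_roi, noise_sd):
    return abs(lesion_roi.mean() - liver_roi.mean()) / noise_sd

slice_ = np.random.rand(256, 256) * 400           # stand-in T2WI-FS slice
liver = slice_[100:120, 100:120]                  # hypothetical liver ROI
lesion = slice_[140:150, 140:150]                 # hypothetical lesion ROI
print(f"SNR={snr(liver):.1f}, CNR={cnr(lesion, liver, liver.std()):.1f}")
```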